[jdom-interest] Parsing a MODS-document with validation fails
thomas.scheffler at uni-jena.de
Wed Aug 10 00:37:19 PDT 2011
Right after sending a mail saying that I had not received this mail, I got it ;-)
> I've put together a test case for this. See attached files. The XML and
> XSD files go in junit-test/resources
> The TestSAXComplicatedSchema.java goes in
> Whatever fix we decide on can be run through this.... currently it just
> reproduces the problem.
> The 'bonus' is that the XML/schema/imports are much simpler than the
> MODS stuff.
I was thinking of providing a test case, too, but I currently have
a lot on my to-do list. Thanks for your work.
> Thomas, I've looked at your latest patch, and I think it is too
> heavy-weight... in the sense that it carries a lot of data through the
> hierarchy... two maps, a list, it all seems like too much. I struggled
> to follow some of the logic. I think there's a simpler option.
You are right, it is extra work to maintain these structures for a case
that no one has hit before. One can find arguments for either
solution. The amount of additional memory should be negligible, but my
patch introduces a bit of work while parsing every document, while your
suggested changes seem to produce more work only in the rare case that:
1. an attribute is present with a namespace and the QName does not
have a prefix.
2. information on the prefix is held way up in the document hierarchy.
> In fact, when I looked more closely, the data is all available. If you
> encounter an attribute with the same qName and localname, but with a
> URI, then hunt up the Element hierarchy for a prefixed declaration of
> that namespace.
You are right about that. The data should be there in most instances of
this rare case. I found your code a bit hard to understand, especially your
do-while with those override prefix checks. If we do not use the SAX
events here, for which there is good documentation, and write our own
code instead, it should still be easy to understand in a few years.
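
As a rough illustration of the lookup described in the quoted paragraph
(not the code from either patch), the walk up the hierarchy could look
something like the sketch below. The Elem class and the findPrefix method
are hypothetical stand-ins, not JDOM's API. The sketch assumes each element
records the prefix-to-URI declarations made on it, and it skips the default
declaration because an unprefixed attribute is never in a namespace.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for an element tree; this is not JDOM's API. */
class PrefixLookup {

    /** Minimal element: a parent link plus the namespace declarations made on it. */
    static final class Elem {
        final Elem parent;
        // prefix -> URI; the empty prefix stands for the default declaration
        final Map<String, String> declared = new LinkedHashMap<>();

        Elem(Elem parent) {
            this.parent = parent;
        }

        Elem declare(String prefix, String uri) {
            declared.put(prefix, uri);
            return this;
        }
    }

    /**
     * Walks from start towards the root and returns the nearest non-empty
     * prefix that is still bound to uri at start, or null if there is none.
     */
    static String findPrefix(Elem start, String uri) {
        // Prefixes already seen closer to start; a declaration further up
        // with the same prefix is overridden and must be ignored.
        Set<String> seen = new HashSet<>();
        for (Elem e = start; e != null; e = e.parent) {
            for (Map.Entry<String, String> d : e.declared.entrySet()) {
                String prefix = d.getKey();
                if (prefix.isEmpty()) {
                    continue; // the default namespace does not apply to attributes
                }
                if (!seen.add(prefix)) {
                    continue; // redeclared closer to start
                }
                if (uri.equals(d.getValue())) {
                    return prefix;
                }
            }
        }
        return null;
    }
}
```

The set of already-seen prefixes plays the role of the override check: a
prefix that is rebound lower in the tree must not be returned for the URI it
had further up.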
Maybe I will find the time in the next few days to set up a benchmark
to compare both solutions. And maybe the differences will
settle any further discussion ;-)
I have been using JDOM as a library since its beta stages, and I was
quite impressed by how quickly I was able to understand the code and provide
a fix. I think not every library out there is of that quality. This is
something that had to be said.