SV: SV: [jdom-interest] One less TODO item
pernorrman at telia.com
Tue Oct 7 10:15:46 PDT 2003
What, do you append '/' before loading the resource?
In principle, http://www.cafeconleche.org is not the same
resource as http://www.cafeconleche.org/. It probably is for
the overwhelming majority of cases, but I vaguely remember a
case where it wasn't (can't come up with the details).
I looked again at RFC 2396, and I'm now leaning toward the view that the
correct resolution of DTD/xyz against the base URI http://www.cafeconleche.org
is indeed http://www.cafeconleche.orgDTD/xyz. Obviously, this is not
the desired result, but there is a concise algorithm in the spec
that yields this result! Perhaps this quote from section 5.1.4 of said
document encapsulates the problem:
It is the responsibility of the distributor(s) of a document
containing relative URI to ensure that the base URI for that document
can be established. It must be emphasized that relative URI cannot
be used reliably in situations where the document's base URI is not
well-defined.
This *is* a murky area!
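
To make the merge step concrete, here is a minimal Java sketch of how I read
steps 6a/6b and 7 of RFC 2396 section 5.2. It is only an illustration: the
class and method names are made up for this mail, it handles nothing but a
plain relative-path reference (no scheme, authority, query or fragment), and
it skips the removal of "." and ".." segments. The addSlashIfNoPath helper is
my guess at the kind of workaround described in the quoted message below, not
actual XOM or Xerces code.

```java
public class RelativeResolveSketch {

    // Hypothetical helper: the merge step of RFC 2396 section 5.2 (6a, 6b)
    // followed by the recombination of step 7, for a relative-path reference
    // only. Removal of "." and ".." segments is deliberately left out.
    static String resolve(String scheme, String authority,
                          String basePath, String relPath) {
        // 6a: copy all but the last segment of the base path, i.e. everything
        // up to and including the last '/'. An empty base path contributes
        // nothing, not even a slash.
        int lastSlash = basePath.lastIndexOf('/');
        String buffer = basePath.substring(0, lastSlash + 1);
        // 6b: append the reference's path; 7: recombine the components.
        return scheme + "://" + authority + buffer + relPath;
    }

    // Hypothetical workaround: treat a missing path component as "/".
    static String addSlashIfNoPath(String basePath) {
        return basePath.isEmpty() ? "/" : basePath;
    }

    public static void main(String[] args) {
        // Base URI http://www.cafeconleche.org has an empty path component.
        System.out.println(
            resolve("http", "www.cafeconleche.org", "", "DTD/xyz"));
        // Same base after the workaround has turned the empty path into "/".
        System.out.println(
            resolve("http", "www.cafeconleche.org",
                    addSlashIfNoPath(""), "DTD/xyz"));
    }
}
```

The first call gives the run-together http://www.cafeconleche.orgDTD/xyz, and
the second gives http://www.cafeconleche.org/DTD/xyz, which is what one
actually wants.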
> -----Original message-----
> From: Elliotte Rusty Harold [mailto:elharo at metalab.unc.edu]
> Sent: 7 October 2003 18:35
> To: Per Norrman
> Cc: jdom-interest at jdom.org
> Subject: Re: SV: [jdom-interest] One less TODO item
> >I thought it was the SAXParsers (or XMLReader) that resolved the
> >relative URI. If I supply an EntityResolver to either crimson or
> >xerces, the system id is already resolved when the callback is made.
> >How do you work around that in XOM? Or does it have its own parser?
> I look at the URLs that are fed in and if they don't have a path
> component, I add a / at the end. Really simple.
> It's a hack, I admit, and it only works for URLs that don't have path
> components, but it does help XOM work with a lot of URLs it would
> otherwise fail on.
> I reported this bug in Xerces some time ago (or at least I thought I
> did. Can't seem to find it in Bugzilla at the moment). However, it's
> still present in 2.5. It's one of the few bugs left in Xerces that
> affects XOM. This compares very well to other parsers, most of which
> have dozens of bugs the XOM unit tests expose.
> OK, I found the bug. It's
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=18345 (God, I hate
> Bugzilla.) Hmm, looks like they claim it's fixed in 2.5 but I could
> swear I'm still seeing it. Possibly I'm using an older parser? I'll
> look into this further. If Xerces has indeed fixed this, then all
> JDOM has to do is ship the latest Xerces.
> Oh, I bet I know what's going on. I think I'm loading the older
> Xerces bundled with Java 1.4.2 rather than the bug fixed version.
> Hmm, no, that's not it. OK, I've got it. They've instituted something
> equivalent to the same workaround I used. In other words, they can
> handle http://www.cafeconleche.org but not http://www.ibiblio.org/xml
> and this can be verified with Xerces's own sax.Counter program. I'll
> reopen the bug.
> Elliotte Rusty Harold
> elharo at metalab.unc.edu
> Processing XML with Java (Addison-Wesley, 2002)