[jdom-interest] Entity Resolver Cache/Catalog

Rolf jdom at tuis.net
Wed Aug 31 22:54:52 PDT 2011


Hi Paul.

Interesting read. Thanks.

I have been reading up on your alternatives, and I have been trying to 
think of how they can be applied to JDOM. The problem I see is that the 
properties are Xerces-specific.

For the moment I have done a few things...
1. Converted the XML test cases to use local files for resources, not 
the web.
2. Investigated your code and the code options, and read up on OASIS 
catalogs, xcatalog, etc.
3. Continued working on a 'CachingEntityResolver' that I started 
playing with on the weekend.

I was thinking that if I come up with a reliable CachingEntityResolver 
it could be put in the contrib section, and just added to a SAX/DOM Builder.
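
To make the idea concrete, here is a minimal sketch of what such a 
resolver could look like. It is only an illustration under my own 
assumptions, not the code I have been playing with: the class name, the 
delegate-wrapping constructor, and the in-memory byte cache keyed by 
system ID are all hypothetical choices.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: caches entity bytes in memory, keyed by system ID. */
class CachingEntityResolver implements EntityResolver {

    // The resolver that does the real (possibly slow) lookup.
    private final EntityResolver delegate;
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    CachingEntityResolver(EntityResolver delegate) {
        this.delegate = delegate;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        if (systemId == null) {
            // No usable cache key; let the delegate decide.
            return delegate.resolveEntity(publicId, systemId);
        }
        byte[] data = cache.get(systemId);
        if (data == null) {
            InputSource fetched = delegate.resolveEntity(publicId, systemId);
            if (fetched == null || fetched.getByteStream() == null) {
                // Nothing cacheable; pass the delegate's answer through.
                return fetched;
            }
            data = readAll(fetched.getByteStream());
            cache.put(systemId, data);
        }
        // Hand out a fresh stream over the cached bytes on every call.
        InputSource source = new InputSource(new ByteArrayInputStream(data));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    // Drains and closes the stream, returning its full contents.
    private static byte[] readAll(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = stream.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

Because it only implements org.xml.sax.EntityResolver, something like 
this could be handed to SAXBuilder.setEntityResolver() without any 
parser-specific properties. The cache here lives only as long as the 
resolver instance; a disk-backed cache would be a separate design.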

Can you think of a good way to make the process more generic, but still 
easy?

Rolf

On 29/08/2011 11:15 AM, Paul Libbrecht wrote:
>
> On 29 August 2011 at 16:58, Rolf Lear wrote:
>
>> This is further compounded by there being some restrictions on some
>> documents too, like the w3.org 'ban' on default Java user-agents:
>> http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/
>>
>> My experimentation indicates that w3.org has put a blanket 'tarpit' of 30
>> seconds on any connection, regardless of what User-agent you use. This is
>> 'significant'.
>
> Definitely, W3C wants you to stop referencing DTDs by their URL-URIs.
> Well... it wants the parsers to stop fetching and parsing them over and over.
>
>> Typical solutions to this problem are things like OASIS catalogs, etc. but
>> that feels heavy-weight... or, is it?
>
> I believe it was not very hard to configure the Java-shipped Xerces with catalogs.
> And I would like the JDOM code to encourage this by showing good practice.
>
> Here's what I used before SAX parsing:
>
>>              SAXParserFactory factory = SAXParserFactory.newInstance();
>>              System.setProperty("com.sun.org.apache.xerces.xni.parser.XMLParserConfiguration",
>>                      "com.sun.org.apache.xerces.parsers.XMLGrammarCachingConfiguration");
>>              SAXParser parser = factory.newSAXParser();
>>
>>              XMLCatalogResolver resolver = new XMLCatalogResolver();
>>              resolver.setPreferPublic(true);
>>              resolver.setCatalogList(new String[] {
>>                      this.getClass().getResource("xmlCatalog.xml").toExternalForm()});
>>              handler = new EventDeserializerSAXHandler(resolver);
>>              if (LOG.isDebugEnabled()) LOG.debug("Starting parser.");
>>              parser.parse(inputStream, handler);
>
> Caching, however, comes for free with a single system property (within the VM lifecycle) if I remember correctly.
>
> It would be cool to have SAXBuilder.setCatalog to make JDOM a good citizen!
> (or even better: SAXBuilder.addCatalogEntry(public, URL), with a javadoc example where the URL comes from class.getResource()).
>
> paul
> also often developing in train ;-)
>
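
For what it is worth, the addCatalogEntry(public, URL) suggestion could 
be approximated today with a small EntityResolver, without touching 
SAXBuilder itself. The following is a sketch only: SimpleCatalogResolver 
and its addCatalogEntry method are hypothetical names, not an existing 
JDOM or Xerces API, and it matches on public ID alone.

```java
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical sketch of the addCatalogEntry(public, URL) idea; not a JDOM API. */
class SimpleCatalogResolver implements EntityResolver {

    private final Map<String, URL> entries = new ConcurrentHashMap<>();

    /** Maps a public ID to a local copy, e.g. one found with Class.getResource(). */
    void addCatalogEntry(String publicId, URL location) {
        entries.put(publicId, location);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        URL local = (publicId == null) ? null : entries.get(publicId);
        if (local == null) {
            // No entry: returning null lets the parser use its default lookup.
            return null;
        }
        // Point the parser at the local copy instead of the remote system ID.
        InputSource source = new InputSource(local.toExternalForm());
        source.setPublicId(publicId);
        return source;
    }
}
```

A resolver like this would be registered with 
SAXBuilder.setEntityResolver(), with each entry's URL taken from 
getClass().getResource(...) so the DTDs ship inside the jar.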


