[jdom-interest] JDOM parser reuse memory problem

Thu Nov 10 10:51:41 PST 2011

Hi Randall, Michael.

It's an interesting observation... and I can see the implications. I would
like to take a closer look at it, but that may take a little while.

I filed https://github.com/hunterhacker/jdom/issues/52

'Off the cuff' I can think of one work-around and a few solutions (in
addition to what Michael has suggested):

1. Work-around: immediately after parsing your real document, parse a
dummy/small/in-memory document (even an invalid one, catching the
exception), so that the reused parser is left holding only the small
document's handler.
2. Currently, when you do not reuse the parser, it goes back to 'first
principles' and queries JAXP, etc. to find a parser instance. Instead, it
could 'cache' the parser 'source' after the first time and then just
create a new instance, without doing all the class-based lookups. JAXP
and other data sources are not going to change mid-way through the
JVM/ClassLoader lifetime. That way you could abandon parser re-use, but
the cost of new parser instances would be much reduced.
3. Make the SAXHandler 'cleanable' and 'clean' it in the finally block.
4. Set the content handler for the *next* parse at the end of the
*current* parse.
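
To make option 2 concrete, here is a rough sketch using only the plain
JAXP/SAX classes in the JDK. It is not JDOM code: the class
CachedReaderSource and its methods are hypothetical names for
illustration. It assumes that SAXParserFactory.newInstance() is where the
expensive lookup happens; whether creating a parser from the cached
factory avoids the classloader lock entirely will depend on the parser
implementation.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of JDOM: caches the result of the JAXP lookup.
final class CachedReaderSource {

    // SAXParserFactory.newInstance() performs the lookup; do it only once.
    private static final SAXParserFactory FACTORY = SAXParserFactory.newInstance();

    private CachedReaderSource() { }

    // Factories are not guaranteed to be thread-safe, so creation is synchronized.
    static synchronized XMLReader newReader() {
        try {
            return FACTORY.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create an XMLReader", e);
        }
    }

    // Parses with a fresh reader, so reader and handler become garbage together.
    static int countElements(String xml) {
        final int[] count = {0};
        XMLReader reader = newReader();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                count[0]++;
            }
        });
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (IOException | SAXException e) {
            throw new IllegalStateException("Parse failed", e);
        }
        return count[0];
    }
}
```

Because each parse gets its own reader, nothing keeps the content handler
alive once the parse returns, which is the point of abandoning re-use.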

I would be reluctant to put out another 1.x build of JDOM until there's
more than just this issue to fix and, hopefully, there are no other issues
to fix in the 1.x stream. So I would not hold your breath for another 1.x
release. Regardless, and if possible, can you:

1. Give some indication of how much of an issue this is?
2. Wait for JDOM2 (a month or so)?
3. Tell me whether you found any other hot-spots?

Thanks

Rolf

On Thu, 10 Nov 2011 17:57:46 +0000, Michael Kay <mike at saxonica.com> wrote:
> On 10/11/2011 17:24, Randall Theobald wrote:
>> Hi, I'm a performance analyst and found a spot where a product I'm
>> analyzing is using JDOM. We are creating new SAXBuilders on each thread
>> and are ending up with a hot lock on the classloader when trying to load
>> up the XMLReader. I saw that the underlying parser in SAXBuilder can be
>> reused, thus leading to a proper pooling strategy, but I have a memory
>> concern. In the case where the parser is reused, nothing is cleared from
>> it at the end of the build method (so the content handler is still held,
>> which can reference lots of objects). Since SAXBuilder doesn't expose a
>> way to clear anything on the reused parser, the only option is using
>> ugly reflection to clear it, or to use (slightly less ugly)
>> WeakReferences to the SAXBuilders in my pool so that they eventually get
>> cleaned up.
>>
>> Is there a reason that the content handler on 'this.parser' isn't set to
>> null along with the local content handler being set to null in the
>> finally block of the build method? If not, I'd suggest this change.
>>
>>
> I have the same problem in Saxon. When returning a parser to the pool I
> set all the callbacks to null (ContentHandler, lexicalHandler, etc).
> Unfortunately some XMLReader implementations don't allow the callback to
> be set to null (the specs aren't explicit on the point). One approach is
> to catch the exception, another is to set a dummy ContentHandler or
> whatever that doesn't have any references to anything. Messy.
> 
> Michael Kay
> Saxonica
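
For reference, the clean-up Michael describes might look roughly like the
sketch below. It is neither Saxon nor JDOM code; ReaderCleaner and
release() are hypothetical names. It assumes that a reader which rejects
null does so with an unchecked exception, and it falls back to a shared
empty org.xml.sax.ext.DefaultHandler2 in that case. The lexical handler
is set through the standard SAX2 property, which a reader is not obliged
to support.

```java
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical pool helper: drops handler references before a reader is reused.
final class ReaderCleaner {

    // Shared do-nothing handler; it holds no references to any document data.
    static final DefaultHandler2 EMPTY = new DefaultHandler2();

    private static final String LEXICAL_HANDLER =
            "http://xml.org/sax/properties/lexical-handler";

    private ReaderCleaner() { }

    static void release(XMLReader reader) {
        // Try null first; fall back to the dummy if the reader rejects null.
        try {
            reader.setContentHandler(null);
        } catch (RuntimeException e) {
            reader.setContentHandler(EMPTY);
        }
        try {
            reader.setErrorHandler(null);
        } catch (RuntimeException e) {
            reader.setErrorHandler(EMPTY);
        }
        try {
            reader.setDTDHandler(null);
        } catch (RuntimeException e) {
            reader.setDTDHandler(EMPTY);
        }
        try {
            reader.setEntityResolver(null);
        } catch (RuntimeException e) {
            reader.setEntityResolver(EMPTY);
        }
        // The lexical handler is a property, and support for it is optional.
        try {
            reader.setProperty(LEXICAL_HANDLER, null);
        } catch (SAXException | RuntimeException e) {
            try {
                reader.setProperty(LEXICAL_HANDLER, EMPTY);
            } catch (SAXException | RuntimeException ignored) {
                // Property not supported by this reader; nothing to clear.
            }
        }
    }
}
```

A pool would call ReaderCleaner.release(reader) in a finally block before
handing the reader back, so a pooled reader never pins the last document's
handler.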