[jdom-interest] Memory usage when processing a large file

bob mcwhirter bob at werken.com
Mon Oct 8 10:03:56 PDT 2001

If a single pass through the document is all that's really needed, you
may be interested in dom4j's ElementHandler interface.  You register
a handler for a particular pattern, and once that pattern matches, your
handler is called immediately.  You can then process that subtree and
detach() it, conserving memory.

Works wonderfully with data that looks like:

    <root>
        <child> ... </child>
        <child> ... </child>
        ...
    </root>

Where some type of element repeats a lot.  Process each <child> subtree, and
detach it from the root before parsing the next <child> subtree.  Using Jaxen,
you can still run XPath queries over the child subtrees.
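
A minimal sketch of the handler approach described above, assuming a
document with repeated /root/child elements.  The class name ChildPruner,
the method parseAndPrune, and the <name> element inside each child are
made up for illustration; SAXReader.addHandler(), ElementHandler,
ElementPath.getCurrent() and detach() are the dom4j calls involved.

```java
import java.io.StringReader;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.SAXReader;

// Hypothetical helper class, not part of dom4j.
class ChildPruner {

    /**
     * Parses the XML, hands the text of each (assumed) <name> element
     * under /root/child to the sink, and detaches every child as soon
     * as it has been processed so it can be garbage collected.
     */
    static Document parseAndPrune(String xml, final List<String> sink)
            throws DocumentException {
        SAXReader reader = new SAXReader();

        // Called while parsing, once for every element at /root/child.
        reader.addHandler("/root/child", new ElementHandler() {
            public void onStart(ElementPath path) {
                // Nothing to do until the subtree is complete.
            }

            public void onEnd(ElementPath path) {
                Element child = path.getCurrent();
                // The whole <child> subtree is available here.
                sink.add(child.elementText("name"));
                // Remove it from the root before the next child is parsed.
                child.detach();
            }
        });

        // The returned document no longer contains the detached children.
        return reader.read(new StringReader(xml));
    }
}
```

With this in place, only one <child> subtree should be held in memory at
a time, and the document returned at the end has a root with no <child>
elements left.  Inside onEnd() you could run an XPath query against the
child (with Jaxen on the classpath) instead of calling elementText().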


On Mon, 8 Oct 2001, Jon Baer wrote:

> I asked the exact same question a long time back and never really got a response, but
> what I was eventually told was that if I was parsing and building a DOM
> structure of any kind that was something like 7MB, I'd be better off using a
> database and skipping the whole XML process. That got me going on a separate project
> that basically uses a database (HypersonicSQL) as almost a scratch disk for building
> large documents that need later direct manipulation (via XPath): something that
> would more or less take XPath and turn it into a temporary SQL solution.  I have not gotten
> far with it, but for this chatbot application I'm writing it's eventually going to need
> to be done.
> I'd be interested in hearing any other solutions out there.
> - Jon
> Benjamin Kopic wrote:
> > Hi
> >
> > We have an application that processes a data feed and loads it into a
> > database. It builds a JDom Document using SAXBuilder and Xerces, and then it
> > uses Jaxen XPath to retrieve data needed.
> >
> > The problem is that when we parse a 7MB feed the memory usage by Java jumps
> > to 110MB. Has anyone else used JDom to process relatively large data
> > feeds?
> >
> > Best regards
> >
> > Benjamin Kopic
> >
