[jdom-interest] JDOM Performance

Dennis Sosnoski dms at sosnoski.com
Fri Jun 6 11:00:50 PDT 2003


Mike Brenner wrote:

>Kevin wrote:
>
>>        http://www.sosnoski.com/opensrc/xmlbench/index.html
>>
>After it is updated for 2003, and since JDOM is built upon a parser, 
>I guess we would not be concerned about the DOM and SAX numbers,
>because they are parsers, not trees built upon parsers like JDOM is.
>
DOM is a tree representation (THE original tree representation for XML, 
in fact), like JDOM.

>Since EXML is similar to JDOM (built upon a parser),
>and because it was faster than JDOM at the time of the posted benchmark,
>perhaps that is the benchmark to compete against for performance?
>
EXML gave very good performance for very small files, but poor 
performance for larger files. I think this is due to the actual parser 
setup and such - EXML uses its own custom parser, while most of the 
other libraries I tested build on a standard parser. EXML seems 
optimized for SOAP message handling. I don't really recommend it to 
people for other applications, unless they're looking for a small 
footprint library for use in an applet or such.

I've actually dropped EXML from more recent versions of the test. Both 
EXML and XPP (also shown in those test results) had some limitations on 
the XML they could handle, which is not really appropriate for a 
document model - the whole point of using a document model is that you 
can work with any XML document. The speed advantage of EXML for small 
files was eliminated with a faster parser (Piccolo), anyway.

BTW, I put a lot less weight on the tree walking and modification 
performance test results than on the times for going to and from text 
documents. In my experience most applications don't go through a 
document in memory repeatedly. If they do, and performance is a concern, 
they're better off extracting the information they need into a more 
convenient data structure.
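
As a rough illustration of that last point, here is a minimal sketch that 
walks a document once and copies what it needs into a plain Map. It uses 
the standard org.w3c.dom API only so that it runs without extra jars; the 
same idea applies to JDOM. The PriceIndex class, the "item" element and 
the "sku" and "price" attributes are all made up for the example.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example: document layout and names are assumptions.
class PriceIndex {
    // Walk the tree a single time and keep only the data we care about.
    static Map<String, Double> build(Document doc) {
        Map<String, Double> prices = new HashMap<String, Double>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            prices.put(item.getAttribute("sku"),
                       Double.valueOf(item.getAttribute("price")));
        }
        return prices;
    }

    // Convenience parser for the example; wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After build() returns, repeated lookups go against the Map and the 
document tree can be dropped.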

>But, to avoid the memory leakage in StringBuffer, it's not just JDOM that has to 
>allocate many temporary objects. Since the JDOM solution posted on this list
>was to replace 
>	
>	return myStringBuffer.toString();
>
>with 
>
>	return new String(myStringBuffer.toString());
>
>we ourselves now have to allocate "too many" temporary objects also.
>
I haven't looked, but if this is what JDOM is doing when building a 
document from a parse it's probably a poor choice for performance. The 
key issue is that the characters() method of the handler *may* be called 
more than once for a single character data item, but about 99.99 percent 
of the time *won't* be. The most efficient approach to take is just to 
create a String directly on the first call, rather than use a 
StringBuffer. If you get called again, you just append to the existing 
String to create a new one. I think this is what dom4j does.
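
The approach above can be sketched roughly as follows, assuming a SAX2 
DefaultHandler. This is not taken from the JDOM or dom4j source; the 
TextCollector class and its fields are hypothetical names used only to 
show the idea.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects each character data item as a String
// without going through a StringBuffer.
class TextCollector extends DefaultHandler {
    final List<String> texts = new ArrayList<String>();
    private String pending;   // null until characters() has been called

    @Override
    public void characters(char[] ch, int start, int length) {
        if (pending == null) {
            // Common case: the whole item arrives in a single call.
            pending = new String(ch, start, length);
        } else {
            // Rare case: the parser split the item, so build a new String.
            pending = pending + new String(ch, start, length);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    // An element boundary ends the current character data item.
    private void flush() {
        if (pending != null) {
            texts.add(pending);
            pending = null;
        }
    }

    // Convenience method for the example; wraps checked exceptions.
    static List<String> parse(String xml) {
        try {
            TextCollector handler = new TextCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.texts;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In the single-call case this allocates one String per item and nothing 
else; the concatenation cost is only paid when the parser actually 
splits the data.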

  - Dennis



