[jdom-interest] Performance measurements with Saxon
mike at saxonica.com
Thu Sep 13 07:28:12 PDT 2012
O'Neil is working on some refactoring of the wrapper code at the moment;
he'll send you a copy when it's stable. We're trying to reduce the
proliferation of wrapper code so that improvements to algorithms only need
to be made once.
Generally these queries run in far less time than it takes to construct the tree.
In the table I posted, "build-time" is the time to build the model in ms
(say 177ms) and "avg" is the time to run the query in ms (0.04ms for the
simplest queries, about 30ms for the most expensive). So you are right
that if the model has to be built in order to run a single query or
transformation, the build time can be more important than the query
time. This is of course the scenario where lazy construction ought to
play a role.
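To put rough numbers on that, here is a minimal sketch of the arithmetic, using the example figures quoted above (177ms build, 0.04ms and 30ms per query). The class and method names are made up for illustration; this is not part of Saxon or of the benchmark driver.

```java
public class BuildVsQuery {
    // Total elapsed time when the tree is built once and then queried `queries` times.
    public static double totalMs(double buildMs, double queryMs, int queries) {
        return buildMs + queryMs * queries;
    }

    // Fraction of the total elapsed time that is spent building the tree.
    public static double buildShare(double buildMs, double queryMs, int queries) {
        return buildMs / totalMs(buildMs, queryMs, queries);
    }

    public static void main(String[] args) {
        double buildMs = 177.0;              // example build time from the table
        double[] queryTimes = {0.04, 30.0};  // cheapest and most expensive query, in ms
        int[] runCounts = {1, 10, 100};      // hypothetical number of queries per build

        for (double queryMs : queryTimes) {
            for (int runs : runCounts) {
                System.out.printf("query=%.2fms runs=%d total=%.2fms build share=%.1f%%%n",
                        queryMs, runs,
                        totalMs(buildMs, queryMs, runs),
                        100.0 * buildShare(buildMs, queryMs, runs));
            }
        }
    }
}
```

With a single run, the build accounts for most of the elapsed time even for the most expensive query; the balance only shifts once the same tree is queried many times.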
(Most of the XMark queries are linear with document size assuming the
Saxon-EE optimizer is available; if I remember right only one is
quadratic. Of course with non-linear queries, the query time quickly
overtakes the build time as the document size grows.)
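As a rough illustration of that crossover, the sketch below assumes the build time scales linearly with document size and a query scales quadratically. Both scaling models and the 1Mb starting figures (177ms build, 30ms query) are assumptions chosen for the example, not measurements of any particular XMark query, and the names are hypothetical.

```java
public class GrowthCrossover {
    // Cost of a linear operation, given its cost at 1Mb (assumed model).
    public static double linearMs(double msAtOneMb, double sizeMb) {
        return msAtOneMb * sizeMb;
    }

    // Cost of a quadratic operation, given its cost at 1Mb (assumed model).
    public static double quadraticMs(double msAtOneMb, double sizeMb) {
        return msAtOneMb * sizeMb * sizeMb;
    }

    // Size at which the quadratic query costs as much as the linear build:
    // query * s^2 = build * s  =>  s = build / query.
    public static double crossoverMb(double buildMsAtOneMb, double queryMsAtOneMb) {
        return buildMsAtOneMb / queryMsAtOneMb;
    }

    public static void main(String[] args) {
        double buildMs = 177.0; // assumed build time at 1Mb
        double queryMs = 30.0;  // assumed quadratic query time at 1Mb

        System.out.printf("crossover at about %.1f Mb%n", crossoverMb(buildMs, queryMs));
        for (double sizeMb : new double[] {1, 5, 10, 50}) {
            System.out.printf("size=%.0fMb build=%.0fms query=%.0fms%n",
                    sizeMb, linearMs(buildMs, sizeMb), quadraticMs(queryMs, sizeMb));
        }
    }
}
```

Under these assumptions the query becomes the dominant cost once the document is a few megabytes in size.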
In this test we wanted to test our own builders, so we are building the
tree programmatically rather than just invoking the parser; we haven't
tested how this build time compares with the "native" build using the
parser. The only case for using JDOM with Saxon in preference to using
the TinyTree is where the model is built programmatically by a previous
step in the processing pipeline, so this isn't an unreasonable thing to do.
On 13/09/2012 14:19, Rolf Lear wrote:
> Hi Michael.
> I look at those results and I am really pleased that JDOM 2.x is so much
> faster than JDOM 1.x on the query time (twice as fast as JDOM 1.x).
> There were a number of areas in JDOM 2.x that I focused on, memory
> footprint, iterator performance, and parse time. It is really good to see
> that the memory and iterator improvements are reflected in your
> 'independent' tests.
> Of course, it's also instinctive to be competitive.... and, in that light,
> I have to ask:
> - is it possible you can point me to the code you are using for the test
> (especially the 'wrapper layers'), so I can inspect that code, and perhaps
> have a 'second opinion' to see whether the wrapper has room for
> improvement, and also whether JDOM can accommodate the Saxon logic more
> efficiently... I am willing (eager) to spend some time ensuring that the
> combination of JDOM and Saxon is as good as possible.
> - can you give an indication of what the baseline time is for the TinyTree
> query process? The ratios are good to compare one model against the other,
> but, creating the JDOM model takes 110ms less than XOM, and if the queries
> are taking just a few ms, then it stands to reason that JDOM2 outperforms
> XOM substantially for cases where. For example, if the Query takes 5ms,
> then JDOM can query the document 22 times in the time it takes XOM to query
> it once....
> Finally, I already have a scheduled release for JDOM 2.0.4 for early
> October. If it is possible to 'link up' with your Saxon team I think it is
> worth working together so that I can have an even better combination of
> JDOM 2.x and Saxon for release 9.5 of Saxon.... would that be possible? It
> would also be great to get some feedback on the JDOM 2.x apis and whether
> the changes have made it easier (or harder) to integrate with Saxon.... a
> 'debriefing' would be nice.
> Thanks for the feedback on the performance though, it's great to see
> something independent.
> On Thu, 13 Sep 2012 08:08:01 +0100, Michael Kay <mike at saxonica.com> wrote:
>> JDOM2 is now working as an external object model for Saxon.
>> We've done some performance measurements which are summarised here:
>> These figures show that of all the external object models, JDOM2 now
>> comes second (to XOM) in the league. The Saxon driver for XOM is
>> probably the most carefully tuned of all the drivers, which may have
>> something to do with it; also, I believe that XOM added features
>> explicitly for Saxon's use, to make sorting of nodes into document order
>> more efficient.
>> A more detailed breakdown of the results for JDOM1 and JDOM2 is given
>> below. The first group of results are for JDOM1, the second group for
>> JDOM2. For each query in the XMark benchmark, they show the execution
>> time in seconds running against a 1Mb source document; the driver
>> executes each query repeatedly until 1000 iterations or 30 seconds have
>> elapsed.
>> There's a consistent speed-up between JDOM and JDOM2. In the cases where
>> the speed-up is greatest, however, this is in part because of
>> improvements in the Saxon "wrapper": instead of using our own
>> general-purpose implementation of the descendant axis, we now make use
>> of Parent.getDescendants().
>> In these measurements, JDOM2 has slightly lower memory requirements but
>> slightly higher tree-building time; but I wouldn't be 100% confident
>> that either figure is consistent.
>> Our intention is to release Saxon 9.5 (when it's ready) with support for
>> both JDOM and JDOM2.
>> Michael Kay