From jdom at tuis.net  Sun Oct  2 18:36:30 2011
From: jdom at tuis.net (Rolf)
Date: Sun, 02 Oct 2011 21:36:30 -0400
Subject: [jdom-interest] JDOM2 Update.
In-Reply-To: <4E76BDD7.7050505@tuis.net>
References: <4E76BDD7.7050505@tuis.net>
Message-ID: <4E89119E.5010903@tuis.net>

Hi All

Another update.

This has been a busy spell since the last update.

JDOM 1.1.2
==========

This is ready to go, and is going through the final stages of 
publishing. You can see the change-log by either inspecting the 
Changes file or the issues list:

https://github.com/hunterhacker/jdom/blob/jdom-1.1.2/core/CHANGES.txt
https://github.com/hunterhacker/jdom/issues?labels=backport+1.1.2+done&sort=created&direction=desc&state=closed&page=1 


In summary, there have been 14 bug fixes, and the Jar will be available 
on maven-central.

The JDOM 1.x branch will remain open for bug-fixes only.

If you want a sneak peek of the 1.1.2 code you can download the source 
at: https://github.com/hunterhacker/jdom/zipball/jdom-1.1.2

JDOM 2
======

There are two major news items here, and an anticipated plan:
1. the jUnit testing has close to complete coverage, with only some 
unreachable code and some quirky exception cases being missed;
2. basic Generics changes have been made;
3. a JDOM2 plan is being put together.

Unit Tests:
-----------

There are some failing tests, and some ignored tests. The failing tests 
relate to one of two things:
- there is a bug in JDOM where multiple consecutive Text content 
instances are not processed correctly; see 
https://github.com/hunterhacker/jdom/issues/31 . I have ignored some 
tests, and left one failing until this gets resolved.
- the Jaxen code has a bug with respect to the ordering of Attribute and 
Namespace nodes; see http://jira.codehaus.org/browse/JAXEN-215 . It means 
that for Attributes and Namespaces the XPath node set is not returned in 
Document order, and the test cases expect that, so currently they fail. 
(I have a 'patched' version of Jaxen in my environment and the tests pass.)

Generics:
---------

I have done a first pass at Generics for the code. The intention for 
this was to clean up warnings and to get a baseline of a 
'simple' JDOM that's 'neat'. See the conclusion for how to get the code.

I have made only one significant API change, which is to substantially 
extend the 'Filter' interface and its implementing classes. This has 
allowed for a completely 'clean' JDOM2 code base. The Filter API change 
is backward-compatible (unless you happen to have your own 
implementation of the Filter interface), and as a result you should be 
able to use the current JDOM2 code as a drop-in replacement with your 
existing code (except that you will have to change all the org.jdom.* 
imports to org.jdom2.*).

At this point the code is as close as possible to being a 'minimum' 
JDOM2: it is JDOM with Generics, plus a minimum amount of spice on the 
Filter API to make the getContent(Filter) stuff work. See the 
'Conclusion' for how to get the code.
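
To illustrate why a generified filter lets a getContent(Filter)-style call 
return a typed list without casts, here is a minimal, self-contained sketch. 
The names FilterSketch, TypedFilter, byClass and select are hypothetical and 
are not the JDOM2 API; they only show the general idea under that assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterSketch {
    /** Hypothetical generic filter: matching items come back already typed as T. */
    public interface TypedFilter<T> {
        /** Returns the item cast to T if it matches, or null if it does not. */
        T filter(Object content);
    }

    /** Builds a filter that matches any item that is an instance of the given class. */
    public static <T> TypedFilter<T> byClass(final Class<T> type) {
        return new TypedFilter<T>() {
            public T filter(Object content) {
                return type.isInstance(content) ? type.cast(content) : null;
            }
        };
    }

    /** Stand-in for a getContent(Filter)-style method: the caller needs no casts. */
    public static <T> List<T> select(List<?> content, TypedFilter<T> filter) {
        List<T> result = new ArrayList<T>();
        for (Object item : content) {
            T match = filter.filter(item);
            if (match != null) {
                result.add(match);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> mixed = Arrays.<Object>asList("text", 42, "more text", 3.5);
        List<String> strings = select(mixed, byClass(String.class));
        System.out.println(strings);
    }
}
```

The type parameter on the filter is what carries the element type through to 
the returned list, which is why code with its own raw Filter implementation 
would need to be adapted.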


Planning:
---------

At this point the code is 'ripe' for ideas. The regression test harness 
is comprehensive, the code is close to 'clean', and yet it is all still 
very familiar to anyone who knows JDOM.

There is one concern I have with the code in its current state, and 
that is the serialization code, which is haphazard, incomplete, and 
inconsistent. I am not an expert on serialization, so I have left it 
unchanged.

If you ignore serialization issues in Eclipse, there are no longer any 
warnings at all. Running 'FindBugs' identifies only two issue types: 
serialization, and some 'inefficient new Integer()' calls.
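
For context, FindBugs warnings of this kind are normally resolved by boxing 
through Integer.valueOf(int) instead of calling the Integer constructor. The 
sketch below is a generic illustration, not code taken from JDOM; the class 
name BoxingSketch and the method box are made up for the example.

```java
public class BoxingSketch {
    /** Boxes through the cache-aware factory instead of allocating a new Integer. */
    public static Integer box(int value) {
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Values in the range -128..127 are guaranteed to come from a shared cache,
        // so both calls return the same object and no allocation takes place.
        System.out.println(box(100) == box(100));
        // Outside that range, identity is not guaranteed, so compare with equals().
        System.out.println(box(100000).equals(box(100000)));
    }
}
```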

So, given the current state of the code, what comes next?

In the short term I intend to get some builds up on to the web-site so 
that people can play with the code. In addition, there will be some 
statistics related to code coverage, and unit tests. This will give 
people an easy way to track progress, and to play with the code.

Then I intend to fix some of the more 'trivial' bugs that are still 
outstanding, like the TRIM_FULL_WHITE bugs, some Iterator problems, List 
problems, etc.

At the same time I plan on updating the wiki documentation for a bunch 
of things.

So, at the end of this week I expect to have a better idea of what the 
final result should look like.

Conclusion
----------

If you want to have a look at the code, get a feel for what it looks 
like, you can get it very easily. I have tagged the current code state 
with the tag 'jdom2-epoch'. Thus, you can reference that in github, and 
get a zipped package of the code base for that tag. Clicking on this 
link should start a download for you: 
https://github.com/hunterhacker/jdom/zipball/jdom2-epoch

If you have suggestions or ideas for what you think the results should 
look like, now is the time to speak up.

Rolf

From jdom at tuis.net  Sun Oct  2 21:15:39 2011
From: jdom at tuis.net (Rolf)
Date: Mon, 03 Oct 2011 00:15:39 -0400
Subject: [jdom-interest] JDOM2 Update.
In-Reply-To: <4E89119E.5010903@tuis.net>
References: <4E76BDD7.7050505@tuis.net> <4E89119E.5010903@tuis.net>
Message-ID: <4E8936EB.9030202@tuis.net>

Hi All.

And there it is ... short-term goal 1:

http://hunterhacker.github.com/jdom/jdom2/index.html

The page with the JDOM2 metrics. You can browse the jUnit and Cobertura 
reports, and see what's happening.

The page is really 'spartan' right now, and colours and style are 
'lacking', but it has the base details.

Rolf



From jdom at tuis.net  Thu Oct  6 07:42:19 2011
From: jdom at tuis.net (Rolf Lear)
Date: Thu, 06 Oct 2011 10:42:19 -0400
Subject: [jdom-interest] JDOM2 Update.
In-Reply-To: <4E89119E.5010903@tuis.net>
References: <4E76BDD7.7050505@tuis.net> <4E89119E.5010903@tuis.net>
Message-ID: <4e064a94e3e02f037383a57403315aae@tuis.net>


Hi All.

I have decided to fix and backport issue #2 and issue #48 to 1.1.2 as
well. This is going to affect the release timeline of the 1.1.2 code.

As a result, the current tag jdom-1.1.2 in github is going to be moved.
Please don't rely on it.

Issue #48 ( https://github.com/hunterhacker/jdom/issues/48 ) was only
recently found. It has a simple fix that is easy to back-port.

Issue #2 ( https://github.com/hunterhacker/jdom/issues/2 ) should always
have been back-ported, but it somehow slipped through the cracks when I was
looking for issues to backport. I think it's because it was on 'page 2' of
the issue list, and it has a lot of comments that are somewhat complicated
to get your head around.

I like the fix proposed by Brad, but even though the suggested fix is
actually simpler than the current code, it 'reverses' the way JDOM thinks
about the QName and localName values for both Element and Attribute names.
As a result it is a little more complicated to back-port and test. It is
compounded by the issue #1 (defaulted/fixed attributes in a Namespace) fix
which makes that area of code more complex.

I am trying to put together a table of what to expect from a SAX parser
when the three main configurations are used: not-namespace-aware,
namespace-aware, and namespaces-with-prefixes.

JDOM does not support not-namespace-aware SAX parsers, but it should
support the other two modes. The way I see it is that issue #2 is actually
caused by JDOM messing up the assumptions on what data is provided in the
two different supported SAX modes. Further, technically SAX Parsers only
need to 'optionally' support the namespaces-with-prefixes mode, and JDOM
assumes that all parsers do. The different modes of operation set different
expectations on what values are passed in to the SAX 'startElement' event.
In essence, JDOM sets the 'namespaces' feature, but expects the
startElement() event to contain details only provided by the optional (and
not set) 'namespace-prefixes' feature.

The combination of Brad's patch plus the issue #1 fix *should* mean that
JDOM fully supports both SAX parse features "namespaces" and
"namespace-prefixes" : see
http://download.oracle.com/javase/6/docs/api/org/xml/sax/package-summary.html
although if a namespace-aware parser does not provide prefix details then
JDOM will generate 'implementation-specific' prefixes.
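
As a rough illustration of the two supported modes, the sketch below uses 
only the SAX parser bundled with the JDK and records what startElement() 
receives with the "namespace-prefixes" feature off and on. The class name 
SaxModes and the helper startElements are made up for this example, and this 
is not JDOM's SAXHandler. What qName contains when "namespace-prefixes" is 
off depends on the parser, which is exactly the assumption discussed above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxModes {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    /** Parses the XML and returns one description per startElement event. */
    public static List<String> startElements(String xml, boolean withPrefixes) {
        final List<String> events = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, true);
            reader.setFeature(NAMESPACE_PREFIXES, withPrefixes);
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                        String qName, Attributes atts) {
                    // uri and localName are required in namespace-aware mode;
                    // qName and xmlns attributes are only guaranteed with prefixes on.
                    events.add("{" + uri + "}" + localName
                            + " qName='" + qName + "'"
                            + " attributes=" + atts.getLength());
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<p:root xmlns:p='urn:example'><p:child/></p:root>";
        System.out.println(startElements(xml, false));
        System.out.println(startElements(xml, true));
    }
}
```

Comparing the two printed lists on a given parser shows which details a 
handler can safely rely on in each mode.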

As a result I am also back-porting a number of the jUnit tests I have for
JDOM2 to get some sense of reliability in the code.

Expect this to delay 1.1.2 until at least next week.

Rolf


From jdom at tuis.net  Thu Oct 13 17:27:17 2011
From: jdom at tuis.net (Rolf)
Date: Thu, 13 Oct 2011 20:27:17 -0400
Subject: [jdom-interest] JDOM2 and Performance.
Message-ID: <4E9781E5.1080609@tuis.net>

Hi all.

I have put together a 'simple' system for measuring the relative 
performance of JDOM2. The idea is that I need to know whether I am 
improving or breaking JDOM performance as the code evolves.

Currently the metric code is only useful if you compare apples to 
apples, and, in this case, it means processing a single (medium size) 
XML document on my laptop, yada-yada-yada. But, it should be useful as a 
tool to get a feel for what a code-change does.

Already I can see that I probably have an issue in the SAXHandler 
(possibly an issue in JDOM-1.1.2 actually) because 1.1.2 is 5-times 
faster in that area than JDOM2.

I have put together a results page here:

http://hunterhacker.github.com/jdom/jdom2/performance.html

It also describes what each test does. If you are interested in seeing 
the code and what it does have a look here (it is not well documented 
and it is still perhaps evolving):

https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb


Rolf

From mj-lists at expertsystems.se  Fri Oct 14 01:29:23 2011
From: mj-lists at expertsystems.se (Mattias Jiderhamn)
Date: Fri, 14 Oct 2011 10:29:23 +0200
Subject: [jdom-interest] JDOM2 and Performance.
Message-ID: <4E97F2E3.9010406@expertsystems.se>

Tip of the day: http://code.google.com/p/caliper/

</Mattias>

----- Original Message -----
Subject: Re: [jdom-interest] JDOM2 and Performance.
Date: Fri, 14 Oct 2011 10:08:36 +0200
From: Noel Grandin <noel at peralex.com>

Hi

Performance testing on the Java VM is tricky.
To avoid getting caught out by cache-hot/cache-cold and JIT vs. not-JIT 
things, it's preferable to do something like
this in PerfTest#timeRun(Runnable)

// warm up the caches and get the JIT going
for (int i = 0; i < 10; i++) {
    runnable.run();
}

// give the JIT time to run, and get GC to run - GC can be stubborn sometimes
// (Thread.sleep means timeRun must handle or declare InterruptedException)
for (int i = 0; i < 3; i++) {
    Thread.sleep(100);
    System.gc();
}

// need 20 runs to get a decent average and standard deviation
// (Mean and Variance are in jakarta-commons-math, in the
// stat.descriptive.moment package)
Mean mean = new Mean();
Variance variance = new Variance();
for (int i = 0; i < 20; i++) {
    long time1 = System.nanoTime();
    runnable.run();
    long time2 = System.nanoTime();
    mean.increment(time2 - time1);
    variance.increment(time2 - time1);
}

// the standard deviation is the square root of the variance
System.out.println("result = " + mean.getResult() + " +- " + 
        Math.sqrt(variance.getResult()));

Regards, Noel Grandin
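
The same outline can be written without commons-math, using only the JDK. 
The following is a self-contained sketch under that assumption; the class 
name TimingSketch, its Result type and its methods are hypothetical and are 
not the PerfTest code in the JDOM repository.

```java
public class TimingSketch {
    /** Mean and sample standard deviation of the timed runs, in nanoseconds. */
    public static final class Result {
        public final double meanNanos;
        public final double stdDevNanos;

        Result(double meanNanos, double stdDevNanos) {
            this.meanNanos = meanNanos;
            this.stdDevNanos = stdDevNanos;
        }
    }

    /** Runs the task untimed to warm up, then times it the requested number of times. */
    public static Result timeRun(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return summarize(samples);
    }

    /** Computes the mean and the bias-corrected (n - 1) standard deviation. */
    public static Result summarize(long[] samples) {
        double sum = 0;
        for (long sample : samples) {
            sum += sample;
        }
        double mean = sum / samples.length;
        double squares = 0;
        for (long sample : samples) {
            squares += (sample - mean) * (sample - mean);
        }
        double variance = samples.length > 1 ? squares / (samples.length - 1) : 0.0;
        return new Result(mean, Math.sqrt(variance));
    }

    public static void main(String[] args) {
        Result result = timeRun(new Runnable() {
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 10000; i++) {
                    sb.append(i);
                }
                // Use the result so the work cannot be trivially discarded.
                if (sb.length() == 0) {
                    throw new IllegalStateException();
                }
            }
        }, 10, 20);
        System.out.println("result = " + result.meanNanos + " +- " + result.stdDevNanos);
    }
}
```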

-- 

   </Mattias>


From jdom at tuis.net  Fri Oct 14 03:00:14 2011
From: jdom at tuis.net (Rolf Lear)
Date: Fri, 14 Oct 2011 06:00:14 -0400
Subject: [jdom-interest] JDOM2 and Performance.
In-Reply-To: <4E97F2E3.9010406@expertsystems.se>
References: <4E97F2E3.9010406@expertsystems.se>
Message-ID: <4E98082E.20305@tuis.net>

Fascinating.... I could lose some hours to looking in to that, but after 
a quick view, it could earn that time back in no time....

I will check it out properly... though I will probably have to come up 
with a different way of 'packaging' the code if I import something else. 
I am reluctant to do that.... and it may be overkill for what I want.

Thanks

Rolf

On 14/10/2011 4:29 AM, Mattias Jiderhamn wrote:
> Tip of the day: http://code.google.com/p/caliper/
>
> </Mattias>
>


From jdom at tuis.net  Fri Oct 14 05:38:54 2011
From: jdom at tuis.net (Rolf Lear)
Date: Fri, 14 Oct 2011 08:38:54 -0400
Subject: [jdom-interest] JDOM2 and Performance.
In-Reply-To: <4E9806DA.5020605@tuis.net>
References: <4E9781E5.1080609@tuis.net> <4E97EE04.3070307@peralex.com>
	<4E9806DA.5020605@tuis.net>
Message-ID: <845a11cf4091bbff5c2aa45545e27811@tuis.net>


Just got off the train, and I've bumped up the inner iterations to 12, and
it makes no difference w/r/t the timings. But, while looking in to things, I
have identified the changes to the XMLOutputter (
https://github.com/hunterhacker/jdom/commit/bf4fa33d253035edd085c5d190bd818133871742
) as being the cause of the regression in performance between 1.1.2
and 2.x. I am going to have to figure out how the performance of the
'hamlet' XMLOutputter.output(Document) goes from 3ms in JDOM 1.1.2 to 24ms
in 2.x (on my laptop). I think it may have to do with the
Element.getNamespacesIntroduced() code. I will need another train-ride to
fix that!

I am certain that a tool that 'monitors' the performance of the core JDOM
features is essential for the confidence required in any changes we make to
JDOM itself, so I am committed to making sure such a tool is available.

Right now what I have is 'just adequate', I think, but it has rough edges,
and is narrowly scoped.

The question is how much effort should we put in to the 'tool' rather than
the core code? What's the trade-off? Is there a better way of doing it?

Is anyone willing to put together a more thorough 'harness' for JDOM? I
certainly would appreciate that! Something that:
1. is easy to run, so that contributors to JDOM can test their work before
and after their submissions
2. has a way to 'preserve' results in a fashion that makes it easy
to update a web-page
3. is more 'extensible' than what I have done so far (so that 'plugging
in' additional tests is easy....).

I considered something 'on top' of jUnit, but the 'performance' benchmark
is more important for the 'typical' usages of JDOM, whereas the jUnit tests
are more targeted at the atypical execution paths.... We need *fast* core
code, but exceptions and unusual code we don't really care about in respect
to performance.

Rolf


On Fri, 14 Oct 2011 05:54:34 -0400, Rolf Lear <jdom at tuis.net> wrote:
> Hi Noel
> 
> Thanks for that.
> 
> It comes out in the numbers, but, for the record, I am doing something 
> very similar to that.... only the structures are slightly (very)
> different.
> 
> I do a bunch of GC's, and I do one in a different thread with the 
> current thread sleeping, then I repeat the GC's until the size becomes 
> 'stable' at a change of less than 128 bytes.
> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb#L1R33
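
The 'repeat the GC until the heap is stable' idea described in the paragraph 
above might look roughly like the sketch below. It is a simplification that 
runs everything on the calling thread, and the class name GcSettle, the 
method names and the round limit are assumptions for illustration rather 
than the code at the link.

```java
public class GcSettle {
    /** Bytes of heap currently in use, as reported by the runtime. */
    public static long usedMemory() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    /**
     * Requests GC repeatedly until used memory changes by less than the
     * tolerance between two rounds, or until maxRounds is reached.
     * Returns the last measured used-memory figure.
     */
    public static long settle(long toleranceBytes, int maxRounds) {
        long previous = usedMemory();
        for (int round = 0; round < maxRounds; round++) {
            System.gc();
            try {
                // Give the collector (and any finalizers) a moment to work.
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            long current = usedMemory();
            if (Math.abs(current - previous) < toleranceBytes) {
                return current;
            }
            previous = current;
        }
        return previous;
    }

    public static void main(String[] args) {
        System.out.println("settled at " + settle(128, 20) + " bytes");
    }
}
```

System.gc() is only a request to the VM, so the round limit is there to 
guarantee that the loop ends even if the heap never settles.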
> 
> I do a complete once-through of the test suite to warm things up.
> Each once-through runs the code through 6 times (hmmm... I thought it 
> was 12, but that was something else I did yesterday). Each of the actual
> tests 'exercises' the code repeatedly because it's all sort of 
> loop-based code (parsing, scanning, etc.).
> 
> Anyway, the output of the 'warmup' run is always much slower than the 
> remaining 5 'real' runs, and I do the 'real' runs multiple times to 
> ensure there is some stability.
> 
> What you see in the web-page is the result of what I believe to be fully
> JIT-compiled and 'clean' and 'reliable enough' for the purposes I want.
> 
> I know that the Java VM testing is 'tricky' when it comes to 
> performance, and as such I understand that it's easy to get things 
> wrong, and I'll spend more time looking at it to ensure I'm doing the 
> reasonable thing, but, are you suggesting that the code I am running is 
> not actually getting reliable results?
> 
> The code is structured differently to what you have suggested below, 
> but, the entire 'main' loop is warmed up: 
> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb#L1R124
> 
> Then, the main loop is run 5 times, and I visually inspect the numbers 
> to ensure that they are consistent: 
> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb#L1R135
> 
> Between each 'test' I do a full GC with 'bells and whistles'
> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb#L1R160
> 
> It is quite obvious that the runs that come out of the 'real' loops are 
> optimized, cached, etc.
> 
> What is not clear is whether the optimizer has completely compiled out 
> some of the code. I have tried to ensure that it does not by doing some 
> sort of test on each element so that it is not completely ignored.
> Now that I think about it though, maybe the 'devnull' Writer is too 
> 'light' and the optimizer may have skipped it entirely..... 
> and the whole XMLOutputter code with it.... I will check.
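
One common way to reduce that risk is to have the sink do a small amount of 
real work on every character and to read the result after the run, so the 
output has an observable effect. The sketch below shows such a Writer. The 
class name ChecksumWriter is hypothetical and this is not the 'devnull' 
Writer from the linked commit.

```java
import java.io.IOException;
import java.io.Writer;

public class ChecksumWriter extends Writer {
    private long checksum;
    private long count;

    /** Folds every character into a running checksum instead of storing it. */
    @Override
    public void write(char[] buffer, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            checksum = checksum * 31 + buffer[i];
        }
        count += length;
    }

    @Override
    public void flush() {
        // nothing is buffered
    }

    @Override
    public void close() {
        // nothing to release
    }

    /** Reading this after a run makes the written data observable. */
    public long getChecksum() {
        return checksum;
    }

    /** Number of characters written so far. */
    public long getCount() {
        return count;
    }

    public static void main(String[] args) throws IOException {
        ChecksumWriter sink = new ChecksumWriter();
        sink.write("<root><child/></root>");
        System.out.println(sink.getCount() + " chars, checksum " + sink.getChecksum());
    }
}
```

Printing or comparing the checksum at the end of each timed run is what 
keeps the written data 'live' from the optimizer's point of view.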
> 
> So, I appreciate the insight, and I will play around with things to see 
> if increasing the number of warmup and actual 'real' runs changes the 
> numbers.
> 
> I'll look in to making sure that some of the code is not being optimized
> out completely.
> 
> But, my code already is doing pretty much exactly what you are 
> suggesting... (it does not calculate the deviation, but it does ignore 
> the fastest and slowest run.....).
> 
> In fact, it does more because it then repeats the exact same loops 
> multiple times to ensure the averages remain consistent over runs (as it
> happens, it essentially does 20 'runs' of the code to get the results - 
> 5 loops of 6 runs but the 6 only counts as 4 because the best and worst 
> are eliminated).
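
Dropping the fastest and slowest run before averaging, so that 6 runs count 
as 4, can be sketched as follows. The class and method names are made up for 
illustration and the code is not taken from the JDOM performance harness.

```java
import java.util.Arrays;

public class TrimmedMean {
    /** Averages the timings after discarding the single best and single worst run. */
    public static double trimmedMean(long[] timings) {
        if (timings.length < 3) {
            throw new IllegalArgumentException("need at least three timings");
        }
        long[] sorted = timings.clone();
        Arrays.sort(sorted);
        long sum = 0;
        // Skip index 0 (fastest) and the last index (slowest).
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        // Six runs in milliseconds; the 90 is the slow warm-up style outlier.
        long[] runs = {90, 10, 12, 11, 13, 11};
        System.out.println(trimmedMean(runs));
    }
}
```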
> 
> Have you got specific concerns about the code? Did you run it? Do you 
> think the results are 'wrong'?
> 
> Thanks for the insight in to the commons-math code. I'm always 
> 'discovering' more and more 'stuff' in commons code. I have some 'stuff'
> I've done at work I am trying to convince my boss (actually 
> legal&compliance) to let me use in JDOM, but it's the sort of thing that
> belongs in a 'commons' type location, not JDOM....
> 
> Rolf
> 

From jdom at tuis.net  Sat Oct 15 15:17:33 2011
From: jdom at tuis.net (Rolf)
Date: Sat, 15 Oct 2011 18:17:33 -0400
Subject: [jdom-interest] JDOM2 and Performance.
In-Reply-To: <4E9781E5.1080609@tuis.net>
References: <4E9781E5.1080609@tuis.net>
Message-ID: <4E9A067D.6030004@tuis.net>

Hi all.

I've come close to restoring the JDOM 1.1.2 levels of performance.

When 'fixing' code in JDOM2 I came across a number of different places 
where namespace processing is performed (calculating the 'in scope' and 
the 'added' namespaces for an Element). This code was scattered in 
various places, inconsistent, and in some places buggy. I ended up 
stripping all of these places and replacing them all with the 
Content.getNamespacesInScope() concepts.

While convenient, the Content.getNamespacesInScope() methods were (much 
too) slow because they dynamically calculate the Namespaces each time 
they are called (which is fine for unstructured requirements where the 
document structure could change from one moment to the next).

I have thus re-implemented a new 'Namespace Stack' which is much faster 
than a completely dynamic calculation, and it is able to replace the 
various other 'stacks' that were removed before.
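
The general idea of such a stack can be sketched as below: each frame holds 
the prefix-to-URI bindings in scope at one depth, frames are pushed on 
entering an element and popped on leaving it, and the 'added' namespaces 
fall out of the push. This is only an illustration of the technique; the 
class name NamespaceStackSketch and its methods are hypothetical and it is 
not the JDOM2 implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class NamespaceStackSketch {
    // Each frame maps prefix -> URI for everything in scope at that depth.
    private final Deque<Map<String, String>> frames =
            new ArrayDeque<Map<String, String>>();

    public NamespaceStackSketch() {
        Map<String, String> root = new HashMap<String, String>();
        root.put("", ""); // no default namespace
        root.put("xml", "http://www.w3.org/XML/1998/namespace");
        frames.push(root);
    }

    /**
     * Enters an element with the given declarations and returns only the
     * bindings that are new or changed relative to the parent scope.
     */
    public Map<String, String> push(Map<String, String> declared) {
        Map<String, String> parent = frames.peek();
        Map<String, String> added = new HashMap<String, String>();
        for (Map.Entry<String, String> entry : declared.entrySet()) {
            if (!entry.getValue().equals(parent.get(entry.getKey()))) {
                added.put(entry.getKey(), entry.getValue());
            }
        }
        if (added.isEmpty()) {
            frames.push(parent); // nothing changed, so share the parent's frame
        } else {
            Map<String, String> scope = new HashMap<String, String>(parent);
            scope.putAll(added);
            frames.push(scope);
        }
        return added;
    }

    /** Leaves the current element, restoring the parent's scope. */
    public void pop() {
        if (frames.size() == 1) {
            throw new IllegalStateException("no element to leave");
        }
        frames.pop();
    }

    /** Returns the URI bound to the prefix in the current scope, or null. */
    public String lookup(String prefix) {
        return frames.peek().get(prefix);
    }
}
```

Because each frame is computed once on entry, a lookup during output is a 
single map access rather than a walk up the tree on every call.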

This has (mostly) 'restored' the performance of JDOM2's 'guts'; I seem 
to be about 1-2% slower at the moment than JDOM 1.1.2.

If you look at the numbers you will see that the 'Dump' code is still 
slow though. The Dump code dumps the document in the three main formats: 
Pretty, Raw, and Compact. This is running slow, and is probably related 
to the changes made for Issue #31.

I'm going to fix up that performance in XMLOutputter, and hopefully that 
will pull back the performance numbers in the other areas (the 1% - 2%), 
because each of those processes uses the XMLOutputter in some way.
(The Dump is particularly slow because it uses the more complicated 
Pretty and Compact mechanisms.)

The 'performance' page has been updated:
http://hunterhacker.github.com/jdom/jdom2/performance.html

Rolf


From jdom at tuis.net  Tue Oct 18 20:11:24 2011
From: jdom at tuis.net (Rolf)
Date: Tue, 18 Oct 2011 23:11:24 -0400
Subject: [jdom-interest] JDOM2 and Performance.
In-Reply-To: <4E9A067D.6030004@tuis.net>
References: <4E9781E5.1080609@tuis.net> <4E9A067D.6030004@tuis.net>
Message-ID: <4E9E3FDC.90306@tuis.net>

Hi Again.

Just committed a new snapshot of JDOM2 together with the JavaDocs, jUnit 
and coverage reports, and a performance update to:
http://hunterhacker.github.com/jdom/jdom2/

The performance has been mostly restored, and there are big improvements 
in the XPath processing (even though I changed nothing in that area... 
;-) ). It is all to do with more efficient Iterator implementations in 
the ContentList.

See http://hunterhacker.github.com/jdom/jdom2/performance.html

I have done the first major refactor of JDOM2 code, essentially 
rewriting the XMLOutputter code. It is much neater, consistent, and, 
should you need to, it is now completely 'extensible'.

By changing the way the code is structured, the XMLOutputter is now 
reentrant, and yet still just as fast, if not faster for some things.

I have found and fixed a lot of obscure bugs that may have been plaguing 
people even if they did not know it..., like if you have 
xml:space="preserve" embedded in your XML document, then JDOM would 
happily insist on outputting whatever content was inside that in the 
UTF-8 encoding, even if you had requested some other encoding...

This particular refactor has taken a lot of time, so I have to back off 
a little and catch up on some other things in life... back to just 'JDOM 
on the train' for a bit.

Rolf

On 15/10/2011 6:17 PM, Rolf wrote:
> Hi all.
>
> I've come close to restoring the JDOM 1.1.2 levels of performance.
>
> When 'fixing' code in JDOM2 I came across a number of different places
> where namespace processing is performed (calculating the 'in scope' and
> the 'added' namespaces for an Element). This code was scattered in
> various places, inconsistent, and some places were buggy. I ended up
> stripping all of these places and replacing them all with the
> Content.getNamespacesInScope() concepts.
>
> While convenient, the Content.getNamespacesInScope() methods were (much
> too) slow because they dynamically calculate the Namespaces each time
> they are called (which is fine for unstructured requirements where the
> document structure could change from one moment to the next).
>
> I have thus re-implemented a new 'Namespace Stack' which is much faster
> than a completely dynamic calculation, and it is able to replace the
> various other 'stacks' that were removed before.
>
> This has (mostly) 'restored' the performance of JDOM2's 'guts'; I seem
> to be about 1-2% slower at the moment than JDOM 1.1.2.
>
> If you look at the numbers you will see that the 'Dump' code is still
> slow though. The Dump code dumps the document in the three main formats:
> Pretty, Raw, and Compact. This is running slow, and is probably related
> to the changes made for Issue #31.
>
> I'm going to fix up that performance in XMLOutputter, and hopefully that
> will pull back the performance numbers in the other areas (the 1% - 2%)
> because each of those processes uses the XMLOutputter in some way.
> (The Dump is particularly slow because it uses the more complicated
> Pretty and Compact mechanisms.)
>
> The 'performance' page below has been updated...
>
> Rolf
>
> On 13/10/2011 8:27 PM, Rolf wrote:
>> Hi all.
>>
>> I have put together a 'simple' system for measuring the relative
>> performance of JDOM2. The idea is that I need to know whether I am
>> improving or breaking JDOM performance as the code evolves.
>>
>> Currently the metric code is only useful if you compare apples to
>> apples, and, in this case, it means processing a single (medium size)
>> XML document on my laptop, yada-yada-yada. But, it should be useful as a
>> tool to get a feel for what a code-change does.
>>
>> Already I can see that I probably have an issue in the SAXHandler
>> (possibly an issue in JDOM-1.1.2 actually) because 1.1.2 is 5-times
>> faster in that area than JDOM2.
>>
>> I have put together a results page here:
>>
>> http://hunterhacker.github.com/jdom/jdom2/performance.html
>>
>> It also describes what each test does. If you are interested in seeing
>> the code and what it does have a look here (it is not well documented
>> and it is still perhaps evolving):
>>
>> https://github.com/hunterhacker/jdom/commit/8b719c86913398ace8e197b6de145b33d9d300bb
>>
>>
>>
>>
>> Rolf
>
> _______________________________________________
> To control your jdom-interest membership:
> http://www.jdom.org/mailman/options/jdom-interest/youraddr at yourhost.com
>


From jhunter at servlets.com  Sun Oct 23 17:47:57 2011
From: jhunter at servlets.com (Jason Hunter)
Date: Sun, 23 Oct 2011 17:47:57 -0700
Subject: [jdom-interest] [Announce] JDOM 1.1.2 is released
Message-ID: <683DBC34-1A42-4525-BB64-CF368CE44FC1@servlets.com>

I'm happy to announce the release of JDOM 1.1.2 today.  It's a drop-in replacement for JDOM 1.1.1 with more than a dozen bugs fixed.  You can download the release here:

http://jdom.org/dist/binary/

You can see the changes here:

https://github.com/hunterhacker/jdom/blob/jdom-1.1.2/core/CHANGES.txt

It'll appear in maven-central shortly too.

Thanks to Rolf Lear for doing all the heavy lifting for this release!

-jh-


From jdom at tuis.net  Mon Oct 24 02:35:45 2011
From: jdom at tuis.net (Rolf Lear)
Date: Mon, 24 Oct 2011 05:35:45 -0400
Subject: [jdom-interest] [Announce] JDOM 1.1.2 is released
In-Reply-To: <683DBC34-1A42-4525-BB64-CF368CE44FC1@servlets.com>
References: <683DBC34-1A42-4525-BB64-CF368CE44FC1@servlets.com>
Message-ID: <4EA53171.7030705@tuis.net>

And there it is in maven-central:

http://search.maven.org/#artifactdetails%7Corg.jdom%7Cjdom%7C1.1.2%7Cjar

Rolf



From mike at saxonica.com  Mon Oct 24 05:29:22 2011
From: mike at saxonica.com (Michael Kay)
Date: Mon, 24 Oct 2011 13:29:22 +0100
Subject: [jdom-interest]  Performance: JDOM2 and Saxon
In-Reply-To: <4EA5540E.3080201@saxonica.com>
References: <4EA5540E.3080201@saxonica.com>
Message-ID: <4EA55A22.2050702@saxonica.com>

My colleague O'Neil Delpratt has been doing some performance experiments 
with JDOM1 and JDOM2. Here are the results he is getting.


Experiment: I ran a somewhat simplified test harness on the same two 
XPath expressions (i.e. "//@null" and "//node()") on the XML document 
hamlet.xml

Results
Average time taken over 50 runs, excluding the first run.

JDOM1: 273.15ms
JDOM2: 92.56ms
Saxon (TinyTree treeModel): 2.8ms
Saxon (JDOM treeModel): 10.36ms
Saxon (JDOM2 treeModel): 10.82ms

The # of tree nodes:
Saxon: 12097
Standalone JDOM(-2): 19840

The difference in results was down to whitespace between elements 
represented as text nodes in JDOM(-2).
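
[Illustration, not part of the original measurements.] A minimal sketch of
how whitespace-only text nodes can change a node count. It uses the JDK's
own DOM parser rather than JDOM or Saxon, and the class name
NodeCountSketch and the tiny sample document are assumptions made only
for this example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: counts element and text nodes in a JDK DOM tree.
final class NodeCountSketch {

    /** Parses an XML string; checked exceptions are wrapped for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Counts elements and text nodes, optionally skipping blank text. */
    static int count(Node node, boolean skipWhitespaceOnlyText) {
        int total = 0;
        short type = node.getNodeType();
        if (type == Node.ELEMENT_NODE) {
            total = 1;
        } else if (type == Node.TEXT_NODE) {
            boolean blank = node.getNodeValue().trim().isEmpty();
            total = (blank && skipWhitespaceOnlyText) ? 0 : 1;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            total += count(c, skipWhitespaceOnlyText);
        }
        return total;
    }

    public static void main(String[] args) {
        String xml = "<play>\n  <act>\n    <line>To be</line>\n  </act>\n</play>";
        Document doc = parse(xml);
        System.out.println("all nodes:       " + count(doc, false));
        System.out.println("ignoring blanks: " + count(doc, true));
    }
}
```

The gap between the two counts comes only from the indentation between
elements, which is the same kind of difference reported above.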

So: JDOM2 is doing a good job relative to JDOM1, but the XPath engine is 
still very slow compared to Saxon's XPath engine.

The Saxon code for accessing JDOM2 uses the JDOM node.getDescendants() 
method rather than making recursive use of getChildren() as we do with 
JDOM1, and this benefits performance in that without this change, the 
JDOM2 code ran in 12.28ms; but we're still getting slightly slower 
results from JDOM2 despite this improvement.
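
[Illustration only.] The two navigation styles contrasted above can be
sketched on a stand-in tree. The Node class here is hypothetical and is
not a JDOM or Saxon type; collectRecursive mimics repeated getChildren()
calls, while descendants() mimics a single lazy descendant iterator:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class DescendantSketch {

    /** Hypothetical stand-in for a tree node; not a JDOM class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<Node>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    /** Recursive walk: one method call per node, results copied to a list. */
    static void collectRecursive(Node parent, List<Node> out) {
        for (Node child : parent.children) {
            out.add(child);
            collectRecursive(child, out);
        }
    }

    /** Lazy walk in document order using an explicit stack of iterators. */
    static Iterator<Node> descendants(Node root) {
        final Deque<Iterator<Node>> stack = new ArrayDeque<Iterator<Node>>();
        stack.push(root.children.iterator());
        return new Iterator<Node>() {
            @Override public boolean hasNext() {
                // Drop exhausted child iterators until one has a node left.
                while (!stack.isEmpty() && !stack.peek().hasNext()) {
                    stack.pop();
                }
                return !stack.isEmpty();
            }
            @Override public Node next() {
                if (!hasNext()) throw new NoSuchElementException();
                Node n = stack.peek().next();
                stack.push(n.children.iterator()); // visit its children next
                return n;
            }
            @Override public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    /** Joins node names with commas, for easy comparison. */
    static String names(Iterator<Node> it) {
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(it.next().name);
        }
        return sb.toString();
    }
}
```

Both walks visit the same nodes in the same order; the iterator version
avoids building intermediate lists, which is one plausible reason for the
difference measured above.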

I believe the way the measurements were done causes the XPath expression 
to be compiled once and executed repeatedly.
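
[Illustration only.] A minimal sketch of a "compile once, execute
repeatedly" timing loop that discards the first run. It uses the JDK's
javax.xml.xpath engine, not JDOM's or Saxon's, and the class name, the
sample document, and the choice of 51 total runs are assumptions for this
example rather than details of the harness that produced the numbers
above:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathTimingSketch {

    /** Parses an XML string; checked exceptions are wrapped for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Compiles the expression once, then times each evaluation in ns. */
    static long[] timeRuns(Document doc, String expression, int runs) {
        try {
            XPathExpression compiled =
                    XPathFactory.newInstance().newXPath().compile(expression);
            long[] nanos = new long[runs];
            for (int i = 0; i < runs; i++) {
                long start = System.nanoTime();
                NodeList hits =
                        (NodeList) compiled.evaluate(doc, XPathConstants.NODESET);
                nanos[i] = System.nanoTime() - start;
                if (hits == null) throw new IllegalStateException("no result");
            }
            return nanos;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Mean of all samples except the first (treated as warm-up). */
    static double meanExcludingFirst(long[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least 2 samples");
        }
        long sum = 0;
        for (int i = 1; i < samples.length; i++) sum += samples[i];
        return (double) sum / (samples.length - 1);
    }

    public static void main(String[] args) {
        Document doc = parse("<play><act><line>To be</line></act></play>");
        long[] nanos = timeRuns(doc, "//node()", 51); // 1 warm-up + 50 measured
        System.out.println("mean ms: " + meanExcludingFirst(nanos) / 1e6);
    }
}
```

Because compilation happens outside the timed loop, a measurement like
this says nothing about one-off query cost, which matches the caveat in
point (b) below.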

The differences we are seeing from these results are:

(a) The TinyTree is very fast when processing the descendant axis 
(because the nodes are held in an array in document order)

(b) In the scenario where XPath compile time is amortized over many 
executions (the only case we've measured), the Saxon XPath engine is 
much faster than the one built in to JDOM.

(c) JDOM2 is fractionally slower than JDOM1 in its navigational APIs, 
even though its XPath engine is now three times faster.
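
[Illustration of point (a) only.] A generic sketch of why a tree stored as
an array in document order makes the descendant axis cheap: the
descendants of a node are simply the contiguous run of following entries
with a greater depth. The depth-array layout and the name FlatTreeSketch
are assumptions for this example; the actual TinyTree layout is not
described in this thread.

```java
final class FlatTreeSketch {

    /**
     * Nodes are stored in document order and depth[i] is the nesting
     * depth of node i. Returns the exclusive end index of the block of
     * descendants of the given node; the descendants are the indices
     * node + 1 up to (but not including) the returned value.
     */
    static int descendantEnd(int[] depth, int node) {
        int i = node + 1;
        while (i < depth.length && depth[i] > depth[node]) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // root(0) -> a(1) -> [b(2), c(3)], d(4) -> e(5)
        int[] depth = {0, 1, 2, 2, 1, 2};
        int end = descendantEnd(depth, 1);
        for (int i = 2; i < end; i++) {
            System.out.println("descendant of node 1: index " + i);
        }
    }
}
```

Walking the axis is then a plain array scan with no per-node object
navigation.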

Michael Kay
Saxonica

From jdom at tuis.net  Mon Oct 24 07:15:18 2011
From: jdom at tuis.net (Rolf Lear)
Date: Mon, 24 Oct 2011 10:15:18 -0400
Subject: [jdom-interest] Performance: JDOM2 and Saxon
In-Reply-To: <4EA55A22.2050702@saxonica.com>
References: <4EA5540E.3080201@saxonica.com> <4EA55A22.2050702@saxonica.com>
Message-ID: <3ec6e6c19f8da026efff1b706db90422@tuis.net>


Hi Michael, O'Neil

I simply have not looked in to Saxon yet, so I have no frame of reference,
and bear with me on that as it will happen at some point...

There is issue #34 https://github.com/hunterhacker/jdom/issues/34 to track
XSLTransform which I created in response to your suggestions for Saxon...
and I do keep looking at it.

My overall plan has 'always' been to:
1. build a regression test system (junit testcases).
2. build a performance regression test system (PerfTest)
3. make changes for JDOM2 with confidence.

Having built the 'PerfTest' process I've nailed down some of the
performance regressions I introduced, and followed the 'thread' of changes
in to some other areas. It's a little 'aimless', but the current 'theme' is
'performance'.

This is probably a mistake, I should be looking at 'structure' now that I
have the (restored) performance baseline... but the 'performance' thing is
always good, and I find it fun and challenging.

The code is now 'ripe' for looking at structural changes though.

Still, Saxon concerns me from a JDOM perspective because of the
dual-licensing with the 'restricted' free/open version, and the 'complete'
commercial version.

My personal feel for this sort of situation is that the solution from a
JDOM perspective is to keep the JDOM API open, and to make it possible/easy
to use Saxon, but not to include either version of Saxon as the 'default
engine'. Specifically, I don't see JDOM as being an advertising platform
for some commercial product. I know this sort of issue is
debatable/religious/etc. which is why it's important to understand that I
am willing to defer to Jason's judgment on this one. For what it's worth
the company I work for would have to implement special protocol
handling for JDOM if it were to bundle the Saxon code.

On the other hand, I really do appreciate your taking the time to look in
to the integration of Saxon and JDOM.

I have some comments/questions/suggestions:
1. I changed the 'implementation' API of the XPath code when I worked on
the jaxen bugs/issues. The intention was to make it easier (than before) to
have other engines (like Saxon). Did this change help you with your tests?
Could it be done better?
2. Is the integration 'glue' something that can be easily put in
org.jdom2.xpath.saxon ?
3. I implemented new iterator() back-ends for ContentList which are
significantly faster than before in change 41217056 (17th Oct). Is your
test based on JDOM2 from before that? :
https://github.com/hunterhacker/jdom/commit/412170566ebdf8449b442e44f12ed8712d447a19
Those changes should bring the hamlet.getDescendants() down to about 3ms
4. The 'missing' Text nodes are significant.... I am surprised that they
are absent. What is the logic for skipping them?
5. Which leads to the question: How does the Saxon implementation fare on
the unit tests? Can you create a Saxon version of:
https://github.com/hunterhacker/jdom/blob/master/test/src/java/org/jdom2/test/cases/xpath/TestLocalJaxenXPath.java

The 'snapshot' system I have started on the github pages is not very
useful for figuring out what's in the snapshot, and naming the snapshot. I
should fix that.

But, the 'current' snapshots should have the improved iterator:
http://hunterhacker.github.com/jdom/jdom2/snapshot/jdom-2.x-SNAPSHOT.jar

It would be better, though, if you just pulled the latest code,
because there are a couple of other changes that would improve performance
too.

Thanks again

Rolf


From oneil at saxonica.com  Tue Oct 25 03:46:39 2011
From: oneil at saxonica.com (O'Neil Delpratt)
Date: Tue, 25 Oct 2011 11:46:39 +0100
Subject: [jdom-interest] Performance: JDOM2 and Saxon
In-Reply-To: <3ec6e6c19f8da026efff1b706db90422@tuis.net>
References: <4EA5540E.3080201@saxonica.com> <4EA55A22.2050702@saxonica.com>
	<3ec6e6c19f8da026efff1b706db90422@tuis.net>
Message-ID: <4EA6938F.6050205@saxonica.com>

Hi Rolf,

The intention of doing these experiments was not to suggest that we can 
integrate Saxon with JDOM as a package; we recognize that this would 
create questions around the licensing. We were primarily interested in 
the performance of JDOM2 compared to JDOM1 using both the Saxon XPath 
engine and JDOM's embedded XPath engine. We wanted to check that we can 
do JDOM2 as well as JDOM1, and that the performance we get is 
acceptable. We thought we'd let you know the results as they seem to be 
interesting in the context of the JDOM2 project.

Answer to your questions/comments:
1) I don't think we're interfacing with JDOM at that level - we don't 
attempt to make Saxon available using JDOM's APIs, only using Saxon's APIs.

2) Potentially - but I think there could be some difficulties because of 
the need to establish a Saxon Configuration. Using Saxon for individual 
XPath requests without giving Saxon any context that's reused across 
requests would probably perform badly.

3) I confirm the tests on JDOM2 were done using the build after the 17th 
October, therefore including the changes made to the Iterator() for 
ContentList. The tests confirm the results you had published on 
http://hunterhacker.github.com/jdom/jdom2/performance.html

4) Whitespace: not sure of the exact details here, but the general rule 
for XPath 1.0 is that all whitespace is preserved unless otherwise 
specified, whereas in XPath 2.0 it's DTD-sensitive - whitespace in 
element-only content gets removed. We could do a performance comparison 
that eliminated this potential source of differences, but I'm not sure 
we would learn much more from it.

5) I'm not sure this would be productive. Our focus is on running the 
W3C XSLT and XQuery test suites and making sure that the results when 
JDOM is used underneath match the expected results. (We've generally 
only done this for a subset of the tests, and there tend to be some 
differences in test results for different tree models, caused for 
example because some models don't label nodes as IDs or IDREFs, some 
don't expose unparsed entities, etc.)

regards,

Mike and O'Neil




-- 
O'Neil Delpratt
Software Developer, Saxonica Limited
Email: oneil at saxonica.com <mailto:oneil at saxonica.com>
Tel: +44 118 946 5894
Web: http://www.saxonica.com

From jdom at tuis.net  Tue Oct 25 05:26:32 2011
From: jdom at tuis.net (Rolf Lear)
Date: Tue, 25 Oct 2011 08:26:32 -0400
Subject: [jdom-interest] Performance: JDOM2 and Saxon
In-Reply-To: <4EA6938F.6050205@saxonica.com>
References: <4EA5540E.3080201@saxonica.com> <4EA55A22.2050702@saxonica.com>
	<3ec6e6c19f8da026efff1b706db90422@tuis.net>
	<4EA6938F.6050205@saxonica.com>
Message-ID: <065757700bb856fa9c6421011fb8128e@tuis.net>


Excellent. I can work with that, and the feedback is appreciated.

If nothing else, seeing the numbers creates something of a 'baseline'
against which we can set expectations.

Based on your response, and since you are the first 'users' of JDOM2
speaking up (thanks) perhaps some follow-up comments:
1. the XPath area (org.jdom2.xpath.*) is expected to be revised still
(issues #42 and #45). This may impact your work.
2. Have you identified any areas of JDOM2 code which are underperforming?
You mention the "navigational APIs"; can you narrow that down to any
particular iterators/methods?
3. In general is JDOM2 'better' to work with than JDOM1? Is it going in
the right direction? Do you even notice it?
4. Are there any other changes that would make your life easier
(API/etc.)?
 
Thanks

Rolf



From mike at saxonica.com  Tue Oct 25 06:19:53 2011
From: mike at saxonica.com (Michael Kay)
Date: Tue, 25 Oct 2011 14:19:53 +0100
Subject: [jdom-interest] Performance: JDOM2 and Saxon
In-Reply-To: <065757700bb856fa9c6421011fb8128e@tuis.net>
References: <4EA5540E.3080201@saxonica.com> <4EA55A22.2050702@saxonica.com>
	<3ec6e6c19f8da026efff1b706db90422@tuis.net>
	<4EA6938F.6050205@saxonica.com>
	<065757700bb856fa9c6421011fb8128e@tuis.net>
Message-ID: <4EA6B779.4010807@saxonica.com>

On 25/10/2011 13:26, Rolf Lear wrote:
> Excellent. I can work with that, and the feedback is appreciated.
>
> If nothing else, seeing the numbers creates something of a 'baseline'
> against which we can set expectations.
>
> Based on your response, and since you are the first 'users' of JDOM2
> speaking up (thanks) perhaps some follow-up comments:
> 1. the XPath area (org.jdom2.xpath.*) is expected to be revised still
> (issues #42 and #45). This may impact your work.
Well, apart from the comparative testing, we don't actually use that part.
> 2. Have you identified any areas of JDOM2 code which are underperforming?
> You mention the "navigational API's" can you narrow that down to any
> particular iterators/methods?
I think that would need more careful study than we've carried out so 
far. But from what I've gleaned lurking on the list in the last couple 
of months, it wouldn't surprise me at all if namespaces are the culprit. 
They usually are.
> 3. In general is JDOM2 'better' to work with than JDOM1? Is it going in
> the right direction? Do you even notice it?
I don't think we'd have noticed it at all if we hadn't been deliberately 
exploring the way the descendant axis navigation is now done. To be 
honest, my main motivation was to see if there were any ideas here worth 
stealing.
> 4. Are there any other changes that would make your life easier
> (API/etc.)?
Well, apart from a total redesign to make it more strongly typed... The 
most tedious part is probably merging adjacent text nodes. I imagine 
that's a usability hazard for ordinary users too. Also, I suspect that 
sorting nodes into document order is probably more expensive than it 
needs to be.

Michael Kay
Saxonica
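
As an illustration of the "merging adjacent text nodes" chore mentioned in 
the message above, here is a minimal sketch. It assumes a hypothetical 
content model in which a String stands for a text node and any other 
object stands for a non-text node; TextMergeSketch and mergeAdjacentText 
are made-up names, not part of JDOM or Saxon.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TextMergeSketch {

    /**
     * Returns a copy of the content list in which every run of adjacent
     * text items (Strings, in this hypothetical model) is concatenated
     * into a single item. Non-text items are passed through unchanged.
     */
    static List<Object> mergeAdjacentText(List<?> content) {
        List<Object> merged = new ArrayList<Object>();
        StringBuilder pending = null; // text collected since the last non-text item
        for (Object item : content) {
            if (item instanceof String) {
                if (pending == null) {
                    pending = new StringBuilder();
                }
                pending.append((String) item);
            } else {
                if (pending != null) {
                    merged.add(pending.toString());
                    pending = null;
                }
                merged.add(item);
            }
        }
        if (pending != null) {
            merged.add(pending.toString());
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Object> in = Arrays.<Object>asList("a", "b", Integer.valueOf(1), "c");
        System.out.println(mergeAdjacentText(in)); // prints [ab, 1, c]
    }
}
```

A consumer that needs the XPath view of a tree (where adjacent text is one 
node) would have to do something along these lines on every traversal, 
which is why it is tedious.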


From jdom at tuis.net  Sun Oct 30 18:56:44 2011
From: jdom at tuis.net (Rolf)
Date: Sun, 30 Oct 2011 21:56:44 -0400
Subject: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java -
	Java5 or Java6
Message-ID: <4EAE005C.6020608@tuis.net>

Hi all.

When I started with JDOM2 I discussed with Jason whether we should 
target a minimum supported version of Java5 or Java6.

At the time I put together a list of Java6 features I thought could be 
remotely useful:

- JAXB 2.0 including StAX interfaces (we could have a StAXOutputter, etc.)
- Deque - ContentList/AttributeList could/should be a Deque too - making 
ContentList.removeLast() possible, etc.

Those were just off-the-cuff examples of what could be useful. I don't 
think that the ContentList/AttributeLists should be Deques... not now, 
anyway.

Based on that we decided Java5 was still a reasonable target, unless 
something came up.

My biggest concern is that if we introduce JDOM2 as supporting Java5 
then we will have to support Java5 for a long time, introducing a 
testing dependency, and potentially curtailing future options... and 
Java5 itself has been unsupported for years....

Also, as it happens, I have inadvertently (out of habit) used an 
ArrayDeque in the DescendantIterator code as a Stack. It could be easily
replaced with some other collections structure.
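
To make the ArrayDeque point concrete, here is a minimal sketch of a 
descendant walk whose stack is a plain ArrayList, which exists in Java5. 
This is not the actual DescendantIterator code; Node and DescendantWalker 
are hypothetical stand-ins used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a tree node; not a JDOM class. */
final class Node {
    final String name;
    final List<Node> children;

    Node(String name, Node... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

/** Visits all descendants of a root in document order. */
final class DescendantWalker implements Iterator<Node> {
    // Stack of child iterators. ArrayList is available in Java5,
    // whereas ArrayDeque was only added in Java6.
    private final List<Iterator<Node>> stack = new ArrayList<Iterator<Node>>();

    DescendantWalker(Node root) {
        stack.add(root.children.iterator());
    }

    public boolean hasNext() {
        // Pop exhausted iterators off the top of the stack.
        while (!stack.isEmpty() && !stack.get(stack.size() - 1).hasNext()) {
            stack.remove(stack.size() - 1);
        }
        return !stack.isEmpty();
    }

    public Node next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Node node = stack.get(stack.size() - 1).next();
        // Descend into the node's children before its siblings.
        stack.add(node.children.iterator());
        return node;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    /** Collects the names of all descendants in visiting order. */
    static List<String> names(Node root) {
        List<String> out = new ArrayList<String>();
        for (Iterator<Node> it = new DescendantWalker(root); it.hasNext();) {
            out.add(it.next().name);
        }
        return out;
    }
}
```

Pushing and popping at the end of an ArrayList is constant time (amortized 
for the push), so for use as a stack it should behave much like an 
ArrayDeque.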

What has brought this issue to a head though is that I have been working 
on some StAX code, and this is fairly well entrenched in Java6. Support 
for it on Java5 is not nice to add (need to download special jars, 
etc.). It can be done, but is a mess.

Further, I have been looking at outputting to an XMLStreamWriter, and 
the support for that would be useful to add to the XMLOutputter class...
and requiring an additional hard-to-get jar for that would be a real 
drawback.... we may as well just declare it to require Java6.
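
For readers who have not used it, this is roughly what writing to an 
XMLStreamWriter looks like with the javax.xml.stream API bundled with 
Java6. It is only a sketch of the StAX side; it does not show any JDOM2 
outputter, and the class name StaxWriteSketch is made up.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StaxWriteSketch {

    /** Writes a tiny document through StAX and returns it as a String. */
    static String write() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartDocument();
            w.writeStartElement("root");
            w.writeAttribute("id", "1");
            w.writeStartElement("child");
            w.writeCharacters("a < b"); // the writer escapes '<' for us
            w.writeEndElement();        // closes child
            w.writeEndElement();        // closes root
            w.writeEndDocument();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("StAX output failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(write());
    }
}
```

On Java5 the same code compiles only if a separate StAX API jar and an 
implementation are added to the classpath, which is the inconvenience 
described above.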

So, as a poll:

* Does anyone have a realistic need to run a future JDOM2 on Java5?
* If so, could you add additional jars to your classpath just to make 
JDOM2 work?
* Any comments, suggestions.

Currently I feel that it is reasonable to set Java6 as a minimum and not 
even bother trying to think about Java5 issues... anyone disagree?

Rolf

From ian.lea at gmail.com  Mon Oct 31 02:36:12 2011
From: ian.lea at gmail.com (Ian Lea)
Date: Mon, 31 Oct 2011 09:36:12 +0000
Subject: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java
 - Java5 or Java6
In-Reply-To: <07476A68-6C3E-432D-AE6C-B557A7AB6561@hoplahup.net>
References: <4EAE005C.6020608@tuis.net>
	<07476A68-6C3E-432D-AE6C-B557A7AB6561@hoplahup.net>
Message-ID: <CAEY5pxUDU=QJKFvYCt8bzuNJqYmv76kWa6QWq5Ys=Wno6YjjvA@mail.gmail.com>

+1 for java6


--
Ian.


On Mon, Oct 31, 2011 at 6:53 AM, Paul Libbrecht <paul at hoplahup.net> wrote:
>
> On 31 Oct 2011 at 02:56, Rolf wrote:
>
> > * Does anyone have a realistic need to run a future JDOM2 on Java5?
>
> The only one is lack of time of our sysadmin.
> I think it can be ignored.
>
> > * If so, could you add additional jars to your classpath just to make JDOM2
> > work?
>
> Sure, in any case!
>
> > * Any comments, suggestions.
>
> I'm all for jdk 6.
> We also only build for java 6.
> paul
>
> > Currently I feel that it is reasonable to set Java6 as a minimum and not
> > even bother trying to think about Java5 issues... anyone disagree?


From mike at saxonica.com  Mon Oct 31 03:31:30 2011
From: mike at saxonica.com (Michael Kay)
Date: Mon, 31 Oct 2011 10:31:30 +0000
Subject: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java
 - Java5 or Java6
In-Reply-To: <CAEY5pxUDU=QJKFvYCt8bzuNJqYmv76kWa6QWq5Ys=Wno6YjjvA@mail.gmail.com>
References: <4EAE005C.6020608@tuis.net>
	<07476A68-6C3E-432D-AE6C-B557A7AB6561@hoplahup.net>
	<CAEY5pxUDU=QJKFvYCt8bzuNJqYmv76kWa6QWq5Ys=Wno6YjjvA@mail.gmail.com>
Message-ID: <4EAE7902.5010100@saxonica.com>


* Does anyone have a realistic need to run a future JDOM2 on Java5?


I think in the context of a project where people can stick with JDOM1 if they wish, having a dependency on Java 6 seems reasonable at first blush.

But...

In Saxon we want to support JDOM2. But we also want Saxon to work on Java 5. We don't mind having a restriction that you can't use JDOM2 with Saxon if you're on Java 5. But we've certainly got some extra complexities making this work, e.g. we might need to isolate the code that references JDOM2 into a separate package that is compiled under Java 6 and is not statically referenced from the main Saxon JAR file.
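
A minimal sketch of what "not statically referenced" could look like, 
assuming the JDOM2-facing code lives in its own class that is only ever 
named as a string. OptionalAdapterLoader and the class name 
org.example.hypothetical.Jdom2Bridge are invented for illustration and are 
not Saxon's actual code.

```java
final class OptionalAdapterLoader {

    /**
     * Tries to load and instantiate the named class reflectively.
     * Returns null if the class is missing, was compiled for a newer
     * Java version than the running JVM, or cannot be instantiated.
     */
    static Object loadIfAvailable(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ClassNotFoundException e) {
            return null; // optional jar is not on the classpath
        } catch (LinkageError e) {
            // Covers UnsupportedClassVersionError (class built for a newer
            // JVM) and NoClassDefFoundError (a dependency is missing).
            return null;
        } catch (Exception e) {
            return null; // no usable no-arg constructor, or it threw
        }
    }

    public static void main(String[] args) {
        // Hypothetical class name; on a plain classpath this yields null.
        Object bridge = loadIfAvailable("org.example.hypothetical.Jdom2Bridge");
        System.out.println(bridge == null ? "JDOM2 support disabled"
                                          : "JDOM2 support enabled");
    }
}
```

Because the main code only ever mentions the adapter by name, it can still 
load on a Java 5 JVM even though the adapter itself was compiled for Java 6.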

So it would definitely be simpler for us and for our users if it all works under Java 5.

Similar complexities are likely to apply to many other components that want to integrate JDOM2.

 From past experience the more complex the application, the harder it is for people to move forward. Typical scenario: the user has a license for Oracle N; upgrading it to Oracle N+1 will cost millions, but Oracle N only runs under Java version J. There's no business justification for spending the millions, so they're stuck with Java J, and everything else they use in the same application then also has to run under Java J.

Michael Kay
Saxonica


From mikeb at mitre.org  Mon Oct 31 06:04:51 2011
From: mikeb at mitre.org (Brenner, Mike)
Date: Mon, 31 Oct 2011 13:04:51 +0000
Subject: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java
 - Java5 or Java6
In-Reply-To: <4EAE7902.5010100@saxonica.com>
References: <4EAE005C.6020608@tuis.net>
	<07476A68-6C3E-432D-AE6C-B557A7AB6561@hoplahup.net>
	<CAEY5pxUDU=QJKFvYCt8bzuNJqYmv76kWa6QWq5Ys=Wno6YjjvA@mail.gmail.com>
	<4EAE7902.5010100@saxonica.com>
Message-ID: <264449A6A521A14593F58479F46B34EB0241AA@IMCMBX03.MITRE.ORG>

I do not.

-----Original Message-----
From: jdom-interest-bounces at jdom.org [mailto:jdom-interest-bounces at jdom.org] On Behalf Of Michael Kay
Sent: Monday, October 31, 2011 6:32 AM
To: jdom-interest at jdom.org
Subject: Re: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java - Java5 or Java6


* Does anyone have a realistic need to run a future JDOM2 on Java5?


I think in the context of a project where people can stick with JDOM1 if they wish, having a dependency on Java 6 seems reasonable at first blush.

But...

In Saxon we want to support JDOM2. But we also want Saxon to work on Java 5. We don't mind having a restriction that you can't use JDOM2 with Saxon if you're on Java 5. But we've certainly got some extra complexities making this work, e.g. we might need to isolate the code that references JDOM2 into a separate package that is compiled under Java 6 and is not statically referenced from the main Saxon JAR file.

So it would definitely be simpler for us and for our users if it all works under Java 5.

Similar complexities are likely to apply to many other components that want to integrate JDOM2.

 From past experience the more complex the application, the harder it is for people to move forward. Typical scenario: the user has a license for Oracle N; upgrading it to Oracle N+1 will cost millions, but Oracle N only runs under Java version J. There's no business justification for spending the millions, so they're stuck with Java J, and everything else they use in the same application then also has to run under Java J.

Michael Kay
Saxonica




From jdom at tuis.net  Mon Oct 31 06:26:19 2011
From: jdom at tuis.net (Rolf Lear)
Date: Mon, 31 Oct 2011 09:26:19 -0400
Subject: [jdom-interest] Opinion Poll: - JDOM2 and minimum-required Java
 - Java5 or Java6
In-Reply-To: <4EAE005C.6020608@tuis.net>
References: <4EAE005C.6020608@tuis.net>
Message-ID: <716973aed0ca67fc72c533a66ebaf276@tuis.net>


I should add a time-line here.

I think I will sit on this for a few weeks... say until Friday the 18th, 
about three weeks from now.

At that point I will summarize all the responses... and between now and
then I will also see if I can come up with a more detailed list of what the
implications for supporting Java5 are...

Then we can make a more informed decision.

A third option would be to only officially support Java6, but also put
together a document on how to make it work with Java5.

Rolf

On Sun, 30 Oct 2011 21:56:44 -0400, Rolf <jdom at tuis.net> wrote:
> Hi all.
> 
> ...
> 
> So, as a poll:
> 
> * Does anyone have a realistic need to run a future JDOM2 on Java5?
> * If so, could you add additional jars to your classpath just to make 
> JDOM2 work?
> * Any comments, suggestions.
> 
> Currently I feel that it is reasonable to set Java6 as a minimum and not
> even bother trying to think about Java5 issues... anyone disagree?
> 
> Rolf