[jdom-interest] SAX Parser/DTD Validation EXTREMELY slow
Dan.Temple at wcom.com
Wed Oct 16 15:57:13 PDT 2002
Thanks Frank, that did it. All I had to do was put the latest xercesImpl.jar in my classpath. It parsed in 1282 milliseconds instead of the estimated 26+ hours. Nasty!
I appreciate it.
----- Original Message -----
From: Frank Sauer
To: Dan Temple ; jdom-interest at jdom.org
Sent: Wednesday, October 16, 2002 3:23 PM
Subject: RE: [jdom-interest] SAX Parser/DTD Validation EXTREMELY slow
Have you tried an XML parser other than the one packaged in the 1.4 JRE?
Here's something I plucked off the web somewhere on how to do that (it talks about Xalan, but the same applies to Xerces):
Annoyingly, the Xalan-J classes included in Java 1.4 are zipped into the rt.jar archive so it's hard to replace them with a less-buggy release version of Xalan. It can be done, but you have to put the xalan.jar file in your $JAVA_HOME/jre/lib/endorsed directory rather than in the normal jre/lib/ext directory. The exact location of $JAVA_HOME varies from system to system, but it's probably something like C:\j2sdk1.4.0 on Windows. None of this is an issue with Java 1.3 and earlier, which don't bundle these classes. On these systems you just need to install whatever jar files your XSLT engine vendor provides in the usual locations, the same as you would any other third party library.
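To confirm which parser is actually being picked up after moving jars around, a small check along these lines may help. It is only a sketch using the standard JAXP API; the class name WhichParser and the helper saxFactoryClassName are made up for illustration. Note that the java.endorsed.dirs property is only meaningful on older JVMs (the endorsed mechanism was removed in Java 9), so on a current JVM it will normally print null.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper class, not part of JDOM or Xerces.
public class WhichParser {

    // Returns the class name of the SAXParserFactory that JAXP resolves
    // from the current classpath and system properties.
    static String saxFactoryClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("SAXParserFactory in use: " + saxFactoryClassName());
        // An explicit override, if one was given with -D on the command line.
        System.out.println("javax.xml.parsers.SAXParserFactory = "
                + System.getProperty("javax.xml.parsers.SAXParserFactory"));
        // Where an older JVM looks for endorsed jars; usually null on Java 9+.
        System.out.println("java.endorsed.dirs = "
                + System.getProperty("java.endorsed.dirs"));
    }
}
```

If a newer xercesImpl.jar is being found, the first line should name a class from an org.apache.xerces package rather than the JRE's bundled implementation.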
From: Dan Temple [mailto:Dan.Temple at wcom.com]
Sent: Wednesday, October 16, 2002 5:14 PM
To: jdom-interest at jdom.org
Subject: [jdom-interest] SAX Parser/DTD Validation EXTREMELY slow
I have a DTD ELEMENT sequence that has 47 children. When I turn on SAX validation (SAXBuilder(true)) it takes forever to build (saxBuilder.build(File)). OK, not actually forever, but 26 hours! It takes so long that at first I thought that it was in an infinite loop - CPU was at 100%.
By playing with the XML & DTD, I was able to narrow down the problem to just the number of entries in the DTD ELEMENT sequence. It starts falling down at about 20 entries, and the time goes up exponentially from there. For every additional child entry, the time it takes to validate DOUBLES (almost exactly).
My largest actual test was 35 entries & sure enough - over 1 1/2 hours. By my calculations, to process my 47 entries will take over 26 hours. The only solution for me is to turn off SAX parser validation.
I am using the latest code from CVS - Beta8 (tag: jdom_1_0_b8). I am using Java 1.4.0_02. The problem occurs on both Windows 2000 & Solaris 8.
Attached are XMLTest.java, test.xml, & test.dtd. Just compile & run the XMLTest class; it's hard-coded to use the test.xml file, which references the test.dtd file.
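For anyone who wants to measure how validation time grows with the number of entries in a sequence, here is a rough timing sketch. It is not the XMLTest.java mentioned above. It uses plain JAXP rather than JDOM, generates the DTD inline instead of reading test.dtd, and assumes every child in the sequence is optional (c0?, c1?, ...), which is a guess at the kind of content model involved. The names ValidationTiming, buildDocument and timeValidation are made up for illustration, and the code is written for a current JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical timing harness; the DTD shape is an assumption.
public class ValidationTiming {

    // Builds a document whose root element is declared as a sequence
    // of n optional children, with all n children present (n >= 1).
    static String buildDocument(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        StringBuilder model = new StringBuilder();
        StringBuilder decls = new StringBuilder();
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                model.append(", ");
            }
            model.append("c").append(i).append("?");
            decls.append("<!ELEMENT c").append(i).append(" (#PCDATA)>\n");
            body.append("<c").append(i).append(">x</c").append(i).append(">");
        }
        return "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE root [\n"
                + "<!ELEMENT root (" + model + ")>\n"
                + decls
                + "]>\n"
                + "<root>" + body + "</root>";
    }

    // Parses the document with DTD validation switched on and returns
    // the elapsed time in milliseconds. Validity errors are rethrown.
    static long timeValidation(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        DefaultHandler failOnInvalid = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXParseException {
                throw e;
            }
        };
        long start = System.nanoTime();
        parser.parse(new InputSource(new StringReader(xml)), failOnInvalid);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Print one timing per sequence length so any growth is visible.
        for (int n = 5; n <= 25; n += 5) {
            long ms = timeValidation(buildDocument(n));
            System.out.println(n + " entries: " + ms + " ms");
        }
    }
}
```

Running it once with the default parser and once with a different parser jar on the classpath gives a quick side-by-side comparison; raise the upper bound of the loop with care if the times start doubling.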