[jdom-interest] JDOMSource problems with Xalan 2.5.2: vanishing DOCTYPE white spaces

Laurent Bihanic laurent.bihanic at atosorigin.com
Wed May 5 06:37:28 PDT 2004


Hi,

While looking at the SAXOutputter bug reported by Gary, I noticed that the
SAXOutputter code for outputting the internal subset is strangely different
from the corresponding code in XMLOutputter.
More precisely, SAXOutputter outputs either the DTD public and system ids or
the internal subset, while XMLOutputter can also output both (ids and internal
subset).
See methods dtdEvents(Document document) in SAXOutputter and 
printDocType(Writer out, DocType docType) in XMLOutputter.

Can a DTD expert have a look and tell me which code is the correct one?
(I suspect that XMLOutputter is right.)
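
To make the difference concrete, here is a minimal standalone sketch of a
DOCTYPE builder that emits the ids and the internal subset together. It is
not the JDOM code: the class name DoctypeSketch and the method buildDoctype
are hypothetical, and the layout of the internal subset (brackets on their
own lines) is only an assumption for illustration. It also keeps the space
before the system id that the fix quoted below adds.

```java
public class DoctypeSketch {

    /**
     * Builds a DOCTYPE declaration string. Hypothetical helper, not part of JDOM.
     * publicID, systemID and internalSubset may each be null.
     * Note: XML requires a system id whenever a public id is present;
     * this sketch does not enforce that.
     */
    public static String buildDoctype(String rootName, String publicID,
                                      String systemID, String internalSubset) {
        StringBuilder buf = new StringBuilder("<!DOCTYPE ").append(rootName);

        if (publicID != null) {
            buf.append(" PUBLIC \"").append(publicID).append('"');
        } else if (systemID != null) {
            buf.append(" SYSTEM");
        }
        if (systemID != null) {
            // The leading space separates the system id from the SYSTEM
            // keyword or from the public id, whichever came before it.
            buf.append(" \"").append(systemID).append('"');
        }
        if (internalSubset != null && internalSubset.length() > 0) {
            // Ids and internal subset are not mutually exclusive.
            buf.append(" [\n").append(internalSubset).append("\n]");
        }
        return buf.append('>').toString();
    }

    public static void main(String[] args) {
        // Ids taken from Gary's report; the entity declaration is made up.
        System.out.println(buildDoctype(
            "message",
            "-//TSN//DTD Leader 1.0/EN",
            "file:///home/ticker/ticker/fantasysports/tsn/AutoRacingDriverProfile.dtd",
            "<!ENTITY example \"value\">"));
    }
}
```

With only a system id the sketch yields <!DOCTYPE message SYSTEM "...">, and
with no ids and no internal subset it yields a bare <!DOCTYPE message>.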

Laurent


Laurent Bihanic wrote:
> Indeed, this is a bug in SAXOutputter.
> 
> To fix it, look for the method dtdEvents(Document document). In the 
> following code:
>                 // No internal subset defined => Try to parse original DTD
>                 if ((publicID != null) || (systemID != null)) {
>                     if (publicID != null) {
>                         buf.append(" PUBLIC ");
>                         buf.append('\"').append(publicID).append('\"');
>                     }
>                     else {
>                         buf.append(" SYSTEM ");
>                     }
>                     buf.append('\"').append(systemID).append('\"');
>                 }
>                 else {
>                     // Doctype is totally empty! => Skip parsing
>                     buf.setLength(0);
>                 }
> 
> Replace the lines:
>                         buf.append(" SYSTEM ");
>                     }
>                     buf.append('\"').append(systemID).append('\"');
> by:
>                         buf.append(" SYSTEM");
>                     }
>                     buf.append(" \"").append(systemID).append('\"');
> 
> Hope this helps,
> 
> Laurent
> 
> 
> Gary Lawrence Murphy wrote:
> 
>> I have a legacy application that worked fine with jdom-b8 and
>> Xalan 2.0, but for other reasons we have to upgrade to Xalan 2.5.2
>> which we've tried with both the XercesImpl shipped with it and with 
>> Xerces 2.6.0 with the same results:
>>
>> 1) First, we create the jdom.Document dom ...
>>
>>    SAXBuilder builder = new SAXBuilder(getSaxClass(), false);
>>
>>    builder.setFeature(
>>        "http://apache.org/xml/features/nonvalidating/load-external-dtd",
>>         false);
>>    dom  = builder.build(new StringReader(doc.toString()));
>>
>> 2) Then create the JDOMSource and transformer
>>
>>    JDOMSource jds = new JDOMSource(dom);
>>
>>    TransformerFactory tfact = TransformerFactory.newInstance();
>>    // set tfact optimize on and incremental off
>>
>>    Transformer xslt = tfact.newTransformer(new StreamSource(xsl));
>>    // xsl is a File object to our transform
>>
>> 3) I then apply the transform ...
>>
>>    JDOMResult newdom = new JDOMResult();
>>    xslt.transform( jds, newdom);
>>
>> and get an untrappable error on stderr:
>>
>> [Fatal Error] :1:53: White spaces are required between publicId and 
>> systemId.
>>
>> "1:53" does indeed refer to the space between the public and system
>> identifiers in the input files, but those identifiers are clearly
>> separated by a space, a real ASCII space.
>>
>> when I check the input documents ...
>>
>>    DocType dtd = dom.getDocType();
>>
>>     cat.debug( "Process " + doc.getTag()
>>                + "\n PublicID = " + dtd.getPublicID()
>>                + "\n SystemID = " + dtd.getSystemID());
>>
>>     cat.debug( dom.toString());
>>
>> I get data that looks just fine ...
>>
>>     2004-01-16 15:19:07,396 [Thread-2] DEBUG     
>> ca.cbc.sportwire.dochandler.ToNewsMLFilter  -     Process 
>> AutoRacingDriverProfile.dtd #1167699
>>     PublicID = -//TSN//DTD Leader 1.0/EN
>>     SystemID = 
>> file:///home/ticker/ticker/fantasysports/tsn/AutoRacingDriverProfile.dtd
>>
>>     2004-01-16 15:19:07,396 [Thread-2] DEBUG     
>> ca.cbc.sportwire.dochandler.ToNewsMLFilter  -
>>     [Document: [DocType: <!DOCTYPE message PUBLIC "-//TSN//DTD Leader 
>> 1.0/EN" 
>> "file:///home/ticker/ticker/fantasysports/tsn/AutoRacingDriverProfile.dtd">], 
>> Root is [Element: <message/>]]
>>
>> This suggests a problem somewhere between the jdom.Document and the
>> Transformer, as if the JDOMSource is somehow trimming this space.
>>
>> This code worked fine with the older Xalan and I have upgraded the
>> following jars to match the Xalan release:
>>
>>     xalan-2.5.2.jar
>>     xercesImpl-2.6.0.jar
>>     xml-apis-2.6.0.jar
>>
>> and the associated JDOM jar files ...
>>
>>     28404 Jan 15 16:35 jaxp.jar
>>     160967 Jan 15 15:14 jaxen-core.jar
>>     5949 Jan 15 15:14 jaxen-jdom.jar
>>     135363 Jan 15 15:14 jdom.jar
>>     23563 Jan 15 15:14 saxpath.jar
>>
>> I am using Linux 2.4.21-0.13mdk i686 and java version "1.4.1_03"
>>
>>     Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_03-b02)
>>     Java HotSpot(TM) Client VM (build 1.4.1_03-b02, mixed mode)
>>
>> I've verified the same results using 1.4.2 both on the Sun and the
>> Blackdown ports.
>>
>> What's worse, while we get this "Fatal Error" on every document, it
>> does not appear to be fatal at all; processing continues on after the
>> transform, and the transformed doc is correctly generated!
>>
>> Any and all insights, ideas or probable cause theories are most
>> welcome.


