[jdom-interest] XMLOutputter changes from b8 to b9

Phil Weighill-Smith phil.weighill-smith at volantis.com
Sun Jan 18 12:05:42 PST 2004


I'm only 80% sure an XSD can do what Dave wants (we only use XSDs these days, in preference to DTDs). We define the attribute and element content types: for example, you can create restricted string types that define the allowed content of an attribute or an element and that explicitly allow leading and/or trailing spaces. Clearly, you also have to enable XSD validation in the parser. You *could* be right, though, since we've not tried this sort of thing with an XSD... I guess it's "give it a try" or "RTFM" or some such... ;n)
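
For anyone who wants to "give it a try", here is a minimal sketch of the two pieces mentioned above: a restricted string type that allows leading and trailing spaces, and a parser with XSD validation switched on. It uses the JDK's own JAXP classes rather than JDOM so that it runs without extra jars. The schema, the type name paddedWord and the class name are made up for illustration. The sketch only shows that validation is active; it says nothing about whether whitespace gets reported as ignorable.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class XsdValidationSketch {

    // Hypothetical schema: <type> holds letters with optional leading and
    // trailing whitespace. xs:string preserves whitespace, so the pattern
    // is matched against the text exactly as written.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:simpleType name='paddedWord'>"
      + "<xs:restriction base='xs:string'>"
      + "<xs:pattern value='\\s*[A-Za-z]+\\s*'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "<xs:element name='type' type='paddedWord'/>"
      + "</xs:schema>";

    /** Parses xml with XSD validation enabled and counts validation errors. */
    static int countValidationErrors(String xml) {
        final int[] errors = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors[0]++; // validation problems arrive here
            }
        };
        try {
            Schema schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // required for schema validation
            factory.setSchema(schema);       // this is what enables XSD validation
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        // Leading spaces are allowed by the restricted type.
        System.out.println(countValidationErrors("<type>  Migration</type>"));
        // Digits are not allowed by the pattern, so errors are reported.
        System.out.println(countValidationErrors("<type>  Migration 42</type>"));
    }
}
```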

	-----Original Message----- 
	From: Dennis Sosnoski [mailto:dms at sosnoski.com] 
	Sent: Fri 16/01/2004 20:05 
	To: Phil Weighill-Smith 
	Cc: Beleznay, Dave; jdom-interest at jdom.org 
	Subject: Re: [jdom-interest] XMLOutputter changes from b8 to b9
	
	

	I don't think an XSD will do this for you - AFAIK in order to be
	reported as ignorable whitespace you have to use a DTD, since that's how
	the XML recommendation defines it.
	
	  - Dennis
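
To illustrate the point about DTDs, the following sketch parses the fragment from the original question twice with the JDK's SAX parser: once with an internal DTD that declares element-only content, and once without any DTD. It counts how many characters arrive through ignorableWhitespace() versus characters(). The DTD and the class name are invented for this example, and plain JAXP stands in for JDOM's SAXBuilder.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class IgnorableWhitespaceDemo {

    static final String BODY =
        "<enterprise>\n"
      + "<properties>\n"
      + "    <datasource>WebCT</datasource>\n"
      + "    <type>  Migration</type>\n"
      + "</properties>\n"
      + "</enterprise>\n";

    // Hypothetical DTD: enterprise and properties have element-only content,
    // so whitespace between their children is "whitespace in element content".
    static final String DTD =
        "<!DOCTYPE enterprise [\n"
      + "  <!ELEMENT enterprise (properties)>\n"
      + "  <!ELEMENT properties (datasource, type)>\n"
      + "  <!ELEMENT datasource (#PCDATA)>\n"
      + "  <!ELEMENT type (#PCDATA)>\n"
      + "]>\n";

    /** Returns { chars reported as ignorable, chars reported as text }. */
    static int[] countWhitespace(String xml, boolean validating) {
        final int[] counts = new int[2];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                counts[0] += length;
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                counts[1] += length;
            }
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // fail loudly if the document is not valid
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating); // DTD validation on or off
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] withDtd = countWhitespace(DTD + BODY, true);
        int[] withoutDtd = countWhitespace(BODY, false);
        System.out.println("with DTD:    ignorable=" + withDtd[0] + " text=" + withDtd[1]);
        System.out.println("without DTD: ignorable=" + withoutDtd[0] + " text=" + withoutDtd[1]);
    }
}
```

With the DTD in place, only the indentation between elements goes through ignorableWhitespace(); the two spaces in front of "Migration" are still delivered as ordinary text, because type is declared as #PCDATA.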
	
	Phil Weighill-Smith wrote:
	
	> Have you tried 1) having an XSD that indicates where whitespace is not
	> ignorable and 2) calling
	> SAXBuilder#setIgnoringElementContentWhitespace(true)?
	>
	> By doing both of these you give the (XERCES) parser the chance to
	> determine what whitespace is ignorable and the SAXBuilder the ability
	> to correctly ignore ignorable whitespace...
	>
	> Phil :n)
	>
	> On Thu, 2004-01-15 at 23:40, Beleznay, Dave wrote:
	>
	>>Hi There,
	>>
	>>We've recently upgraded from Jdom b8 to b9, and had a few errors in the
	>>upgrade process.
	>>
	>>If I have an XML document like so:
	>>
	>>        String xml =
	>>            "<enterprise>\n"+
	>>            "<properties>\n"+
	>>            "    <datasource>WebCT</datasource>\n"+
	>>            "    <type>  Migration</type>\n"+
	>>            "    <datetime>2002-06-06T14:59:05</datetime>\n"+
	>>            "</properties>\n"+
	>>            "</enterprise>\n";
	>>
	>>Where the spaces in front of <datasource> are not relevant, but the
	>>spaces inside the elements (e.g. <type>) are relevant (this is just a
	>>fragment of a larger bit of XML, it isn't really the <type> field that
	>>matters here). When I put the document into Jdom, I get different
	>>behaviour between Jdom b8 and b9.  I'd like to know the expected
	>>behaviour, and if it doesn't match my desired behaviour, approximately
	>>how I'm supposed to fix my code. 
	>>
	>>
	>>Using the string above and the following code in Jdom b8  I get the
	>>output below.
	>>
	>>        SAXBuilder builder = new SAXBuilder();
	>>        Document doc = builder.build(new StringReader(xml));
	>>       
	>>        XMLOutputter xmlOutputter = new XMLOutputter("\t", true);
	>>        xmlOutputter.setOmitDeclaration(true);
	>>        xmlOutputter.setLineSeparator("\n");
	>>        //xmlOutputter.setTextTrim(true);
	>>        String output = xmlOutputter.outputString(doc.getRootElement());
	>>        System.out.println(output);
	>>
	>>Desired output ( and output received from b8):
	>>
	>>output=
	>>"<enterprise>\n\t<properties>\n\t\t<datasource>WebCT</datasource>\n\t\t<type>  Migration</type>\n\t\t<datetime>2002-06-06T14:59:05</datetime>\n\t</properties>\n</enterprise>"
	>>
	>>
	>>When we upgraded to Jdom b9 we were in for a little bit of a surprise.
	>>
	>>Output from b9 without TextTrim:
	>>
	>>output= "<enterprise>\n\t\n\n\t<properties>\n\t\t\n    \n\t\t<datasource>WebCT</datasource>\n\t\t\n    \n\t\t<type>  Migration</type>\n\t\t\n    \n\t\t<datetime>2002-06-06T14:59:05</datetime>\n\t\t\n\n\t</properties>\n\t\n\n</enterprise>"
	>>
	>>Output from b9 with TextTrim:
	>>
	>>output=
	>>"<enterprise>\n\t<properties>\n\t\t<datasource>WebCT</datasource>\n\t\t<type>Migration</type>\n\t\t<datetime>2002-06-06T14:59:05</datetime>\n\t</properties>\n</enterprise>"
	>>
	>>(this is close, but it took out the spaces before "  Migration")
	>>
	>>Unfortunately the code farther down the line (not using jdom) which is
	>>analyzing the xml has problems with the string
	>>"\n\t\t\n    \n\t\t<datasource>WebCT</datasource>" and interprets the value as
	>>" WebCT". I'm not happy with that either, but right now it's easier to fix
	>>the behaviour of Jdom.
	>>
	>>It looks like this was changed in XMLOutputter 1.87, and I'm trying to
	>>figure out why.  I'd like to remove the whitespace outside the elements,
	>>while preserving the whitespace inside.  As a temporary measure I've
	>>added the check for currentFormat.newlines back to our skipLeadingWhite
	>>method in XMLOutputter, but I'd like a more permanent solution. 
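
One possible approach to "remove the whitespace outside the elements, while preserving the whitespace inside" is to prune whitespace-only text nodes from the tree before it is output. The sketch below shows the idea on the JDK's org.w3c.dom tree rather than on a JDOM Document, so it is an illustration of the traversal and not a change to XMLOutputter. The class and method names are made up. It assumes that a whitespace-only text node inside an element that also has child elements is just formatting, which would be wrong for mixed content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class StripInterElementWhitespace {

    static final String XML =
        "<enterprise>\n"
      + "<properties>\n"
      + "    <datasource>WebCT</datasource>\n"
      + "    <type>  Migration</type>\n"
      + "    <datetime>2002-06-06T14:59:05</datetime>\n"
      + "</properties>\n"
      + "</enterprise>\n";

    /** Parses an XML string into a DOM document (no DTD, no validation). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Removes whitespace-only text nodes from every element that has at
     * least one child element. Text inside leaf elements is left untouched.
     */
    static void strip(Element parent) {
        boolean hasElementChild = false;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                break;
            }
        }
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                strip((Element) child);
            } else if (hasElementChild
                    && child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                parent.removeChild(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        strip(doc.getDocumentElement());
        Node type = doc.getElementsByTagName("type").item(0);
        System.out.println("[" + type.getTextContent() + "]");
    }
}
```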
	>>
	>>Thank you very much.
	>>
	>>Cheers,
	>>
	>>Dave
	>>
	>>--
	>>David Beleznay
	>>Software Engineer
	>>WebCT
	>>
	> -- Phil Weighill-Smith <phil.weighill-smith at volantis.com> Volantis Systems
	>
	>
	> 
	>
	
	


