[jdom-interest] setIgnoringElementContentWhitespace inoperant ?

Wed Dec 8 16:41:50 PST 2004

On 9/12/04 0:37, Jason Hunter (jhunter at xquery.com) wrote:

> Not all whitespace is ignorable.  Ignorability comes from the DTD where
> the parser knows certain sections are element only so the whitespace
> doesn't matter.
> 
> Now, if you output from JDOM with a compact format then JDOM will strip
> the whitespace for you.
> 
> -jh-
> 
> Eric VERGNAUD wrote:
>> Hi,
>> 
>> I'm using jdom 1.0. I have the following code:
>> 
>>     public static Document ReadDocument(File inFile)
>>         throws JDOMException, IOException
>>     {
>>         SAXBuilder sax = new SAXBuilder();
>>         sax.setIgnoringElementContentWhitespace(true);
>>         return sax.build(inFile);
>>     }
>> 
>> I use this code to parse a document that has been serialized in pretty
>> format. There are plenty of 0x0D 0x0A and 0x20 between the elements.
>> I was hoping sax.setIgnoringElementContentWhitespace would clean that up,
>> but it does not.
>> 
>> Am I missing something?
>> 
>> Eric
>> 
>> 

I want the final document to look pretty, so doing the cleanup on output
would take two cycles: parse the document, save it in compact format, read
the compact document back, and save it in pretty format.

Since these particular documents have no DTD, is there a way to tell the
parser that no element has mixed content, so that it can perform the cleanup?
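
To make the cleanup I have in mind concrete, here is a minimal sketch of it
done as a post-parse step rather than by the parser. It assumes the "no mixed
content" rule holds: any whitespace-only text inside an element that also has
child elements is treated as ignorable and removed. To stay self-contained it
walks a standard W3C DOM tree (JAXP) instead of a JDOM tree; the class and
method names (WhitespaceStripper, parse, stripIgnorableWhitespace) are
hypothetical and not part of JDOM or any other library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper, shown on a W3C DOM tree instead of a JDOM tree.
public class WhitespaceStripper {

    /** Parses XML text without a DTD or validation; wraps checked errors. */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            return factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Returns true if the node has at least one element child. */
    static boolean hasElementChild(Node node) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                return true;
            }
        }
        return false;
    }

    /**
     * Assumption: no mixed content. Whitespace-only text nodes are removed
     * from every node that has element children. Text inside leaf elements
     * is left untouched, including its leading and trailing spaces.
     */
    public static void stripIgnorableWhitespace(Node node) {
        boolean elementContent = hasElementChild(node);
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (elementContent && child.getNodeValue().trim().isEmpty()) {
                    node.removeChild(child);
                }
            } else {
                stripIgnorableWhitespace(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\r\n  <a> keep me </a>\r\n  <b/>\r\n</root>");
        Node root = doc.getDocumentElement();
        System.out.println(root.getChildNodes().getLength()); // 5
        stripIgnorableWhitespace(doc);
        System.out.println(root.getChildNodes().getLength()); // 2
    }
}
```

The walk only looks at node types and text values, so the same idea should
carry over to a JDOM tree built by SAXBuilder, but that port is not shown
here and would need checking against the JDOM API.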

These documents are not serialized using JDOM. They are uploaded to a Tomcat
servlet which simply parses them (using JDOM), and sometimes sends them as
web content. It's this web content I want to look as nice as possible.

Eric