[jdom-interest] setIgnoringElementContentWhitespace inoperant
per.norrman at austers.se
Wed Dec 8 15:47:16 PST 2004
Eric VERGNAUD wrote:
> I'm using jdom 1.0. I have the following code:
> public static Document ReadDocument(File inFile)
>         throws JDOMException, IOException {
>     SAXBuilder sax = new SAXBuilder();
>     return sax.build(inFile);
> }
> I use this code to parse a document that has been serialized in pretty
> format. There are plenty of 0x0D 0x0A and 0x20 between the elements.
> I was hoping sax.setIgnoringElementContentWhitespace would clean that up,
> but it does not.
> Am I missing something?
First, this feature depends on validation being turned on: the parser
needs the DTD's content models to tell which whitespace is ignorable.
Second, it does not apply to elements with mixed content. Let's see if
I can find the relevant part of the spec ...
Yes, here it is: http://www.w3.org/TR/2004/REC-xml-20040204/#sec-element-content
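
To illustrate both points, here is a minimal sketch. It uses the JDK's
built-in JAXP DOM parser rather than JDOM so that it runs without extra
jars; JDOM 1.0's SAXBuilder has corresponding setValidation and
setIgnoringElementContentWhitespace setters. The class name, the helper
names, the sample document and its internal DTD are all made up for the
example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class IgnorableWhitespaceDemo {

    // Hypothetical internal DTD: <root> has element-only content,
    // <mixed> has mixed content (text and <item> children).
    static final String DTD =
        "<!DOCTYPE root [\n"
      + "  <!ELEMENT root (item, mixed)>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "  <!ELEMENT mixed (#PCDATA | item)*>\n"
      + "]>\n";

    // Pretty-printed body with CR LF and spaces between the elements.
    static final String BODY =
        "<root>\r\n"
      + "  <item>a</item>\r\n"
      + "  <mixed>\r\n"
      + "    <item>b</item>\r\n"
      + "  </mixed>\r\n"
      + "</root>";

    // Parses the given XML and returns its root element. Ignoring of
    // element-content whitespace is always requested; only the
    // validation setting varies.
    static Element parseRoot(String xml, boolean validating) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validating);
            factory.setIgnoringElementContentWhitespace(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the demo short.
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Counts the text nodes that are direct children of the given node.
    static int textChildren(Node parent) {
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // No DTD, no validation: the parser has no content model to consult.
        Element plain = parseRoot(BODY, false);
        // DTD present and validation on.
        Element validated = parseRoot(DTD + BODY, true);
        Node mixed = validated.getElementsByTagName("mixed").item(0);

        System.out.println("root, no DTD:      " + textChildren(plain));
        System.out.println("root, validated:   " + textChildren(validated));
        System.out.println("mixed, validated:  " + textChildren(mixed));
    }
}
```

With the JDK's bundled parser I would expect the unvalidated root to keep
its three whitespace-only text nodes, the validated root to have none, and
the validated <mixed> element to keep both of its whitespace runs, since
whitespace in mixed content is never treated as ignorable.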
It would have been nice if there were a special set of whitespace
characters that could be used only for indentation and line breaks on
output, and that would always be skipped on input.