[jdom-interest] XMLOutputter and newlines after declaration/doctype

Norrman Per per.norrman at canovia.se
Wed Dec 18 17:18:13 PST 2002


OK, I buy that.
I'm henceforth in complete disagreement with my latest post ;-)

/pmn

-----Original Message-----
From: Alex Rosen
To: hip at a.cs.okstate.edu; per.norrman at canovia.se;
Vadim.Strizhevsky at morganstanley.com
Cc: jdom-interest at jdom.org
Sent: 2002-12-19 01:47
Subject: RE: [jdom-interest] XMLOutputter and newlines after
declaration/doctype

Unfortunately it's trickier than that. Parsers do not provide any
information about a document's formatting, other than the logical
meaning of the document. For example, suppose you have:

<?xml version="1.0"?>
<!DOCTYPE repository
   PUBLIC "-//blah..."
   "http://blah..."> 
<repository>
   <item
        att1="value1"
        att2="value2"
   />
</repository>

The only whitespace that the parser tells JDOM about is the whitespace
after each element. We don't know that there's a newline after the XML
declaration and the DOCTYPE declaration, and we don't know that the
DOCTYPE and the attributes are split up into multiple lines. So there's
no way we can output the file exactly the way it came in. When we output
the file, we just have to choose some arbitrary formatting in these
places, and hope that it's what the user wants. (Or that the user
doesn't care very much.) The current choice, where we add newlines after
the XML and DOCTYPE declarations, and all attributes are put on the same
line, is probably what most people want, and that's the best we can do
automatically.
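
A minimal sketch of the point above, using the JDK's built-in DOM parser
as a stand-in for the SAX parser underneath JDOM (an assumption made so
the example runs without the JDOM jar; the class and method names are
hypothetical). It parses the same logical document in two physical
layouts and checks that the resulting trees are indistinguishable. The
DOCTYPE is left out so that no external DTD has to be fetched.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FormattingLossDemo {
    // Newline after the XML declaration, attributes split across lines.
    static final String MULTI_LINE =
        "<?xml version=\"1.0\"?>\n"
      + "<repository>\n"
      + "   <item\n"
      + "        att1=\"value1\"\n"
      + "        att2=\"value2\"\n"
      + "   />\n"
      + "</repository>";

    // No newline after the declaration, attributes on one line.
    // The whitespace between elements is kept identical on purpose.
    static final String COMPACT =
        "<?xml version=\"1.0\"?><repository>\n"
      + "   <item att1=\"value1\" att2=\"value2\"/>\n"
      + "</repository>";

    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // True when both inputs produce structurally equal element trees.
    static boolean sameTree(String a, String b) {
        return parse(a).getDocumentElement()
                       .isEqualNode(parse(b).getDocumentElement());
    }

    // Number of nodes directly under the document node.
    static int topLevelNodeCount(String xml) {
        return parse(xml).getChildNodes().getLength();
    }

    public static void main(String[] args) {
        System.out.println("same tree: " + sameTree(MULTI_LINE, COMPACT));
        System.out.println("top-level nodes: " + topLevelNodeCount(MULTI_LINE));
    }
}
```

The newline after the declaration never shows up as a node, and the
line breaks between attributes are gone by the time the tree exists, so
an outputter working from the tree has nothing to reproduce them from.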

Separately, there's the problem that when you're building a document
in-memory, your elements will normally be output all on one line,
because you haven't added any inter-element whitespace:

<repository><item att1="value1" att2="value2"/></repository>

You could fix this by manually adding the correct text nodes under each
element, keeping track of the indent level, but that's a big pain. So we
added this pretty-printing feature to XMLOutputter, so it can add the
newlines and indents for you on output. If you're parsing and then
outputting the document, you should not use this pretty-printing
feature, because the original document probably already has
inter-element whitespace that you want to preserve. If you're
creating the document in-memory, you should use it, to make the output
readable. But note that this is a separate issue from whether or not we
use newlines after the XML and DOCTYPE declarations, which is a choice
that should not be influenced by where the document comes from.
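
To illustrate the in-memory case, here is a rough sketch that uses the
JDK's own DOM and javax.xml.transform serializer rather than JDOM's
XMLOutputter (a deliberate substitution so the example is
self-contained; the class and method names are hypothetical). It builds
the document without any whitespace text nodes and serializes it once
as-is and once with the serializer's indent option turned on.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class InMemoryOutputDemo {
    // Build <repository><item .../></repository> with no text nodes at all.
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .newDocument();
            Element repository = doc.createElement("repository");
            Element item = doc.createElement("item");
            item.setAttribute("att1", "value1");
            item.setAttribute("att2", "value2");
            repository.appendChild(item);
            doc.appendChild(repository);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Serialize without an XML declaration; 'indent' toggles pretty-printing.
    static String serialize(Document doc, boolean indent) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.INDENT, indent ? "yes" : "no");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(serialize(doc, false)); // everything on one line
        System.out.println(serialize(doc, true));  // line breaks added on output
    }
}
```

With JDOM itself the corresponding switch is the pretty-printing
setting on XMLOutputter described above; the exact amount of
indentation the JDK serializer produces varies between JDK versions.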

We could add a separate setting to control this newline behavior, but
clearly that falls well on the other side of the 80/20 rule. Some
projects would add to their API for the 5% case, and some wouldn't;
generally JDOM doesn't like to.

Alex

>>> Norrman Per <per.norrman at canovia.se> 12/18/02 07:00PM >>>
Actually, your suggestion produces exactly the same output
as
   XMLOutputter.output(document, out);


I think Vadim is right. XMLOutputter#printDocType (and other methods)
should respect the newLines flag. If I parse the document

<?xml version="1.0" ?><!DOCTYPE repository SYSTEM "repository.dtd"
><repository/>

it should not come back as

<?xml version="1.0" ?>
<!DOCTYPE repository SYSTEM "repository.dtd" >
<repository/>

especially when considering this excerpt from XMLOutputter javadoc:

Several modes are available to effect the way textual content is printed.
All modes are configurable through corresponding set*() methods. Below is a
table which explains the modes and the effect on the resulting output.


Text Mode    Resulting behavior.  
Default      All content is printed in the format it was created, no
             whitespace or line separators are added or removed.

Just my opinion.

/pmn


-----Original Message-----
From: Bradley S. Huffman
To: Vadim.Strizhevsky at morganstanley.com 
Cc: jdom-interest at jdom.org 
Sent: 2002-12-18 23:44
Subject: Re: [jdom-interest] XMLOutputter and newlines after
declaration/doctype 

Vadim Strizhevsky writes:

> For now I will have XMLOutputter subclass that duplicates
> printDeclaration and printDocType methods and doesn't print the newlines
> when newline option is set to false.

Or use a sequence like

       out.write("<?xml version=\"1.0\"?>");
       XMLOutputter.output(document.getDocType(), out);
       XMLOutputter.output(document.getRootElement(), out);

> Also I noticed a single use of "\n" instead of lineSeparator in the
> latest printDocType code.

Hmm, thought that was changed.

Brad
_______________________________________________
To control your jdom-interest membership:
http://lists.denveronline.net/mailman/options/jdom-interest/youraddr@yourhost.com
###########################################

This message has been scanned by F-Secure Anti-Virus for Microsoft
Exchange.
For more information, connect to http://www.F-Secure.com/ 