[jdom-interest] Fwd: Formatting differences after migrating to JDOM2

Robert Krüger krueger at lesspain.de
Mon Oct 7 00:40:12 PDT 2013


On Sun, Oct 6, 2013 at 10:05 PM, Rolf <jdom at tuis.net> wrote:
> Hi Robert.
>
> OK. I have spent some time going through things, and, admittedly, this is
> confusing, and working through the combinations/permutations for formatting
> is liable to end in a headache.
>
> So, I think I have resolved that there are a number of issues at hand in
> your case:
> 1. JDOM2 is doing different things than JDOM1
> 2. JDOM1 is probably doing the wrong thing in this case
> 3. JDOM2 is also probably doing the wrong thing, but, in fairness, changing
> the 'TextMode' of a PrettyPrint format is a 'dangerous' thing .... not by
> design, but because of the actual implementation and choices the formatter
> makes with the pretty format.
> 4. If whitespace is significant for certain members of an XML document then
> you should not be relying on the whim of JDOM to make things right, but you
> should be using the xml:space="preserve" mechanism that is designed for this
> purpose.
>
> So, here are a few 'answers'.
>
> Answer 0:
> =====================================================
> The output you are getting from JDOM 1.x is broken. If you have a 'preserve'
> text mode then there should be no whitespace between any elements
> (indenting/newlines) because that is not 'preserved' space (it's 'invented'
> whitespace).

Yes, after thinking about it that was more or less the answer I expected.

>
> The output you currently get relies on a bug in JDOM 1.x.
>
> Answer 1:
> =====================================================
> The "right" thing for you to do is to add the xml:space="preserve" to the
> sub2 elements:
>
>
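>     import org.jdom2.Attribute;
>     import org.jdom2.Document;
>     import org.jdom2.Element;
>     import org.jdom2.Namespace;
>     import org.jdom2.output.Format;
>     import org.jdom2.output.XMLOutputter;
>
>     public class PreserveExample {
>         public static void main(String[] argv) throws Exception {
>             Document document = new Document();
>             // xml:space="preserve"; cloned because an Attribute can have
>             // only one parent Element.
>             Attribute cloneme = new Attribute("space", "preserve",
>                     Namespace.XML_NAMESPACE);
>
>             Element root = new Element("root");
>             document.addContent(root);
>             Element sub1 = new Element("sub1");
>             root.addContent(sub1);
>             sub1.addContent(new Element("sub2").setText("Some text")
>                     .setAttribute(cloneme.clone()));
>             sub1.addContent(new Element("sub2")
>                     .setText("  text with left and right whitespace  ")
>                     .setAttribute(cloneme.clone()));
>             Format fmt = Format.getPrettyFormat();
>             XMLOutputter xout = new XMLOutputter(fmt);
>             xout.output(document, System.out);
>         }
>     }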
>
> Gives the output:
>
> <root>
>   <sub1>
>     <sub2 xml:space="preserve">Some text</sub2>
>     <sub2 xml:space="preserve">  text with left and right whitespace  </sub2>
>   </sub1>
> </root>
>
> Answer 2:
> =====================================================
> The "OK" thing for you to do is to use TextMode.TRIM_FULL_WHITE instead of
> TextMode.PRESERVE. The default TextMode for PrettyPrint is TextMode.TRIM,
> which removes whitespace from both ends of the text, whereas
> TRIM_FULL_WHITE removes text only when it consists entirely of whitespace
> and leaves it untouched if it contains any non-whitespace character. Be
> aware that other tools (JDOM, xmllint) have the right to mess with the
> whitespace ( http://www.w3.org/TR/REC-xml/#sec-white-space ). It is only by
> convention that the following will work in JDOM (I recommend preserving
> whitespace correctly with xml:space="preserve"):
>
>
>     import org.jdom2.Document;
>     import org.jdom2.Element;
>     import org.jdom2.output.Format;
>     import org.jdom2.output.XMLOutputter;
>
>     public class TrimFullWhiteExample {
>         public static void main(String[] argv) throws Exception {
>             Document document = new Document();
>             Element root = new Element("root");
>             document.addContent(root);
>             Element sub1 = new Element("sub1");
>             root.addContent(sub1);
>             sub1.addContent(new Element("sub2").setText("Some text"));
>             sub1.addContent(new Element("sub2")
>                     .setText("  text with left and right whitespace  "));
>             Format fmt = Format.getPrettyFormat();
>             // Drop text only when it is entirely whitespace.
>             fmt.setTextMode(Format.TextMode.TRIM_FULL_WHITE);
>             XMLOutputter xout = new XMLOutputter(fmt);
>             xout.output(document, System.out);
>         }
>     }
>
> Gives the output:
>
>
> <root>
>   <sub1>
>     <sub2>Some text</sub2>
>     <sub2>  text with left and right whitespace  </sub2>
>   </sub1>
> </root>
>
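
To make the difference between the text modes in Answer 2 concrete, here is
a minimal sketch in plain Java. It is not JDOM's implementation: the class
and method names are made up for illustration, and it only models what TRIM
and TRIM_FULL_WHITE do to a single text value, assuming XML's definition of
whitespace (space, tab, CR, LF).

```java
public class TextModeSketch {
    // XML 1.0 whitespace: space, tab, carriage return, line feed.
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    static boolean isAllWhitespace(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!isXmlWhitespace(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Models TRIM: strip whitespace from both ends of the text.
    static String trim(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isXmlWhitespace(s.charAt(start))) {
            start++;
        }
        while (end > start && isXmlWhitespace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    // Models TRIM_FULL_WHITE: drop the text only if it is all whitespace,
    // otherwise leave it exactly as it is.
    static String trimFullWhite(String s) {
        return isAllWhitespace(s) ? "" : s;
    }

    public static void main(String[] args) {
        String padded = "  text with left and right whitespace  ";
        String blank = "\n    ";
        System.out.println("[" + trim(padded) + "]");
        System.out.println("[" + trimFullWhite(padded) + "]");
        System.out.println("[" + trimFullWhite(blank) + "]");
    }
}
```

In this model the padded text loses its outer spaces under TRIM but survives
unchanged under TRIM_FULL_WHITE, while whitespace-only text (such as
indentation read back from a file) is dropped.
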
> Answer 3:
> =====================================================
> JDOM 2.x uses a different (faster, and more flexible) algorithm for output
> handling. This algorithm has two major triggers: the TextMode and the
> Indent. PrettyPrint sets the TextMode to TRIM and the Indent to two spaces
> ("  "). The TRIM mode tells JDOM it can mess with whitespace in Text. The
> Indent tells JDOM it can mess with the formatting of the XML structure
> (setting it to null tells JDOM not to mess with any indenting).
> You have been changing the TextMode to PRESERVE, and, as I think about that,
> JDOM should never mess with the indenting when the mode is PRESERVE. JDOM
> has code to make sure that it manages the INDENT and the TextMode correctly
> when they need to change internally, but you are basically creating an
> invalid combination by setting an Indent and PRESERVE at the same time. JDOM
> should handle that better.
>
> But the right thing is that, when you set PRESERVE, JDOM2 should output the
> following (all on one line):
>
> <root><sub1><sub2>Some text</sub2><sub2>  text with left and right whitespace  </sub2></sub1></root>
>
> So, I think there's a bug in JDOM2: given the input you have
> ( Format.getPrettyFormat().setTextMode(TextMode.PRESERVE) ), it should be
> outputting the above (which is not what you want).
>
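
The rule described in Answer 3 (with PRESERVE, no whitespace should be
invented between elements) can be modelled with a toy serializer. This is a
sketch in plain Java under stated assumptions, not JDOM code: Node, render
and sample are hypothetical names, text is not escaped, and String.trim()
stands in for JDOM's TRIM mode.

```java
import java.util.ArrayList;
import java.util.List;

public class IndentVsPreserveSketch {
    // Hypothetical minimal element: a name plus either text or children.
    static final class Node {
        final String name;
        final String text; // null when the node holds child elements
        final List<Node> children = new ArrayList<>();

        Node(String name, String text) {
            this.name = name;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // The document used throughout this thread.
    static Node sample() {
        Node sub1 = new Node("sub1", null)
                .add(new Node("sub2", "Some text"))
                .add(new Node("sub2", "  text with left and right whitespace  "));
        return new Node("root", null).add(sub1);
    }

    // In this model PRESERVE wins: a requested indent is ignored, because
    // indentation would be invented rather than preserved whitespace.
    static String render(Node root, String indent, boolean preserve) {
        StringBuilder sb = new StringBuilder();
        render(root, preserve ? null : indent, preserve, 0, sb);
        return sb.toString();
    }

    private static void pad(StringBuilder sb, String indent, int depth) {
        for (int i = 0; indent != null && i < depth; i++) {
            sb.append(indent);
        }
    }

    private static void render(Node n, String indent, boolean preserve,
                               int depth, StringBuilder sb) {
        pad(sb, indent, depth);
        sb.append('<').append(n.name).append('>');
        if (n.text != null) {
            sb.append(preserve ? n.text : n.text.trim());
        } else {
            if (indent != null) {
                sb.append('\n');
            }
            for (Node child : n.children) {
                render(child, indent, preserve, depth + 1, sb);
            }
            pad(sb, indent, depth);
        }
        sb.append("</").append(n.name).append('>');
        if (indent != null) {
            sb.append('\n');
        }
    }

    public static void main(String[] args) {
        System.out.print(render(sample(), "  ", false)); // indented, trimmed
        System.out.println(render(sample(), "  ", true)); // single line
    }
}
```

With preserve set to true the toy serializer emits everything on one line and
keeps the padded text intact, even though an indent of two spaces was
requested.
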
> Answer 4:
> =====================================================
> You can use the Raw format, and output the spaces yourself by adding your
> own indenting and newlines.
>

Thanks a lot for your in-depth answer! It helps a lot.

Robert

