[jdom-interest] XMLOutputter PI and formatting problem...

John Pang jpang at viviance.com
Thu Aug 9 11:34:06 PDT 2001


Hi,

I have an updateFile(File xml) method which changes some of the elements in
an XML file and saves it. This method is called every time someone wants to
make a change to the file. It uses SAXBuilder (with
setExpandEntities(false)) to build the Document, and XMLOutputter (with
setNewLines(true)) to write it back to the file.

I'm having some trouble with unexpanded entities. Basically, in the output,
each unexpanded entity starts on a new line. This is ok, but when I call the
updateFile() method on that file again, I get another line break added
before and after the unexpanded entity. So the more times I call the method,
the more line breaks I get between my entities!
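
To make the effect concrete, here is a toy model of the round trip. It is
not JDOM code: parse() and output() are hypothetical stand-ins which assume
only that the parser keeps whitespace in text and that the outputter starts
each piece of mixed content on a new line. Under those assumptions the
newline count grows on every pass:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RoundTripGrowth {
    // Matches an unexpanded entity reference such as &ouml;
    private static final Pattern ENTITY = Pattern.compile("&[A-Za-z]+;");

    /** Toy parser (assumption): splits content into text and entity tokens, keeping whitespace. */
    static List<String> parse(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = ENTITY.matcher(text);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                tokens.add(text.substring(last, m.start()));
            }
            tokens.add(m.group());
            last = m.end();
        }
        if (last < text.length()) {
            tokens.add(text.substring(last));
        }
        return tokens;
    }

    /** Toy outputter (assumption): puts a newline before every token and one at the end. */
    static String output(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String t : tokens) {
            sb.append('\n').append(t);
        }
        return sb.append('\n').toString();
    }

    static int countNewlines(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '\n') {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String content = "hello w&ouml;rld";
        for (int pass = 1; pass <= 3; pass++) {
            content = output(parse(content));
            System.out.println("pass " + pass + ": " + countNewlines(content) + " newlines");
        }
    }
}
```

Each pass turns the line breaks written by the previous pass into text,
and then wraps that text in new line breaks of its own.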

I know I can use XMLOutputter.setTextNormalize(true). This cures the new
line problem, but introduces another one. If I have two entities separated
by a space, e.g. "hell&ouml; &uuml;mbrella" (hellö ümbrella), the space
gets trimmed.
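
A small sketch of why the space disappears. This is a guess at the
mechanism rather than JDOM's actual code: it assumes normalization trims
each text node separately, so a text node holding only the space between
two entity references collapses to nothing. writeNormalized() and the token
list are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class NormalizeDropsSpace {
    /** Toy normalizing writer (assumption): trims each text token, writes entity tokens as-is. */
    static String writeNormalized(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String t : tokens) {
            boolean isEntity = t.startsWith("&") && t.endsWith(";");
            sb.append(isEntity ? t : t.trim());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "hell&ouml; &uuml;mbrella" split the way a parser with
        // unexpanded entities might hand it over
        List<String> tokens = Arrays.asList("hell", "&ouml;", " ", "&uuml;", "mbrella");
        System.out.println(writeNormalized(tokens));
    }
}
```

The whitespace-only token between the two entities is the only thing
separating the words, so trimming it joins them.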

Also, new lines are not introduced when the element content is just a String
or CDATA, but they are introduced for other combinations containing
Strings. For example:

<element>hello world</element>  ---> this is ok

<element>hello w&ouml;rld</element> ---> the text starts on new lines
                                         (as well as the entity) and becomes:

<element>
hello
w&ouml;
rld
</element>

If I pass the second example through my updateFile() method, I get more
line breaks every time I use the method. :(

I was going to override the printElementContent() method in XMLOutputter in
a subclass, but this is not possible as it needs to access some private
variables. Would it be possible to make XMLOutputter more subclass friendly?
Or maybe change it directly so that new line handling is more consistent for
element text, and unexpanded entities do not start on new lines?

I've gone and done the latter, and attached the code for the overridden
method. Would it be viable to do this? If not, any ideas on what else I can
do?

Suggestions appreciated,
John Pang

-------------- next part --------------
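    /**
     * <p> This will handle printing out an <code>{@link
     * Element}</code>'s content only, not including its tag,
     * attributes, and namespace info.  </p>
     *
     * @param element <code>Element</code> to output.
     * @param out <code>Writer</code> to write to.
     * @param indentLevel <code>int</code> level of indentation.
     * @param namespaces <code>NamespaceStack</code> passed on to
     *                   <code>printElement</code> for child elements.
     * @param eltContent <code>List</code> of the element's content to print.
     */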
    protected void printElementContent(Element element, Writer out,
                                       int indentLevel,
                                       NamespaceStack namespaces,
                                       List eltContent) throws IOException {
        // get same local flags as printElement does
        // a little redundant code-wise, but not performance-wise
        boolean empty = eltContent.size() == 0;

        // Calculate if the content is String/CDATA only
        boolean stringOnly = true;
        if (!empty) {
            stringOnly = isStringOnly(eltContent);
        }

        if (stringOnly) {
            Class justOutput = null;
            boolean endedWithWhite = false;
            Iterator itr = eltContent.iterator();
            while (itr.hasNext()) {
                Object content = itr.next();
                if (content instanceof String) {
                    String scontent = (String) content;
                    if (justOutput == CDATA.class &&
                          textNormalize &&
                          startsWithWhite(scontent)) {
                        out.write(" ");
                    }
                    printString(scontent, out);
                    endedWithWhite = endsWithWhite(scontent);
                    justOutput = String.class;
                }
                else {
                    // We're in a CDATA section
                    if (justOutput == String.class &&
                          textNormalize &&
                          endedWithWhite) {
                        out.write(" ");  // padding
                    }
                    printCDATA((CDATA)content, out);
                    justOutput = CDATA.class;
                }
            }
        }
        else {
            // Iterate through children
            Object content = null;
            Class justOutput = null;
            boolean endedWithWhite = false;
            boolean wasFullyWhite = false;
            Iterator itr = eltContent.iterator();
            while (itr.hasNext()) {
                content = itr.next();
                // See if text, an element, a PI or a comment
                if (content instanceof Comment) {
                    if (!(justOutput == String.class && wasFullyWhite)) {
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
                    printComment((Comment) content, out);
                    justOutput = Comment.class;
                }
                else if (content instanceof String) {
                    String scontent = (String) content;
                    if (justOutput == CDATA.class &&
                          textNormalize &&
                          startsWithWhite(scontent)) {
                        out.write(" ");
                    }
                    else if (justOutput != CDATA.class &&
//                             justOutput != String.class) {    // jp
                             justOutput != String.class &&      // jp
                             justOutput != null &&              // jp
                             justOutput != Element.class &&     // jp
                             justOutput != EntityRef.class) {   // jp
                        System.out.println("@@@ string =\"" + scontent + "\" justOutput = " + justOutput);
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
                    System.out.println("$$$ string =\"" + scontent + "\" justOutput = " + justOutput);
                    printString(scontent, out);
                    endedWithWhite = endsWithWhite(scontent);
                    justOutput = String.class;
                    wasFullyWhite = (scontent.trim().length() == 0);
                }
                else if (content instanceof Element) {
                    if (!(justOutput == String.class/* && wasFullyWhite*/)) {   //jp
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
                    printElement((Element) content, out,
                                 indentLevel, namespaces);
                    justOutput = Element.class;
                }
                else if (content instanceof EntityRef) {
/* commented out by jp
                    if (!(justOutput == String.class && wasFullyWhite)) {
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
*/
                    printEntityRef((EntityRef) content, out);
                    justOutput = EntityRef.class;
                }
                else if (content instanceof ProcessingInstruction) {
                    if (!(justOutput == String.class && wasFullyWhite)) {
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
                    printProcessingInstruction((ProcessingInstruction) content,
                                               out);
                    justOutput = ProcessingInstruction.class;
                }
                else if (content instanceof CDATA) {
                    if (justOutput == String.class &&
                          textNormalize &&
                          endedWithWhite) {
                        out.write(" ");  // padding
                    }
                    else if (justOutput != String.class &&
                             justOutput != CDATA.class) {
                        maybePrintln(out);
                        indent(out, indentLevel);
                    }
                    printCDATA((CDATA)content, out);
                    justOutput = CDATA.class;
                }
                // Unsupported types are *not* printed, nor should they exist
            }
            if (justOutput != String.class) { //jp
                maybePrintln(out);
                indent(out, indentLevel - 1);
            } // jp
        }
    }  // printElementContent
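
For readers who want the gist of the String branch without the whole
method: the condition marked "jp" above amounts to "start text on a new
line only after a Comment or a ProcessingInstruction". The sketch below
restates just that test in isolation. The Kind enum and breakBeforeString()
are hypothetical names, with NONE standing in for justOutput == null; the
separate textNormalize padding after CDATA is left out.

```java
public class NewlineRule {
    /** Hypothetical stand-in for the Class values stored in justOutput. */
    enum Kind { NONE, STRING, CDATA, ELEMENT, ENTITY_REF, COMMENT, PI }

    /** Mirrors the modified else-if: true when text should start on a new, indented line. */
    static boolean breakBeforeString(Kind justOutput) {
        return justOutput != Kind.CDATA
            && justOutput != Kind.STRING
            && justOutput != Kind.NONE
            && justOutput != Kind.ELEMENT
            && justOutput != Kind.ENTITY_REF;
    }

    public static void main(String[] args) {
        // Print the decision for every kind of preceding content
        for (Kind k : Kind.values()) {
            System.out.println(k + " -> " + breakBeforeString(k));
        }
    }
}
```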

