[jdom-interest] Beta8 Document and DocType semantics incorrect?

Les Hill leh at galaxynine.com
Mon Apr 22 13:36:22 PDT 2002


<jhunter at servlets.com> writes:
> In JDOM the DocType is not treated as content of the document; it's
> treated as a property of the document.  That's far more natural for
> programmers.  I don't believe this violates production 22 because that's
> for the *text* representation of the document, and JDOM both outputs and
> reads XML in accordance with that production rule.

While I agree with making things more natural for programmers, production 22
is more than a textual representation: it structurally allows comments and
PIs to occur both before *and* after a doc type declaration.  One simple use
case is a PI that directs the parsing of the document and occurs before either
the internal or external subset.
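
To make the ordering concrete, here is a minimal sketch that uses plain SAX
from the JDK (not JDOM) to record prolog-level events in the order the parser
reports them.  The class name PrologOrder, the helper events(), and the sample
document are made up for illustration; it assumes the default JAXP parser
honours the standard lexical-handler property.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

public class PrologOrder {
    /** Returns PI, DOCTYPE and comment events in the order the parser reports them. */
    static List<String> events(String xml) throws Exception {
        final List<String> seen = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void processingInstruction(String target, String data) {
                seen.add("PI " + target);
            }
            @Override public void startDTD(String name, String publicId, String systemId) {
                seen.add("DOCTYPE " + name);
            }
            @Override public void comment(char[] ch, int start, int length) {
                seen.add("COMMENT " + new String(ch, start, length));
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // startDTD and comment are LexicalHandler callbacks, so the handler
        // has to be registered for them explicitly.
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return seen;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical document: one PI before the DOCTYPE, a comment and a PI after it.
        String xml = "<?xml version=\"1.0\"?>"
            + "<?before x?>"
            + "<!DOCTYPE root [<!ELEMENT root EMPTY>]>"
            + "<!--after-->"
            + "<?after y?>"
            + "<root/>";
        for (String e : events(xml)) {
            System.out.println(e);
        }
    }
}
```

The parser itself distinguishes "before the doc type" from "after the doc
type"; that is the positional information a Document loses when DocType is
held apart from the rest of the content.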

The current implementation of Document completely divorces DocType from any
other Document content.  On input you completely lose the use case above --
there is no way to identify whether a comment or PI occurred before or
after the doc type -- and on output you end up with ugly special casing:

    DocType dt = doc.getDocType();
    List content = doc.getContent();

    // Output before DocType
    for (int i = 0; i < content.size(); ++i)
    {
        // Code for finding the magicInsertIndex is elsewhere
        if (i == magicInsertIndex)
        {
            output(dt);
        }
        output(content.get(i));
    }

instead of the straightforward:

    List content = doc.getContent();
    for (Iterator i = content.iterator(); i.hasNext();)
    {
        output(i.next());
    }
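
As a rough sketch of what that uniform loop presupposes, the following
self-contained example models the doc type as just another item in the
content list.  Every type here (Node, Pi, DocTypeDecl, Element) is
hypothetical and invented for illustration; none of them are JDOM classes,
and this is not a proposal for the actual API shape.

```java
import java.util.ArrayList;
import java.util.List;

public class UniformContentSketch {
    // Hypothetical node types; these are NOT JDOM classes.
    interface Node { String toXml(); }

    static final class Pi implements Node {
        final String target, data;
        Pi(String target, String data) { this.target = target; this.data = data; }
        public String toXml() { return "<?" + target + " " + data + "?>"; }
    }

    static final class DocTypeDecl implements Node {
        final String name;
        DocTypeDecl(String name) { this.name = name; }
        public String toXml() { return "<!DOCTYPE " + name + ">"; }
    }

    static final class Element implements Node {
        final String name;
        Element(String name) { this.name = name; }
        public String toXml() { return "<" + name + "/>"; }
    }

    // One loop, no special case: the doc type is emitted wherever it sits in the list.
    static String output(List<Node> content) {
        StringBuilder sb = new StringBuilder();
        for (Node n : content) {
            sb.append(n.toXml()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> content = new ArrayList<>();
        content.add(new Pi("before", "x"));   // PI ahead of the doc type
        content.add(new DocTypeDecl("root"));
        content.add(new Pi("after", "y"));    // PI following the doc type
        content.add(new Element("root"));
        System.out.print(output(content));
    }
}
```

Because position is carried by the list itself, no separately computed
insert index is needed on output, and nothing is lost on input.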

Les Hill
leh at galaxynine.com