[jdom-interest] XOM

Tue Sep 24 14:27:01 PDT 2002

At 4:56 PM -0400 9/24/02, New, Cecil (GEAE) wrote:
>speaking of JDOM problems... anyone notice the XOM announcement.  Reactions?

When changing the subject so abruptly, it's nice to change the 
subject line. :-)

>XOM: New tree-based XML API
>http://xmlhack.com/read.php?item=1783
>
>   Elliotte Rusty Harold has announced a new tree-based XML API, XOM
>   (from the generic term XML object model), which he describes as
>   "closest in spirit to JDOM" and representing an "effort to synthesize
>   the best features of the existing APIs while eliminating the worst".
>   (Java, Tools: 01:46 21 Sep 2002 UTC)
>

Well, I was waiting till I had a little more of the documentation 
finished, but here's the semi-official XOM word on the subject:

XOM and JDOM are almost completely separate products. Originally, I had
thought I could build XOM by forking JDOM, but it quickly became
apparent that it would be simpler to start over from scratch. XOM does
use one class from JDOM (Verifier) in its internal, non-public parts.
This class has been rewritten substantially. The rest of the API is
completely free of JDOM code.

Conceptually, XOM definitely did adopt a number of ideas from JDOM,
including:

* Using a SAX parser to build the object model.
* Using a NodeFactory to support subclasses through the builder.
* Subclassing SAXSource and SAXResult to support TrAX.
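
To make the first two borrowed ideas concrete, here is a minimal sketch in Java of a SAX handler that builds a tree and asks a factory for each element. BuiltElement, BuiltNodeFactory, and TreeBuilder are hypothetical names invented for this illustration; they are not the XOM or JDOM classes, and only the JDK's own SAX API is assumed.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical tree node; not a class from XOM or JDOM.
class BuiltElement {
    final String name;
    final List<BuiltElement> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();

    BuiltElement(String name) { this.name = name; }
}

// Hypothetical factory hook: override makeElement to return a subclass.
class BuiltNodeFactory {
    BuiltElement makeElement(String qualifiedName) {
        return new BuiltElement(qualifiedName);
    }
}

// SAX handler that assembles the tree, asking the factory for each element.
class TreeBuilder extends DefaultHandler {
    private final BuiltNodeFactory factory;
    private final Deque<BuiltElement> open = new ArrayDeque<>();
    private BuiltElement root;

    TreeBuilder(BuiltNodeFactory factory) { this.factory = factory; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        BuiltElement element = factory.makeElement(qName);
        if (open.isEmpty()) {
            root = element;                     // first element seen is the root
        } else {
            open.peek().children.add(element);  // attach to the innermost open element
        }
        open.push(element);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) {
            open.peek().text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    // Parses the document and returns the root; checked exceptions are wrapped.
    static BuiltElement build(String xml, BuiltNodeFactory factory) {
        try {
            TreeBuilder handler = new TreeBuilder(factory);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.root;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document", e);
        }
    }
}
```

Calling TreeBuilder.build("<a><b>hi</b><c/></a>", new BuiltNodeFactory()) would return a root named "a" with two children. A subclass of the factory could return its own element subclass without the builder changing.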

However, XOM also freely borrowed good ideas from DOM, XPath, and other
systems, and invented not a few new ones of its own. Features in XOM
that have no real equivalent in JDOM include:

* A common Node superclass.

* The getValue method that returns the XPath value of any node.

* The getStringForm method that returns a string containing the XML
serialization of that node. (JDOM actually did use the toString method
for this in the first few betas. However, when JDOM decided to use the
toString method for debugging info instead, they never replaced it with
another method.)

* Direct node-to-node navigation using getNextSibling, getPreviousSibling,
getParent, and getFirstChild.

* Well-formedness safe subclassing.
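
The following is a rough sketch of how a common superclass, an XPath-style getValue, and sibling navigation can fit together. TreeNode, TextNode, and ElementNode are hypothetical classes written for this illustration and are not XOM's implementation. The sketch assumes the XPath 1.0 rule that an element's string value is the concatenation of its descendant text in document order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes for illustration only; they are not XOM's implementation.
abstract class TreeNode {
    ElementNode parent; // set by ElementNode.appendChild

    // String value in the XPath sense.
    abstract String getValue();

    final ElementNode getParent() { return parent; }

    final TreeNode getNextSibling() { return sibling(1); }

    final TreeNode getPreviousSibling() { return sibling(-1); }

    // Returns the sibling at the given offset, or null if there is none.
    private TreeNode sibling(int offset) {
        if (parent == null) {
            return null;
        }
        int index = parent.indexOf(this) + offset;
        return (index >= 0 && index < parent.getChildCount()) ? parent.getChild(index) : null;
    }
}

final class TextNode extends TreeNode {
    private final String data;

    TextNode(String data) { this.data = data; }

    @Override
    String getValue() { return data; }
}

final class ElementNode extends TreeNode {
    private final String name;
    private final List<TreeNode> children = new ArrayList<>();

    ElementNode(String name) { this.name = name; }

    String getName() { return name; }

    // Appends a child and returns this element so calls can be chained.
    ElementNode appendChild(TreeNode child) {
        if (child.parent != null) {
            throw new IllegalStateException("node already has a parent");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    int getChildCount() { return children.size(); }

    TreeNode getChild(int index) { return children.get(index); }

    int indexOf(TreeNode child) { return children.indexOf(child); }

    TreeNode getFirstChild() { return children.isEmpty() ? null : children.get(0); }

    // Concatenates the values of all children, which recursively covers
    // every descendant text node in document order.
    @Override
    String getValue() {
        StringBuilder value = new StringBuilder();
        for (TreeNode child : children) {
            value.append(child.getValue());
        }
        return value.toString();
    }
}
```

Because every node type shares the superclass, code that walks the tree with getFirstChild and getNextSibling can call getValue without caring whether it holds text or an element.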

There are also many features that JDOM and XOM share, but that are
implemented very differently in the two APIs:

* In XOM namespaces are just strings. In JDOM namespaces are instances of
the Namespace class.

* In JDOM, an Element contains a list. In XOM, an Element is a list. This
makes for very different styles of navigation.

* JDOM uses the java.util.List interface to expose live lists of
attributes and content. XOM uses comatose, read-only lists implemented
with custom classes. Unlike standard lists, XOM lists expose the types
of the nodes they contain. That is, there are separate lists for
attributes, elements, namespaces, and so forth.

* Internally, JDOM uses a very sophisticated filter list that knows a
great deal about the types of nodes it contains. However, this
information is not exposed in the public API. XOM is almost exactly
backwards from this. Internally, it uses very simple lists that know
nothing about the types of the things they contain. Externally, however,
it exposes lists that contain nodes of very specific types.

* JDOM passes prefixes and local parts separately to setter methods. XOM
expects them to be passed as a single qualified name.

* JDOM supports skipped entity references. XOM requires all entity
references to be resolved.

* JDOM reports CDATA sections. XOM automatically merges them with their
surrounding text.
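
As an illustration of the list-related differences above, here is a small sketch of a typed, read-only list next to a live java.util.List. PlainElement and PlainElements are hypothetical names, not XOM classes. The sketch also assumes that "comatose" means a snapshot that does not track later changes to the underlying content, which is an interpretation rather than something stated above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an element; only a name is needed here.
final class PlainElement {
    final String name;

    PlainElement(String name) { this.name = name; }
}

// Read-only, typed list: there are no add or remove methods, and get
// returns PlainElement directly, so callers never need a cast.
final class PlainElements {
    private final PlainElement[] items;

    PlainElements(List<PlainElement> source) {
        // Copy the contents, so later changes to the source are not seen.
        items = source.toArray(new PlainElement[0]);
    }

    int size() { return items.length; }

    PlainElement get(int index) { return items[index]; }
}
```

With a live list, adding an element to the underlying content would change what the caller's list reports. With the snapshot sketched here, a PlainElements built before the addition keeps its original size.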

Finally, XOM hews closely to the motto that "Less is more." It
deliberately eschews the many convenience methods that make the JDOM
API so cluttered, such as
getChildText/getChildTextTrim/getChildTextNormalize/getText/getTextTrim/getTextNormalize.
If overloaded variants are included, JDOM has nine separate methods for
reading the text of an element. XOM has one, getValue. If you need to
trim or normalize the value, you can use the methods of the String class.
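
As a small sketch of that last point, trimming and normalizing need nothing beyond java.lang.String. The TextValues helper and its normalize method are hypothetical names, and the whitespace rule (collapse each run of whitespace to a single space) is an assumption about what "normalize" should mean here.

```java
// Hypothetical helper; the name and the exact whitespace rule are assumptions.
final class TextValues {
    private TextValues() {}

    // Strips leading and trailing whitespace, then collapses each
    // internal run of whitespace to a single space.
    static String normalize(String value) {
        return value.trim().replaceAll("\\s+", " ");
    }
}
```

Given a value such as "  Hello   \n  world  ", calling trim() alone removes only the outer whitespace, while TextValues.normalize yields "Hello world".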

>Watch out, Jason's getting philosophical...
>
>We've been slowly but steadily increasing the number of protected fields
>in the input/output classes to the point where nearly the entire
>internals are exposed.  It's one of the things I dislike most about

In XOM, I hide much more, and the stuff that is exposed is very 
carefully thought out. This does limit what subclasses can do, but it 
makes for much more robust XML. Preconditions and postconditions are 
enforced on subclasses. If you can sneak malformed XML through XOM, 
even by subclassing, then it's a bug and I will fix it.
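
One common way to enforce preconditions on subclasses in Java is to keep state private, make the checked setter final, and offer only a narrow protected hook. The sketch below shows that general technique. CheckedElement, nameChanged, and the deliberately simplified ASCII-only name check are all hypothetical, and this is not a description of how XOM itself does it.

```java
// Hypothetical class; the name check is simplified (ASCII only, no colons)
// and is not the full XML name grammar.
class CheckedElement {
    private String localName; // private: subclasses cannot assign it directly

    CheckedElement(String localName) {
        setLocalName(localName);
    }

    public final String getLocalName() {
        return localName;
    }

    // final: a subclass cannot override this method to skip the check.
    public final void setLocalName(String name) {
        checkName(name);
        String old = this.localName;
        this.localName = name;
        nameChanged(old, name);
    }

    // Narrow extension point: called only after a name has passed the check.
    protected void nameChanged(String oldName, String newName) {
    }

    private static void checkName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name must not be empty");
        }
        char first = name.charAt(0);
        if (!(isAsciiLetter(first) || first == '_')) {
            throw new IllegalArgumentException("illegal first character in name: " + name);
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean ok = isAsciiLetter(c) || (c >= '0' && c <= '9')
                    || c == '_' || c == '-' || c == '.';
            if (!ok) {
                throw new IllegalArgumentException("illegal character in name: " + name);
            }
        }
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

A subclass can react to changes through nameChanged, but a call such as setLocalName("not a name") still throws IllegalArgumentException and leaves the previous name in place.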
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+