[jdom-interest] XMLOutputter = SAXOutputter + XMLFilter + XMLWriter

Joseph Bowbeer jozart at csi.com
Mon Oct 9 13:16:15 PDT 2000

I've been thinking about SAX event streams and SAX filters, and how JDOM
should interface with these.  I've also been thinking about the kind of
support JDOM should provide for data documents (no mixed content).

The use case behind these thoughts was the need to write JDOM elements
onto a SAX event stream.  For example: outputting an unknown number of
entries (child elements) to a log file.

After experimenting with JDOM's XMLOutputter and looking at David
Megginson's XMLWriter and DataWriter, I'd like to suggest the following
refactoring of XMLOutputter.

  XMLOutputter = SAXOutputter + XMLFilter + XMLWriter

1. SAXOutputter should be the cornerstone of JDOM output.  (When it is
implemented, SAXOutputter will convert JDOM pieces into SAX events.)

Why make SAXOutputter the cornerstone?  Because that's where the best
leverage is.  If we can generate SAX events correctly, we can do
anything related to output.  The addition (or removal) of indentation
and newlines can be viewed as a filter acting on the SAX event stream.

2. Take most of what is currently in XMLOutputter and move it into
something along the lines of Megginson's XMLWriter.

Note: XMLWriter takes a SAX event stream and writes out XML.  XMLWriter
should (for example) provide options for customizing the appearance of
the XML header.  It should not, however, provide options for adding
newlines or indentation, because all whitespace is potentially
significant (but see #3 below).

As with Megginson's version, our XMLWriter should implement XMLFilter.
Since an XMLFilter is an event source as well as an event sink, an
XMLWriter can be inserted into the middle of an event stream without
interrupting the flow, and the XML output can be sluiced out the side.
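
To make the "sluiced out the side" idea concrete, here is a minimal
sketch of a writer that is also a filter.  It is not Megginson's
XMLWriter; TeeXmlWriter and TeeDemo are hypothetical names, and the
sketch ignores namespaces, the XML header, comments, and processing
instructions.  It relies only on the standard SAX2 helper class
XMLFilterImpl.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical tee-style writer: serializes each event, then forwards it unchanged. */
class TeeXmlWriter extends XMLFilterImpl {
    private final Writer out;

    TeeXmlWriter(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        StringBuilder tag = new StringBuilder("<").append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            tag.append(' ').append(atts.getQName(i))
               .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        write(tag.append('>').toString());
        super.startElement(uri, local, qName, atts); // keep the stream flowing
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        write("</" + qName + ">");
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        write(escape(new String(ch, start, length)));
        super.characters(ch, start, length);
    }

    private void write(String s) throws SAXException {
        try {
            out.write(s);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}

class TeeDemo {
    static int downstreamElements = 0;

    /** Pushes a tiny log document through the writer and returns the XML text. */
    static String run() throws SAXException {
        StringWriter xml = new StringWriter();
        TeeXmlWriter writer = new TeeXmlWriter(xml);
        downstreamElements = 0;
        // A downstream consumer still sees every element event.
        writer.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                downstreamElements++;
            }
        });
        char[] text = "a & b".toCharArray();
        writer.startDocument();
        writer.startElement("", "log", "log", new AttributesImpl());
        writer.startElement("", "entry", "entry", new AttributesImpl());
        writer.characters(text, 0, text.length);
        writer.endElement("", "entry", "entry");
        writer.endElement("", "log", "log");
        writer.endDocument();
        return xml.toString();
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(run());
    }
}
```

Because the writer forwards every event after serializing it, anything
placed downstream (another filter, or a builder) is unaffected by its
presence.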

3. For formatting "data documents", implement a special XMLFilter.  Call
it DataFormatFilter.

Note the term "data documents".  In Megginson's terminology, these are
documents that contain only fielded content (no mixed content).  This
data format filter inserts the additional newlines and indentation that
are needed to "pretty-print" the data document.  The indent width,
indent character, and line ending should all be customizable.

Note: The DataFormatFilter is similar to Megginson's DataWriter, except
it should be implemented as a pure filter rather than as a subclass of
XMLWriter.  (Filter composition rocks; subclassing XMLWriter is fragile
and unnecessary in this case.)

(I plan to implement the DataFormatFilter, and the related
DataUnformatFilter described later.)
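
A rough sketch of how such a pure filter might look, assuming the
document really is a data document (no mixed content).  This is only an
illustration of the approach, not a finished implementation: the
constructor parameters and the FormatDemo sink are assumptions, and the
sink is a bare stand-in for an XMLWriter that does no escaping.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Sketch of a pure formatting filter for data documents (no mixed content). */
class DataFormatFilter extends XMLFilterImpl {
    private final String indent;
    private final String lineEnding;
    // One flag per open element: has it had a child element yet?
    private final Deque<Boolean> hasChild = new ArrayDeque<>();

    DataFormatFilter(String indent, String lineEnding) {
        this.indent = indent;
        this.lineEnding = lineEnding;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (!hasChild.isEmpty()) {
            hasChild.pop();
            hasChild.push(Boolean.TRUE);   // the parent now has a child element
            breakLine(hasChild.size());    // child starts on its own indented line
        }
        hasChild.push(Boolean.FALSE);
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        boolean hadChild = hasChild.pop();
        if (hadChild) {
            breakLine(hasChild.size());    // close tag of a container gets its own line
        }
        super.endElement(uri, local, qName);
    }

    /** Emits a line ending plus indentation as an ordinary characters event. */
    private void breakLine(int depth) throws SAXException {
        StringBuilder sb = new StringBuilder(lineEnding);
        for (int i = 0; i < depth; i++) {
            sb.append(indent);
        }
        char[] ws = sb.toString().toCharArray();
        super.characters(ws, 0, ws.length);
    }
}

class FormatDemo {
    /** Sends a log with two entries through the filter and returns the text seen downstream. */
    static String run() throws SAXException {
        final StringBuilder out = new StringBuilder();
        DataFormatFilter filter = new DataFormatFilter("  ", "\n");
        // Minimal stand-in for an XMLWriter; it does no escaping.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                out.append('<').append(q).append('>');
            }
            @Override
            public void endElement(String u, String l, String q) {
                out.append("</").append(q).append('>');
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        filter.startElement("", "log", "log", new AttributesImpl());
        for (String entry : new String[] {"a", "b"}) {
            filter.startElement("", "entry", "entry", new AttributesImpl());
            filter.characters(entry.toCharArray(), 0, entry.length());
            filter.endElement("", "entry", "entry");
        }
        filter.endElement("", "log", "log");
        return out.toString();
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(run());
    }
}
```

The filter never touches a Writer; it only adds characters events, so
it composes with any downstream handler.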

4. Finally, XMLOutputter becomes a convenience class that provides the
same top-level "output" methods it does now.

XMLOutputter is responsible for creating the constituent components,
hooking them up to form an output pipeline, and delegating to them.
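
The "hooking them up" step could be as small as the helper below.  This
is a sketch under assumptions: OutputPipeline, its chain method, and
the PipelineDemo stages are hypothetical names, and the stages here
only record their names where the real ones would be a format filter
and a writer.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical facade helper: wires filter stages into one output pipeline. */
class OutputPipeline {
    /** Connects the stages in order, ends at sink, and returns the entry point. */
    static ContentHandler chain(ContentHandler sink, XMLFilterImpl... stages) {
        ContentHandler next = sink;
        for (int i = stages.length - 1; i >= 0; i--) {
            stages[i].setContentHandler(next);
            next = stages[i];
        }
        return next;
    }
}

class PipelineDemo {
    /** A stage that notes its name when an element passes through, then forwards it. */
    static XMLFilterImpl stage(final String name, final List<String> trace) {
        return new XMLFilterImpl() {
            @Override
            public void startElement(String u, String l, String q, Attributes a)
                    throws SAXException {
                trace.add(name);
                super.startElement(u, l, q, a);
            }
        };
    }

    /** Sends one element through a two-stage pipeline and returns the visit order. */
    static List<String> run() throws SAXException {
        final List<String> trace = new ArrayList<>();
        ContentHandler sink = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                trace.add("sink");
            }
        };
        ContentHandler head =
                OutputPipeline.chain(sink, stage("first", trace), stage("second", trace));
        // The caller delegates to the head of the pipeline only.
        head.startElement("", "log", "log", new AttributesImpl());
        return trace;
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(run());
    }
}
```

With this shape, XMLOutputter's output methods would just feed events
into the returned head and let the stages do the rest.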

Comments?  XMLFilter is part of SAX2, which was released in May 2000.
Does this matter?

Here are some related ideas for pipelining the input side:

  XMLReader + XMLFilter + SAXBuilder = JDOM document

1. Add an optional DataUnformatFilter to remove newlines and indentation
from data documents.  This filter reads the SAX event stream from the
reader/parser and passes it through to the SAXBuilder, removing the
extra formatting along the way.

2. For added convenience, the SAXBuilder should implement XMLFilter.
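
As an illustration of item 1 on the input side, here is one way the
DataUnformatFilter could work, sitting between a real XMLReader and
whatever consumes the events.  This is a sketch, not the planned
implementation: it assumes that whitespace-only text is formatting
unless it is the entire content of a leaf element, and the
UnformatDemo sink merely stands in for SAXBuilder.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Sketch of a filter that strips formatting whitespace from data documents. */
class DataUnformatFilter extends XMLFilterImpl {
    private final StringBuilder pending = new StringBuilder();
    // One flag per open element: has it had a child element yet?
    private final Deque<Boolean> hasChild = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        flush(true);                       // blank text before a child is formatting
        if (!hasChild.isEmpty()) {
            hasChild.pop();
            hasChild.push(Boolean.TRUE);
        }
        hasChild.push(Boolean.FALSE);
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        boolean hadChild = hasChild.pop();
        flush(hadChild);                   // blank text in a leaf element is kept as data
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length); // hold text until the next tag decides its fate
    }

    private void flush(boolean dropIfBlank) throws SAXException {
        String text = pending.toString();
        pending.setLength(0);
        if (text.isEmpty() || (dropIfBlank && text.trim().isEmpty())) {
            return;
        }
        char[] c = text.toCharArray();
        super.characters(c, 0, c.length);
    }
}

class UnformatDemo {
    /** Parses xml through the filter and returns the text seen downstream. */
    static String run(String xml) throws Exception {
        final StringBuilder out = new StringBuilder();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DataUnformatFilter filter = new DataUnformatFilter();
        filter.setParent(reader);
        // Stand-in for SAXBuilder: records the events it receives, without escaping.
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                out.append('<').append(q).append('>');
            }
            @Override
            public void endElement(String u, String l, String q) {
                out.append("</").append(q).append('>');
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<log>\n  <entry>a</entry>\n  <entry>b</entry>\n</log>"));
    }
}
```

Buffering text until the next tag is what lets the filter tell
formatting apart from a leaf element whose value happens to be blank.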

Joe Bowbeer
