[jdom-interest] setIgnoringElementContentWhitespace inoperant ?
cowtowncoder at yahoo.com
Thu Dec 9 11:00:23 PST 2004
--- Jason Hunter <jhunter at xquery.com> wrote:
> > Since these particular documents have no DTD, is there a way to tell the
> > parser that there are no mixed elements, and thus perform the cleanup ?
> Nope; I've long wanted an input.setIgnoringAllWhitespace() method on the
> SAXBuilder but when I started implementing it (it'd have to be a JDOM
> level feature as SAX parsers don't have such a notion), I found it
> difficult to do without a lot of look ahead. Perhaps it'd be easier
> with today's SAXBuilder design. You're welcome to implement such a
> feature. I'd be happy to commit it.
I know this is a bit of a shameless plug, but it might
be worth considering using a StAX parser for this.
It not only allows requiring all contiguous text
(including CDATA) to be combined, but also simplifies
look-ahead and similar operations, because the
application is in control rather than an
event-generating parser.
This might be a good enough reason to try the
StAXBuilder JDOM class I contributed earlier. :-)
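
To make the pull-parsing point concrete, here is a minimal sketch using only the
standard javax.xml.stream API (not the StAXBuilder class itself). The class and
method names are made up for illustration, and it assumes, as in the original
question, that the documents contain no mixed content, so any whitespace-only
text run can be dropped. Coalescing is requested so that adjacent text and CDATA
are reported as one event before the whitespace check is made.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; not part of JDOM or of any StAX implementation.
public class StaxWhitespaceDemo {

    // Collects every text run that is not whitespace-only.
    // Assumption: no mixed content, so blank runs are pure indentation.
    static List<String> nonBlankText(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Standard StAX property: merge adjacent character data (and CDATA)
        // into a single event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> kept = new ArrayList<>();
        try {
            // The application drives the loop, so it decides per event
            // whether to keep or skip the text.
            while (reader.hasNext()) {
                int event = reader.next();
                boolean textual = event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE;
                if (textual && !reader.isWhiteSpace()) {
                    kept.add(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return kept;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<a>\n  <b>x</b>\n  <c><![CDATA[y]]>z</c>\n</a>";
        for (String text : nonBlankText(xml)) {
            System.out.println("[" + text + "]");
        }
    }
}
```

A JDOM builder written this way would create Text nodes only for the runs that
survive the check, with no need to buffer events for look-ahead.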
In fact, if there's enough interest in this (and clear
rules of functionality needed), I could write it to
use as a sample for where StAX parsing may be a good
As a side note, one of the features I implemented for "my"
StAX parser is the ability to override the declared DTD
with an app-specified one.
This is mostly intended for testing and debugging
(it's used by the 'validatexml' tool I wrote, to allow
validating any document against any DTD), but it could be
used for attaching an "external" DTD to
documents that specify no DTD; as a result, one
would then get proper notifications on ignorable
whitespace (like SAX, StAX has a means of classifying
text as ignorable white space, iff DTD validation is
Too bad there's no standard way of specifying such
a setting, so the solution would be Woodstox-specific.
-+ Tatu +-