[jdom-interest] StAXBuilder, removing (indentation) white space

Tatu Saloranta cowtowncoder at yahoo.com
Mon Dec 13 21:24:09 PST 2004

Based on earlier discussions about the possibility of
removing the white space used for indentation that
some XML writers produce, I decided to try to implement
that as part of the StAX-parser based builder (I
noticed some work has also been done on SAXBuilder).

Find attached the results: a modified StAXBuilder.java,
and one new class file, StAXTextModifier.java. The
latter defines a simple API used by the builder that
allows for more general modifications, and is used here
specifically for indentation removal.
As usual, the classes are also available via StaxMisc


One can enable simple heuristics-based indentation
removal by doing:


which will then remove what it determines to be
indentation white space, as part of the build process;
this should be a fairly efficient way of doing it (but
obviously its accuracy depends on the exact formatting
of the document read).
This functionality is implemented using an inner class
named StAXBuilder.IndentRemover; the class can be
extended to change the heuristics used (right now it
treats any non-CDATA, all-white-space text segment that
starts with a linefeed, anywhere in the document, as
indentation).
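
To make the heuristic above concrete, here is a minimal
standalone sketch of the check as described. It is not the
code from the attached StAXBuilder.IndentRemover; the class
name IndentHeuristicSketch and the method looksLikeIndentation
are made up for illustration, and it assumes the caller
already knows whether the segment came from a CDATA section.

```java
// Hypothetical illustration only; not the attached IndentRemover.
final class IndentHeuristicSketch {
    private IndentHeuristicSketch() {}

    /**
     * Returns true if the text segment would count as indentation
     * under the described heuristic: not CDATA, starts with a
     * linefeed, and consists only of XML white space.
     */
    static boolean looksLikeIndentation(String text, boolean isCData) {
        // CDATA content is never treated as indentation.
        if (isCData || text.isEmpty()) {
            return false;
        }
        // Must start with a linefeed (XML parsers normalize line
        // endings to '\n' before handing text to the application).
        if (text.charAt(0) != '\n') {
            return false;
        }
        // Everything else must be XML white space.
        for (int i = 1; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }
}
```

With this check, a segment such as "\n    " between two
elements would be dropped, while "  " (no leading linefeed),
"\n  text", or any CDATA segment would be kept. That also
shows where the heuristic can misfire: significant white
space that happens to begin with a linefeed is removed too.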

I will also explain the idea behind the StAXTextModifier
interface in another email, for anyone interested.

-+ Tatu +-

-------------- next part --------------
A non-text attachment was scrubbed...
Name: StAXBuilder.java
Type: text/x-java
Size: 24166 bytes
Desc: StAXBuilder.java
Url : http://www.jdom.org/pipermail/jdom-interest/attachments/20041213/60763076/StAXBuilder.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: StAXTextModifier.java
Type: text/x-java
Size: 7746 bytes
Desc: StAXTextModifier.java
Url : http://www.jdom.org/pipermail/jdom-interest/attachments/20041213/60763076/StAXTextModifier.bin