[jdom-interest] Proposal: JDOM event based processing

Tue Nov 7 04:19:03 PST 2000

From: "Randall J. Parr" <RParr at TemporalArts.COM>
> James Strachan wrote:
>
[snip]

> I, in general, agree this would be very useful. Saxon has a mechanism in
> its Java API that is somewhat like this and I like it (I just can't get it
> to work very well).
>
> I would like to point out though that, for my use, when doing event
> processing I almost always need to handle the startElement and endElement
> in the order they are encountered.
>
> For example when I encounter <TABLE name="customer"> I have to open/verify
> a connection to the database and initialize my metadata. When I encounter
> </TABLE> I have to commit/rollback the transaction and close/release the
> database connection.
>
> Even more simply, when I encounter <TABLE ... > I want to open a new file
> and output <TABLE ... >, then I encounter a lot of <ROW> ... </ROW>
> elements (each of which I massage, output, and then discard), finally when
> I encounter </TABLE> I want to output the </TABLE>, close that file and be
> done.
>
> Your interface forces me to treat each as a start OR an end element. Maybe
> ElementHandler should be more like:
>
> package org.jdom;
> public interface ElementHandler {
>     public void startElement( ... );
>     public void endElement( ... );
> }

Thanks Randall.

Yes, I agree that it's sometimes nice to know that the start or end has
occurred. However, I suppose these are 'sub element' events - the kind of
thing that SAX was designed for.

I'm tempted to still keep the simple interface

 public interface ElementHandler {
     public void handle( Element element );
 }

for processing 'whole' elements and element trees.

An additional sub-element handler could be useful.

 public interface SubElementHandler {
     public void onStart( Element element );
     public void onEnd( Element element );
 }

But this seems to be too close to the problem space SAX is trying to
tackle - which makes me think that in those sub-element conditions we should
be using SAX directly rather than introducing another new interface.
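
For comparison, here is a rough sketch of what 'using SAX directly' might
look like for the <TABLE> case. The class name TableBoundaryHandler and the
log strings are made up for illustration; the log entries stand in for the
real open/commit work. Only the standard SAX and JAXP classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records what would be done at each TABLE boundary.
class TableBoundaryHandler extends DefaultHandler {
    final List<String> log = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("TABLE".equals(qName)) {
            // A real handler would open the connection or file here.
            log.add("open " + atts.getValue("name"));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("TABLE".equals(qName)) {
            // A real handler would commit and release resources here.
            log.add("close");
        }
    }

    // Parses the given XML text and returns the recorded boundary events.
    static List<String> run(String xml) {
        TableBoundaryHandler handler = new TableBoundaryHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.log;
    }
}
```

The start and end callbacks arrive in document order, which is the ordering
Randall asked for, but the <ROW> content in between would have to be
assembled by hand from further SAX events.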

Another way of looking at the problem could be that we just implement these
start & end element semantics using ElementHandler.

To handle the start <TABLE> example you gave, you could just use lazy
construction in your RowElementHandler, i.e. when the first row element is
being processed, open a connection / file / whatever.
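
A minimal sketch of that lazy construction, assuming the single-method
ElementHandler above. The Element class here is only a stand-in so the
sketch compiles without JDOM, and the log list stands in for the real
connection or file handling; both are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.jdom.Element so the sketch compiles on its own.
class Element {
    private final String name;
    Element(String name) { this.name = name; }
    String getName() { return name; }
}

// The simple whole-element callback proposed in this thread.
interface ElementHandler {
    void handle(Element element);
}

// Opens its resource lazily, the first time a row is handled.
class RowElementHandler implements ElementHandler {
    final List<String> log = new ArrayList<>();
    private boolean opened = false;

    @Override
    public void handle(Element element) {
        if (!opened) {
            // First row seen: this is where the connection or file would be opened.
            log.add("open");
            opened = true;
        }
        // Massage, output and discard the row; here we only record it.
        log.add("row " + element.getName());
    }
}
```

Note that this only covers the start of the table: nothing tells the handler
that the last row has been seen, so the end still needs a separate signal.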

To handle the end <TABLE> example, we could have some way of specifying that
a TableElementHandler does not require any child elements. After all, implicit
in the ElementHandler semantics is that the element has ended before the
handler is called. So we just want to filter out the <ROW> elements from our
TableElementHandler.

So we may use (say) XPath to find the root of the sub-document tree to
build, we may use another XPath expression to determine how deep the tree
should be. In the 'end element' use case we will probably want an empty
tree. The default case is probably 'from the sub-document tree root
downwards'.
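
To make the depth idea concrete, here is a rough sketch built on SAX. It
matches the sub-document root by element name and takes the depth as a plain
integer rather than using XPath, which is a simplification on my part. Node,
NodeHandler and DepthLimitedBuilder are hypothetical names; a real version
would build org.jdom.Element trees instead.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical minimal tree node standing in for org.jdom.Element.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();
    Node(String name) { this.name = name; }
}

// Hypothetical callback, mirroring ElementHandler.handle( Element ).
interface NodeHandler {
    void handle(Node node);
}

// Builds a subtree for each element named rootName, keeping at most maxDepth
// levels of descendants (0 = the element alone), and calls the handler only
// after the root element has ended.
class DepthLimitedBuilder extends DefaultHandler {
    private final String rootName;
    private final int maxDepth;
    private final NodeHandler handler;
    private final Deque<Node> stack = new ArrayDeque<>();
    private int depth = -1; // depth below the current root; -1 = outside any root

    DepthLimitedBuilder(String rootName, int maxDepth, NodeHandler handler) {
        this.rootName = rootName;
        this.maxDepth = maxDepth;
        this.handler = handler;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth < 0) {
            if (qName.equals(rootName)) {
                depth = 0;
                stack.push(new Node(qName));
            }
            return;
        }
        depth++;
        if (depth <= maxDepth) {
            // Shallow enough to keep: attach to the parent and descend.
            Node child = new Node(qName);
            stack.peek().children.add(child);
            stack.push(child);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth < 0) return;
        if (depth <= maxDepth) {
            Node done = stack.pop();
            if (depth == 0) handler.handle(done); // root ended: fire the handler
        }
        depth--;
    }

    // Parses the XML text, firing the handler for each matching subtree.
    static void parse(String xml, String rootName, int maxDepth, NodeHandler handler) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DepthLimitedBuilder(rootName, maxDepth, handler));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a depth of 0 and a root of TABLE, the handler receives an empty TABLE
node once </TABLE> has been seen, which is the 'end element' case. A larger
depth keeps the <ROW> children as well.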

What are other people's thoughts on this? Should we add sub-element events or
keep them in SAX?

J.

James Strachan
=============
email: james at metastuff.com
web: http://www.metastuff.com
