[jdom-interest] Re: Why does passing a document through a socket sometimes hang the parser?

Joseph Bowbeer jozart at csi.com
Fri Feb 2 00:03:11 PST 2001


On 1 Feb 2001 "Tony M Smith" <tonyms at compuserve.com> wrote:
>
> A relevant problem, though it is not necessarily the cause here,
> is that the Xerces SAX parser doesn't work properly on socket input.
> My suggested workarounds are in the updated FAQ -
>
> http://xml.apache.org/xerces-j/faq-write.html#faq-11
>
> This solution requires throwing a "finished" exception on
> logical EOF, rather than trying to find a physical EOF,
> which obviously isn't there for a socket. I've had to modify
> org.jdom.SAXBuilder as follows:
>

It might be more convenient to implement your workaround in an XMLFilter and
install it before you build the document:

    SAXBuilder builder = new SAXBuilder();
    builder.setXMLFilter(new SocketFilter());
    Document doc = builder.build(in);

The SocketFilter implementation would look something like the following.
I'd probably also add a reset() method and override startDocument() to call
it, so that one filter instance can be reused for several documents.

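  import org.xml.sax.Attributes;
  import org.xml.sax.SAXException;
  import org.xml.sax.helpers.XMLFilterImpl;

  public class SocketFilter extends XMLFilterImpl {

    /** Variables for discovering logical end of document */
    private int level;
    private String firstTag;

    public void startElement(String uri, String localName,
      String qName, Attributes atts) throws SAXException {

      // Remember the root element's name on the way in.
      if (level++ == 0)
        firstTag = localName;
      super.startElement(uri, localName, qName, atts);
    }

    public void endElement(String uri, String localName,
      String qName) throws SAXException {

      // Forward the event first so the handler sees the root close.
      super.endElement(uri, localName, qName);
      // Finished only when the root closes; a nested element that
      // shares the root's name must not end the document early.
      if ((--level == 0) && firstTag.equals(localName))
        throw new SAXException("Finished");
    }
  }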
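
For what it's worth, here is a rough sketch of the reset() variant mentioned
above. The class name ResettableSocketFilter and the FINISHED constant are my
own placeholders, not part of JDOM or Xerces, and it tracks nesting depth only.

```java
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

/** Sketch only: a SocketFilter variant that can be reused across documents. */
class ResettableSocketFilter extends XMLFilterImpl {

    /** Message carried by the exception that marks the logical end of document. */
    static final String FINISHED = "Finished";

    /** Current element nesting depth; 0 means we are outside the root element. */
    private int level;

    /** Clears per-document state, e.g. after a parse that was aborted midway. */
    public void reset() {
        level = 0;
    }

    @Override
    public void startDocument() throws SAXException {
        reset();
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes atts) throws SAXException {
        level++;
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName,
            String qName) throws SAXException {
        // Let the downstream handler see the end tag before we bail out.
        super.endElement(uri, localName, qName);
        if (--level == 0) {
            throw new SAXException(FINISHED);
        }
    }
}
```

Whoever calls the parser still has to treat a SAXException whose message is
"Finished" as a normal end of document rather than as a failure.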

By the way, the statement in the Xerces FAQ that it reads in the entire
document by default is not accurate.  The default reader is buffered, but
the chunks aren't *that* big.

Also by the way: if the document is posted with a content-length, then that
can be used to generate the EOF signal. Wrap the input stream in a
CroppedInputStream (or some other concoction) that returns EOF when an
attempt is made to read past the content length.  This stream wrapper is
also where you'd want to override the close() method in order to protect the
underlying socket.
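
A minimal sketch of such a wrapper follows. CroppedInputStream is not an
existing library class as far as I know; the name and the constructor
arguments are assumptions for illustration, built on java.io.FilterInputStream.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Sketch only: reports EOF after a fixed number of bytes, never closes the source. */
class CroppedInputStream extends FilterInputStream {

    /** Bytes that may still be read before we report EOF. */
    private long remaining;

    CroppedInputStream(InputStream in, long contentLength) {
        super(in);
        this.remaining = contentLength;
    }

    @Override
    public int read() throws IOException {
        if (remaining <= 0) return -1;          // logical EOF
        int b = super.read();
        if (b >= 0) remaining--;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) return 0;
        if (remaining <= 0) return -1;          // logical EOF
        int n = super.read(buf, off, (int) Math.min(len, remaining));
        if (n > 0) remaining -= n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(Math.min(n, remaining));
        remaining -= skipped;
        return skipped;
    }

    @Override
    public int available() throws IOException {
        return (int) Math.min(super.available(), remaining);
    }

    @Override
    public boolean markSupported() {
        return false;                           // mark/reset would confuse the count
    }

    @Override
    public void close() {
        // Deliberately empty: the parser may close its input, but the
        // underlying socket stream has to stay open.
    }
}
```

You would then hand something like new CroppedInputStream(socketIn, length)
to the builder, where length comes from the content-length of the post.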

--
Joe Bowbeer






