[jdom-interest] Facing problem reading comments data, need help

Paul Libbrecht paul at activemath.org
Fri Jul 6 15:03:12 PDT 2007

For the first approach, you could try:
- subclassing SAXBuilder
- overriding createContentHandler so that it returns an instance of
  a class that extends org.jdom.input.SAXHandler
- having that class override the comment method (or similar) so that
  comments are not passed on to the parent class
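
The steps above rely on SAX reporting comments through the LexicalHandler comment callback: a handler subclass that overrides it with an empty body never sees the comment data. Below is a minimal sketch of that mechanism using only the JDK's SAX parser, not JDOM itself. The class names (CommentSkippingDemo, RecordingHandler, CommentSkippingHandler) are hypothetical, and RecordingHandler is only a stand-in for a tree-building handler such as org.jdom.input.SAXHandler. In JDOM, the same empty override would presumably go in the SAXHandler subclass returned from createContentHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.ext.DefaultHandler2;

public class CommentSkippingDemo {

    /** Hypothetical stand-in for a tree-building handler: records every event it receives. */
    static class RecordingHandler extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("element:" + qName);
        }

        @Override
        public void comment(char[] ch, int start, int length) {
            events.add("comment:" + new String(ch, start, length));
        }
    }

    /** Subclass that swallows comments instead of passing them to the parent class. */
    static class CommentSkippingHandler extends RecordingHandler {
        @Override
        public void comment(char[] ch, int start, int length) {
            // Intentionally empty: the parent never sees the comment data.
        }
    }

    /** Parses the XML string, routing both content and lexical events to the handler. */
    static List<String> parse(String xml, RecordingHandler handler) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // Comments are only delivered if a LexicalHandler is registered via this SAX property.
        parser.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // A comment whose data starts with a hyphen, as in the pages described in the question.
        String xml = "<html><!--- leading hyphen --><body/></html>";
        System.out.println("with comments:    " + parse(xml, new RecordingHandler()));
        System.out.println("without comments: " + parse(xml, new CommentSkippingHandler()));
    }
}
```

Because the comment is dropped before any tree node is created for it, no check on the comment data ever runs, and the element structure is unaffected.
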
I agree it sounds convoluted, but it is fairly easy. If in doubt, you
can see such an extension at:

hope that helps


On 6 Jul 07, at 23:44, Robin Kwek wrote:

> Hi fellow members,
> I'm working on a program to analyze the structural similarity of web
> pages. My parser works with JDOM and has been able to read HTML files
> and convert them into their respective DOM tree structures.
> But some web pages use "<!---", and JDOM complained that the data is
> not legal for a JDOM comment ("Comment data cannot start with a
> hyphen"), throwing an IllegalDataException.
> I do not actually want comments to be read in, as I'm primarily
> concerned with the structure of the web page. I tried searching
> through the SAX features and properties but could not find a way to
> prevent the parser or JDOM from reading in comments.
> So I'm posting this to ask whether anyone has a way to do this.
> Another approach I'm considering is to turn off the verifier so that
> the illegal comments can be read in and then filtered out later, but
> I can't seem to find the method to turn it off. Does anyone know
> where it is in the javadoc?
> Thanks in advance.
