[jdom-interest] Facing problem reading comments data, need help
robin_rspvh at yahoo.com
Fri Jul 6 14:44:09 PDT 2007
Hi fellow members,
I'm working on a program to analyze the structural similarity of web pages. My parser works with JDOM, and I have been able to read HTML files and convert them into their respective DOM tree structures.
However, some web pages contain comments that open with "<!---" (an extra hyphen). JDOM rejects these with an IllegalDataException, reporting that the data is not legal for a JDOM comment: "Comment data cannot start with a hyphen".
I do not actually want comments to be read in at all, since I'm primarily concerned with the structure of the web page. I tried searching through the SAX features and properties, but I can't find a way to prevent the parser or JDOM from reading in comments.
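To show the behaviour I am after, here is a minimal sketch that uses only the JDK's SAX API rather than JDOM. As far as I understand, SAX reports comments only through a LexicalHandler, so a reader with just a ContentHandler never sees them. The class name StructureOnlySax and the sample page are made up for illustration, and I have not confirmed how to get the same effect through JDOM's SAXBuilder.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of JDOM.
class StructureOnlySax {

    /** Collects element names in document order, ignoring comments. */
    static List<String> elementNames(String xml) throws Exception {
        final List<String> names = new ArrayList<String>();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // No LexicalHandler property is set on the reader, so comment
        // events are never delivered; only element events are recorded.
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                names.add(qName);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample with a comment that starts with an extra hyphen.
        String page = "<html><!--- odd comment --><body><p/></body></html>";
        System.out.println(elementNames(page)); // prints [html, body, p]
    }
}
```

This only shows that the structure can be read without the comment ever being handed to my code; it does not build a JDOM tree.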
So I'm posting to ask whether anyone knows a way to do this. Another approach I'm considering is to turn off the verifier so that the illegal comments can be read in, and then filter them out afterwards, but I can't seem to find a method to turn it off. Does anyone know where it is in the javadoc?
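The "read everything, then filter" idea would look roughly like the sketch below. It uses the JDK's W3C DOM classes as a stand-in, because I have not found how to get the comments past JDOM's verifier; CommentStripper and its methods are hypothetical names of my own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class using W3C DOM, not JDOM.
class CommentStripper {

    /** Parses an XML string into a W3C DOM document (comments included). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Recursively removes every comment node below the given node. */
    static void stripComments(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before removal
            if (child.getNodeType() == Node.COMMENT_NODE) {
                node.removeChild(child);
            } else {
                stripComments(child);
            }
            child = next;
        }
    }

    /** Counts the comment nodes below the given node. */
    static int countComments(Node node) {
        int count = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += (c.getNodeType() == Node.COMMENT_NODE) ? 1 : countComments(c);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample with a comment that starts with an extra hyphen.
        Document doc = parse("<html><!--- odd comment --><body><p/></body></html>");
        System.out.println("before: " + countComments(doc));
        stripComments(doc);
        System.out.println("after: " + countComments(doc));
    }
}
```

What I'm missing is the JDOM equivalent of the first step, that is, getting the document built at all when such a comment is present.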
Thanks in advance.