[jdom-interest] don't validate comments

Ian Lea ian at digimem.net
Thu Dec 5 08:29:54 PST 2002

As someone said, it is not just the comments that will cause you
problems.  If you need to parse HTML as opposed to XML you might find
JTidy (sourceforge.net/projects/jtidy) useful.
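
For the narrower case raised in the quoted message (markup that is otherwise well-formed but contains comments an XML parser rejects, such as a "--" inside the comment body), one possible workaround is to strip the comments before parsing. The following is only a rough sketch of that idea, not a JDOM or JTidy feature: the class name CommentStripper and its methods are hypothetical, and it uses a regular expression plus the JDK's built-in DOM parser. It does nothing for the other problems real HTML tends to have (unclosed tags, unquoted attributes and so on), and it would also remove comment-like text inside CDATA sections or scripts.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: removes comments from markup before XML parsing.
public class CommentStripper {
    // Matches "<!--" up to the nearest "-->", including across line breaks.
    private static final Pattern COMMENT =
            Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    // Returns the markup with every comment removed.
    public static String stripComments(String markup) {
        return COMMENT.matcher(markup).replaceAll("");
    }

    // Strips comments, then parses the remainder with the JDK's DOM parser.
    // The remaining markup must still be well-formed XML.
    public static Document parse(String markup) throws Exception {
        String cleaned = stripComments(markup);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(cleaned)));
    }

    public static void main(String[] args) throws Exception {
        // The "--" inside the comment would normally be a fatal XML error.
        String page = "<html><body><!-- bad -- comment --><p>Hello</p></body></html>";
        Document doc = parse(page);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

The cleaned string could equally be handed to JDOM's SAXBuilder through a StringReader. For pages that are not well-formed apart from their comments, a tolerant parser such as JTidy remains the more realistic route.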

ian at digimem.net

> ...
> However, I think it should be possible to take an HTML document with
> some incorrect comment content and extract the content of the
> document, ignoring the comments. Isn't it the content of the
> document that is of interest, not the comments? And as you can see,
> even official government sites have invalid HTML comments.
> In my opinion we should provide an option to ignore the
> comments' content. Don't you agree?

Searchable personal storage and archiving from http://www.digimem.net/
