[jdom-interest] don't validate comments

Elliotte Rusty Harold elharo at metalab.unc.edu
Thu Dec 5 07:53:24 PST 2002


At 4:05 PM +0100 12/5/02, Christian Peter wrote:


>Well, you are right that I don't quite know about the difference 
>between validation and well-formedness check (I thought the latter 
>is part of the first).

Well-formedness is a prerequisite for validity, but it is not the 
same thing. A document can be invalid but still well-formed.
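
To make the distinction concrete, here is a minimal sketch using the JAXP SAX parser that ships with the JDK (not JDOM itself, so it runs without extra jars). The sample document, the class name, and the parses helper are all made up for illustration. The document is well-formed, but it breaks its own DTD by putting an undeclared child element inside note:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedVsValid {
    // Hypothetical sample: syntactically fine, but the DTD says <note>
    // may hold only text, and <extra> is never declared.
    static final String DOC =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE note [ <!ELEMENT note (#PCDATA)> ]>\n"
        + "<note><extra/></note>";

    // Returns true if the parser accepts the document.
    // With validate == false only well-formedness is checked;
    // with validate == true validity errors are also treated as failures.
    static boolean parses(String xml, boolean validate) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validate);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                // DefaultHandler ignores validity errors by default, so rethrow them.
                // Well-formedness violations arrive via fatalError, which already throws.
                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("well-formed: " + parses(DOC, false)); // true
        System.out.println("valid:       " + parses(DOC, true));  // false
    }
}
```

The same input passes the first check and fails the second, which is the sense in which a document can be invalid but still well-formed.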

>However, I think it should be possible to take an HTML document with 
>some incorrect comment content and extract the content of the 
>document, ignoring the comments. Isn't it the content of the 
>document which is of interest, not the comments? And as you can see, 
>even such official governmental sites have non-valid HTML comments.
>In my opinion we should provide the option not to regard the 
>comment's content. Don't you agree?

No. I don't. If it's not well-formed it isn't an XML document, 
period. In a malformed document there is no way to tell what is and 
is not a comment. All well-formedness rules must be adhered to without 
exception. Short of that you don't have an XML document.
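
As a rough illustration, again with the JDK's JAXP SAX parser rather than JDOM, and with made-up class, method, and sample strings: a comment that contains a double hyphen is a well-formedness error, so a conforming parser reports a fatal error for the whole document instead of skipping the comment.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class MalformedComment {
    // Hypothetical samples: identical except for the "--" inside the comment.
    static final String GOOD = "<doc><!-- a fine comment --><p>text</p></doc>";
    static final String BAD  = "<doc><!-- a bad -- comment --><p>text</p></doc>";

    // Returns true only if the parser reaches the end of the document
    // without hitting a fatal (well-formedness) error.
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            // DefaultHandler.fatalError rethrows the SAXParseException,
            // so any well-formedness violation lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(GOOD)); // true
        System.out.println(isWellFormed(BAD));  // false
    }
}
```

Nothing after the bad comment is delivered as trustworthy content. The parser has no principled way to decide where the comment was supposed to end.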
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+


