[jdom-interest] CDATA inconsistency

Elliotte Rusty Harold elharo at metalab.unc.edu
Sat Nov 2 08:22:01 PST 2002


At 11:08 PM -0800 11/1/02, Malachi de AElfweald wrote:
>It would be against XML spec to check the characters within the 
>CDATA, since the spec
>says that CDATA is "unparsed character data". Seems like parsing it 
>wouldn't fit the description, eh?
>

No, that's not quite true. There are a number of characters that 
cannot appear in a CDATA section. These include most C0 controls, such 
as null and vertical tab, unmatched halves of surrogate pairs, and a 
few other code points that are not legal XML characters. The 
three-character sequence ]]> is also illegal, because it ends the section.
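
A minimal sketch of such a check, assuming the XML 1.0 Char production 
(tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, 
and U+10000 to U+10FFFF) as the definition of a legal character. The 
class and method names are hypothetical and are not part of JDOM:

```java
// Hypothetical helper, not a JDOM API: tests whether a string could be
// the content of a single CDATA section under XML 1.0.
final class CdataCheck {

    // True if the code point matches the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static boolean isLegalCdata(String s) {
        // The end delimiter cannot appear inside the section itself.
        if (s.contains("]]>")) {
            return false;
        }
        for (int i = 0; i < s.length(); ) {
            // For an unmatched surrogate, codePointAt returns the lone
            // surrogate value, which isXmlChar rejects.
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

Under these assumptions, markup characters such as < and & pass the 
check, since they are not interpreted inside a CDATA section, while a 
null, a vertical tab, a lone surrogate, or ]]> causes it to fail.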
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+


