[jdom-interest] DOCTYPE still giving me the worst headache!

Elliotte Rusty Harold elharo at metalab.unc.edu
Wed Jan 30 04:49:49 PST 2002


At 7:33 PM -0600 1/29/02, Jason Long wrote:


>org.jdom.JDOMException: Error on line 2 of document
>file:/G:/www.che.com/companylistings/10.html: White space is required
>between the public identifier and the system identifier.
>
>This is the line JDOM is complaining about (I am using Beta 7).
>
><!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>

Your DOCTYPE declarations have a public ID but no system ID. This is 
malformed according to the XML 1.0 spec, which requires a system ID 
after the public ID.
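
To make the rule concrete: the XML 1.0 grammar allows either SYSTEM 
followed by a system literal, or PUBLIC followed by a public ID literal 
and a system literal. Here is a minimal sketch that checks both forms 
with the JDK's built-in JAXP parser rather than JDOM. The class name 
DoctypeCheck and the helper isWellFormed are made up for illustration, 
and the system ID in the second document is simply the usual W3C URL 
for that DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class DoctypeCheck {

    // Hypothetical helper: true if the JDK's parser accepts the text as well-formed.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Supply an empty DTD so the parser never fetches the real one.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors arrive as SAXParseException.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The declaration from the report: public ID only.
        String publicOnly =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"
            + "<html/>";
        // The same declaration with a system literal added.
        String publicAndSystem =
            "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" "
            + "\"http://www.w3.org/TR/html4/loose.dtd\"><html/>";
        System.out.println("public only:       " + isWellFormed(publicOnly));
        System.out.println("public and system: " + isWellFormed(publicAndSystem));
    }
}
```

The first document is the one from the report and should be rejected; 
the second differs only by the added system literal.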

>JDOM had no problem writing this to disk.  I cannot understand why it cannot
>read it back again.  I would appreciate any help with this matter.  I have
>posted this problem before and seen others' postings as well, but I have yet
>to see anything that will help me.  In the past I used a regex to strip this
>out of the file before sending to JDOM, but I do not like this approach at
>all.
>

If JDOM allows you to create documents like this, then that's a bug 
that needs to be fixed.

A quick look at the code of DocType shows that there's no test for 
whether the public or system ID is null. This should probably be 
fixed.

More importantly, there's a design flaw in this class. The empty 
string is a legal public ID. However, we don't distinguish between 
the empty string as a public ID and no public ID at all. We should 
probably allow null public IDs to indicate that there is only a 
system ID. That is, change


     public DocType(String elementName, String systemID) {
         this(elementName, "", systemID);
     }

to


     public DocType(String elementName, String systemID) {
         this(elementName, null, systemID);
     }

The system ID can also legally be the empty string, so we also need to change


     public DocType(String elementName) {
         this(elementName, "", "");
     }

to


     public DocType(String elementName) {
         this(elementName, null, null);
     }
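
With null meaning "absent", the missing check mentioned earlier becomes 
easy to write. The following is only a sketch under that assumption, not 
the actual JDOM class: DocTypeSketch and its accessors are hypothetical 
names, and throwing IllegalArgumentException is just one possible policy 
for a public ID that has no system ID.

```java
public class DocTypeSketch {

    private final String elementName;
    private final String publicID;  // null means "no public ID"
    private final String systemID;  // null means "no system ID"

    public DocTypeSketch(String elementName, String publicID, String systemID) {
        if (elementName == null) {
            throw new IllegalArgumentException("element name is required");
        }
        // XML 1.0 ExternalID: PUBLIC must be followed by a system literal too.
        if (publicID != null && systemID == null) {
            throw new IllegalArgumentException(
                "a public ID requires a system ID");
        }
        this.elementName = elementName;
        this.publicID = publicID;
        this.systemID = systemID;
    }

    public DocTypeSketch(String elementName, String systemID) {
        this(elementName, null, systemID);
    }

    public DocTypeSketch(String elementName) {
        this(elementName, null, null);
    }

    public String getElementName() { return elementName; }
    public String getPublicID()    { return publicID; }
    public String getSystemID()    { return systemID; }

    public static void main(String[] args) {
        // An empty string is now a real (if odd) system ID, distinct from null.
        DocTypeSketch emptySystem = new DocTypeSketch("html", "");
        System.out.println(emptySystem.getSystemID() != null);

        try {
            new DocTypeSketch("html", "-//W3C//DTD HTML 4.01 Transitional//EN", null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this would have refused to build the DOCTYPE from the 
original report instead of writing it to disk.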

The next step is to change the XMLOutputter and SAXBuilder and 
DOMBuilder logic to use null to indicate no public ID and no system 
ID instead of the empty string. More on that shortly.
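
As a rough idea of what the null-based logic on the output side might 
look like, here is a self-contained sketch. It is not the XMLOutputter 
code: DocTypeFormatter and format are hypothetical names, and the 
quoting is simplified (a literal that itself contains a double quote 
would need single quotes instead).

```java
public class DocTypeFormatter {

    // Hypothetical helper: builds the DOCTYPE declaration text.
    // null means the ID is absent; "" is a legal, present ID.
    static String format(String elementName, String publicID, String systemID) {
        StringBuilder out = new StringBuilder("<!DOCTYPE ").append(elementName);
        if (publicID != null) {
            if (systemID == null) {
                // Writing PUBLIC without a system literal would be malformed.
                throw new IllegalArgumentException(
                    "a public ID requires a system ID");
            }
            out.append(" PUBLIC \"").append(publicID).append("\" \"")
               .append(systemID).append('"');
        } else if (systemID != null) {
            out.append(" SYSTEM \"").append(systemID).append('"');
        }
        return out.append('>').toString();
    }

    public static void main(String[] args) {
        System.out.println(format("html", null, null));
        System.out.println(format("html", null, "http://www.w3.org/TR/html4/loose.dtd"));
        System.out.println(format("html",
            "-//W3C//DTD HTML 4.01 Transitional//EN",
            "http://www.w3.org/TR/html4/loose.dtd"));
    }
}
```

The point of the null test is that format("html", "", "") still writes 
PUBLIC "" "", while format("html", null, null) writes a bare 
<!DOCTYPE html>.
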
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+


