[jdom-interest] Verbose XHTML 1.1 Doctype

Stein Erik Berget seberget at escenic.com
Thu Mar 25 00:07:26 PST 2004

On Wed, 24 Mar 2004 18:47:47 +0000, David Dorward <david at dorward.me.uk> wrote:

> I have a number of XHTML 1.1 documents, all conforming to the same
> template, which I want to extract some data from and then insert that
> data into different XHTML 1.1 documents.
> As a first step I am trying to read in a document and then print it out
> again without any modification. I've run into two issues:
> 1. It appears to be downloading the DTD from the w3c website - this
> takes time and bandwidth.
> 2. It seems to be expanding the Doctype line (example below).
> Is there any way to stop this? I'd like to leave the Doctype alone and
> save time on reading the DTD (I don't care about validation - that is
> handled elsewhere). I couldn't find anything looking at the docs, but I
> suspect this is due to not knowing what to look for.

Been there, done that:

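//path to find the catalog.xml file
String[] cat = {"file:///catalog.xml"};
XMLCatalogResolver resolver = new XMLCatalogResolver();
//hand the catalog list to the resolver
resolver.setCatalogList(cat);

//true = validating builder
SAXBuilder builder = new SAXBuilder(true);
//let the builder look up DTDs through the catalog
builder.setEntityResolver(resolver);
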
//build the document
Document document = builder.build(new 

You will need the following import as well...
import org.apache.xerces.util.XMLCatalogResolver;

This solution uses the XML catalog feature of Xerces. My catalog.xml file
looks like this:

<!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog 
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog" 
   <public publicId="-//W3C//DTD XHTML 1.1//EN" uri="xhtml11-flat.dtd" />

You can download xhtml11-flat.dtd from the w3.org site at this URL: 

By using the 'flat' variant you don't have to add all the other referenced
DTDs and modules.
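
For anyone who wants to try the catalog idea without Xerces or JDOM on the
classpath, here is a minimal sketch of the same kind of lookup using the
JDK's own javax.xml.catalog API (Java 9 and later). That is a different API
from the XMLCatalogResolver used above, swapped in only so the example runs
on a bare JDK. The class name CatalogSketch, the helper names, the temp
directory and the two-line stand-in DTD are all made up for illustration;
the real xhtml11-flat.dtd would take the stand-in's place.

```java
import java.io.StringReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;
import javax.xml.catalog.CatalogResolver;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CatalogSketch {
    static final String PUBLIC_ID = "-//W3C//DTD XHTML 1.1//EN";
    static final String SYSTEM_ID = "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd";

    // Hypothetical test document: uses &nbsp;, which only the DTD declares.
    static final String DEMO_XML =
        "<!DOCTYPE html PUBLIC \"" + PUBLIC_ID + "\" \"" + SYSTEM_ID + "\">"
        + "<html>a&nbsp;b</html>";

    // Writes a throwaway catalog.xml next to a tiny stand-in DTD
    // (assumption: NOT the real flat DTD) and returns the catalog's URI.
    static URI writeDemoCatalog() throws Exception {
        Path dir = Files.createTempDirectory("catalog-demo");
        String dtd = "<!ELEMENT html ANY>\n<!ENTITY nbsp \"&#160;\">\n";
        Files.write(dir.resolve("xhtml11-flat.dtd"), dtd.getBytes(StandardCharsets.UTF_8));
        String catalog = "<?xml version=\"1.0\"?>\n"
            + "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">\n"
            + "  <public publicId=\"" + PUBLIC_ID + "\" uri=\"xhtml11-flat.dtd\"/>\n"
            + "</catalog>\n";
        Path file = dir.resolve("catalog.xml");
        Files.write(file, catalog.getBytes(StandardCharsets.UTF_8));
        return file.toUri();
    }

    // Builds a resolver that answers DTD lookups from the given catalog.
    static CatalogResolver resolverFor(URI catalogUri) {
        return CatalogManager.catalogResolver(CatalogFeatures.defaults(), catalogUri);
    }

    // Parses the XML, sending every external entity lookup through the resolver.
    static Document parse(String xml, CatalogResolver resolver) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        CatalogResolver resolver = resolverFor(writeDemoCatalog());
        // Shows where the public ID is mapped to: a local file, not w3.org.
        System.out.println(resolver.resolveEntity(PUBLIC_ID, SYSTEM_ID).getSystemId());
        Document doc = parse(DEMO_XML, resolver);
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

The point of the design is that the document keeps its normal public and
system identifiers; only the parser's lookup is redirected to the local
copy.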

By using something similar to this you still get a validated document,
with fast parsing.
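
If validation really is handled elsewhere, another option may be to tell
the parser not to load the external DTD at all. The sketch below is an
assumption-laden example rather than part of the setup above: it uses the
JDK's built-in JAXP parser, which is Xerces-derived and recognizes the
Xerces load-external-dtd feature (other parsers may reject that feature
name). SkipDtdSketch and parseWithoutDtd are hypothetical names. Note that
named entities such as &nbsp; are declared in the DTD, so documents that
use them may not parse cleanly this way, and this does nothing about how
the DOCTYPE is printed on output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SkipDtdSketch {
    // Xerces feature: when false, a non-validating parser skips the external DTD.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Hypothetical sample input with the usual XHTML 1.1 DOCTYPE and no named entities.
    static final String DEMO_XML =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.1//EN\" "
        + "\"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd\">"
        + "<html><head><title>t</title></head><body><p>hi</p></body></html>";

    // Parses without validating and without fetching the DTD named in the DOCTYPE.
    static Document parseWithoutDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseWithoutDtd(DEMO_XML);
        // The DOCTYPE identifiers are still recorded even though the DTD was not read.
        System.out.println(doc.getDoctype().getPublicId());
    }
}
```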

Stein Erik Berget
