I have now done further testing, and it cannot be the file contents, because 
my workaround for the problem is to open the URL, read the stream contents 
into a StringBuffer, and then use the SAXBuilder.build(String) method to 
parse the XML. This works fine.
I use JDOM with Xerces and Xalan. Does Xerces get the encoding part right? 
Does anyone know?
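
For reference, here is a minimal sketch of that kind of workaround. To keep it 
self-contained it uses the JAXP classes that ship with the JDK rather than 
JDOM, and the class and method names (BufferedXmlLoader, readAll, parseText, 
load) are made up for illustration. It assumes the caller already knows which 
charset the bytes are really in. Because the parser is handed characters 
instead of bytes, it does not have to work out the encoding itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.net.URL;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of JDOM or Xerces.
class BufferedXmlLoader {

    // Read the whole stream and decode it with a charset chosen by the caller.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), charset);
    }

    // Parse text that is already decoded. The parser receives a character
    // stream, so it performs no byte-level encoding detection.
    static Document parseText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    // Open the URL, buffer its contents, then parse the buffered text.
    static Document load(URL url, Charset charset) throws Exception {
        try (InputStream in = url.openStream()) {
            return parseText(readAll(in, charset));
        }
    }
}
```

With JDOM the last step would presumably go through a Reader-based build 
method instead of the JAXP call shown here, but the idea is the same: decode 
first, parse second.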


>At 9:34 PM -0800 12/11/02, Jason Hunter wrote:
>>When you use a URL the underlying parser determines the encoding,
>>typically by looking at the declaration.
>Not necessarily. In an HTTP environment, the encoding specified by the 
>MIME type takes precedence over the encoding specified by the XML document 
>(though not all parsers get this right). If the HTTP header says the 
>document is UTF-8 and the encoding declaration says ISO 8859-1, then the 
>parser uses UTF-8. I have to double check this, but I also think that if 
>the HTTP header says the document is text/xml without any encoding, then 
>the parser picks US-ASCII regardless of what the encoding declaration 
>says. Again, only some parsers correctly implement the spec here.
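
The precedence described above can be sketched as a small function. This is 
only an illustration, assuming the RFC 3023 rules for text/xml and 
application/xml; the names XmlHttpEncoding, charsetParam and 
effectiveEncoding are hypothetical, and byte order mark detection is left 
out. It says nothing about what Xerces actually does.

```java
import java.util.Locale;

// Hypothetical helper illustrating the header-over-declaration rule.
class XmlHttpEncoding {

    // Return the charset parameter of a Content-Type value, or null if absent.
    static String charsetParam(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String v = p.substring("charset=".length()).trim();
                // Strip optional surrounding quotes.
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1);
                }
                return v.isEmpty() ? null : v;
            }
        }
        return null;
    }

    // Pick the encoding a parser should use, given the HTTP Content-Type
    // and the encoding named in the XML declaration (null if none).
    static String effectiveEncoding(String contentType, String declaredEncoding) {
        String fromHeader = charsetParam(contentType);
        if (fromHeader != null) {
            return fromHeader; // an explicit charset in the header wins
        }
        String mediaType = "";
        if (contentType != null) {
            int semi = contentType.indexOf(';');
            mediaType = (semi < 0 ? contentType : contentType.substring(0, semi))
                    .trim().toLowerCase(Locale.ROOT);
        }
        if (mediaType.equals("text/xml")) {
            return "US-ASCII"; // assumed RFC 3023 default for text/xml
        }
        // Otherwise fall back to the declaration, then to the XML default
        // (simplified: a byte order mark is not considered here).
        return declaredEncoding != null ? declaredEncoding : "UTF-8";
    }
}
```

For example, effectiveEncoding("text/xml; charset=UTF-8", "ISO-8859-1") 
yields "UTF-8", while effectiveEncoding("text/xml", "ISO-8859-1") yields 
"US-ASCII" under these assumptions.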
