[jdom-interest] JDOM Entity references for special characters

Jason Hunter jhunter at xquery.com
Thu May 26 11:29:02 PDT 2005


Your characters are being written out in UTF-8.  It's most definitely 
not "gibberish".  :)  It just looks that way when your viewer isn't UTF-8 
aware.  You can change the output encoding to ASCII, and then chars > 127 
will be auto-escaped.  Or you can use Latin-1, which will escape chars > 255. 
If you want to control the escaping behavior and write specific entities 
for certain chars, you'll need to subclass XMLOutputter.
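
To make the escaping idea concrete, here is a minimal sketch in plain Java, 
with no JDOM dependency so it can be run on its own.  The class name, the 
method names and the two-entry entity table are all hypothetical, made up 
for illustration.  In an XMLOutputter subclass, logic like escape() would 
most likely be called from an overridden escapeElementEntities(String) 
(that is the hook as I recall it from JDOM 1.x; check the Javadoc for your 
version).  The misread() helper only shows why UTF-8 output looks like 
gibberish in a viewer that assumes a single-byte encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class EntityEscapeSketch {
    // Hypothetical table: code point -> entity name declared in your DTD.
    private static final Map<Integer, String> NAMED = new HashMap<>();
    static {
        NAMED.put(0x201C, "ldquo"); // left double quotation mark
        NAMED.put(0x201D, "rdquo"); // right double quotation mark
    }

    // Writes named entities for chars in the table, the usual XML escapes
    // for & < >, and numeric character references for anything else > 127.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            String name = NAMED.get(cp);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else if (cp == '&') {
                out.append("&amp;");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '>') {
                out.append("&gt;");
            } else if (cp > 127) {
                out.append("&#x")
                   .append(Integer.toHexString(cp).toUpperCase())
                   .append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Simulates a viewer that decodes UTF-8 bytes as Latin-1: each
    // multi-byte character turns into several unrelated characters.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // prints: &ldquo; test valid zone &rdquo;
        System.out.println(escape("\u201C test valid zone \u201D"));
        // prints: caf&#xE9; &amp; &lt;tea&gt;
        System.out.println(escape("caf\u00E9 & <tea>"));
        // prints: 3 (one character became three after the wrong decode)
        System.out.println(misread("\u201C").length());
    }
}
```

If all you need is numeric escapes, the subclass is unnecessary: setting the 
encoding on the outputter's Format (for example "US-ASCII") should be enough.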

-jh-

Sven Deckers wrote:

> Hello,
>  
> I've encountered the following problem in a project I'm working on :
>  
> 1. When I parse an XML file containing the following special characters:
> &ldquo; test valid zone &rdquo;
>     they obviously have to be declared in the DTD, otherwise the JDOM 
> parser will throw an exception.
>  
> 2. In the DTD I've included the following W3C .ent files: 
> "xhtml-lat1.ent", "xhtml-special.ent" and "xhtml-symbol.ent", as 
> recommended in "XML in a Nutshell" (O'Reilly).
>  
> 3. The file now is correct according to XMLSpy
>  
> 4. When I start parsing it with JDOM, in order to generate 
> INSERT-statements for a MySQL Database, these Entities are *translated* 
> to gibberish : “ test invalid zone �
>  
> 5. When I explicitly put them in my DTD : <!ENTITY ldquo "[ldquo ]"> 
> they are translated to [ldquo ] in the INSERT statement.
>  
> My question: how can I tell JDOM to just check that the entity is valid, 
> and then NOT translate it?
>  
> Thank you in advance,
>  
> Sven.