[jdom-interest] how to parse UTF-8 encoded files with JDOM b7 and Xerces

Aron Kramlik aron.kramlik at itouch.com.au
Mon Dec 2 05:10:58 PST 2002


I have an XML file that is encoded in UTF-8 and it has a mixture of English
and Cantonese characters.  When I parse the file using SAXParser I don't
get the real Cantonese characters back, only question marks (?).

If I set the file.encoding option of the JVM to be UTF-8 it works fine.

Now, my question is: how can I control the encoding for each parse within
a JVM, given that changing this global setting from its default would not
be suitable for me?
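
For illustration, here is a minimal sketch of one way the encoding might be
fixed per parse. It assumes the problem is on the input side, i.e. that the
file reaches the parser through a Reader that falls back to the platform
default charset (as FileReader does). The class name PerParseEncoding and the
helper parseUtf8 are hypothetical, and the sketch uses the plain JAXP SAX API
rather than JDOM itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class; not part of JDOM or Xerces.
class PerParseEncoding {

    /** Parses XML from a byte stream as UTF-8 and returns all character data. */
    static String parseUtf8(InputStream in) {
        // The charset is given explicitly for this one parse, so the JVM-wide
        // file.encoding property is not consulted when decoding the bytes.
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        InputSource source = new InputSource(reader);

        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Collect the decoded character data of every text node.
                text.append(ch, start, length);
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML parse failed", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // \u4f60\u597d are two Chinese characters, written as escapes so the
        // example does not depend on the encoding of this source file.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<greeting>Hello \u4f60\u597d</greeting>";
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        String result = parseUtf8(new ByteArrayInputStream(bytes));
        // Compare in memory instead of printing the characters, because
        // printing would again depend on the console's encoding.
        System.out.println(result.equals("Hello \u4f60\u597d"));
    }
}
```

The same idea should carry over to JDOM if SAXBuilder is given a Reader or
InputStream instead of a file name, assuming b7 offers the same build
overloads as later releases. If the question marks only appear when the text
is printed or written out, the output side would need an explicit charset
as well, for example an OutputStreamWriter created with UTF-8.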

Thanks for your time and help,

Aron Kramlik.
