[jdom-interest] Unicode (UTF-16) problem with the parse

Jason Hunter jhunter at acm.org
Wed Dec 5 19:45:54 PST 2001

Handling the encoding is the job of the parser, not the Java
programmer.  Make sure the document says 'encoding="UTF-16"' in its
declaration, and the parser will read it properly.
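
A rough sketch of that idea, using the JDK's built-in JAXP parser
instead of JDOM so it runs without extra jars (that swap is an
assumption made for illustration; SAXBuilder likewise has build()
overloads that take a File or an InputStream).  The class name
Utf16ParseSketch and the method roundTrip are hypothetical.  The point
is only that the parser receives undecoded bytes and works out the
encoding itself from the byte-order mark and the declaration:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class Utf16ParseSketch {

    // Hypothetical helper: encodes a small document as UTF-16 bytes,
    // parses those bytes, and returns the root element's text.
    static String roundTrip(String text) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-16\"?>"
                   + "<greeting>" + text + "</greeting>";

        // Java's "UTF-16" charset writes a byte-order mark when encoding.
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_16);

        // Hand the parser a raw byte stream, not a Reader, so that it
        // (and not our code) decides how to decode the bytes.
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(in);
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String greek = "\u03b1\u03b2\u03b3"; // Greek alpha, beta, gamma
        // Reports whether the Greek text survived the round trip.
        System.out.println(greek.equals(roundTrip(greek)));
    }
}
```

With JDOM the equivalent would be to pass the file or a FileInputStream
to the builder directly, with no InputStreamReader in between.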


> "Defeng Ma (Holyrood)" wrote:
> Hi everyone,
> Recently I developed a project with JDOM. Everything was fine until
> our Greek partner asked us to support Greek characters in the Java
> application. After some testing and discussion with people in Greece,
> we found that I need to use the UTF-16 encoding. The Java application
> works fine with UTF-16, but I have a problem with JDOM parsing.
> I use the Java application to capture the Greek characters in the
> user interface, and use XMLOutputter to write the XML file with the
> UTF-16 encoding. However, when I use JDOM to read the file back, the
> parse fails.
> Here is my code to read the XML back:
>     // code starts here; fname is the name of the XML file
>     // created by the XMLOutputter
>     SAXBuilder builder = new SAXBuilder();
>     FileInputStream fis = new FileInputStream(fname);
>     InputStreamReader isr = new InputStreamReader(fis, "UTF-16");
>     Document anotherDocument = builder.build(isr);
>     return anotherDocument;
>     // code ends
> When I try to run it, I get the following error message:
> // error message here
>     org.jdom.JDOMException: Error on line 1: Character conversion error: "Missing byte-order mark" (line number may be too low).
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:300)
>         at org.jdom.input.SAXBuilder.build(SAXBuilder.java:650)
>         at qdtJDom.readDocument(qdtJDom.java:134)
>         at qMetaJDom.<init>(qMetaJDom.java:33)
> // error message end
> Does anyone have any idea how to solve this problem? By the way, when
> I change the encoding to UTF-8, XMLOutputter and SAXBuilder work
> without any error message, but all the Greek characters are replaced
> by ??????.
> Thanks in advance for any tips,
> Defeng
> University of Edinburgh
