[jdom-interest] encoding="MS950"

Stuart stuart at truetel.com
Tue Nov 23 02:41:31 PST 2004


All,

Regarding the encoding problem, I initially thought I might need to install a
Chinese version of Windows (I may still try this at some point). However, with
jdk1.4 and jdk1.3 I am able to use the MS950 encoding as follows:

			String test = "hello"; //hack
			byte[] bytes = test.getBytes("MS950"); //hack

If I make up some unknown encoding it will fail (as expected):

			String test = "hello"; //hack
			byte[] bytes = test.getBytes("dhhfg"); //hack

	java.io.UnsupportedEncodingException: dhhfg
        at sun.io.Converters.getConverterClass(Converters.java:125)
        at sun.io.Converters.newConverter(Converters.java:156)
        at sun.io.CharToByteConverter.getConverter(CharToByteConverter.java:64)
        at java.lang.StringCoding.encode(StringCoding.java:368)
        at java.lang.String.getBytes(String.java:591)
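
As a cross-check of the two cases above, the JVM can be asked directly whether
it resolves a charset name. This is only a minimal sketch: it assumes jdk1.4 or
later (java.nio.charset is not available in jdk1.3), and the class name
EncodingCheck and the method isKnown are hypothetical, chosen for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper; the class and method names are illustrative only.
class EncodingCheck {

    // Returns true if this JVM can resolve the given charset name or alias.
    static boolean isKnown(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Raised for syntactically illegal names
            // (IllegalCharsetNameException is a subclass).
            return false;
        }
    }

    public static void main(String[] args) {
        // Names taken from this thread, plus the made-up one.
        String[] names = {"MS950", "x-windows-950", "dhhfg"};
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + isKnown(names[i]));
        }
    }
}
```

Whether MS950 reports as supported depends on the JRE in use (for example,
whether the extended charsets are installed), so the output will vary by machine.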

What is different about JDOM or the SAXBuilder?  The XML (VXML) document I
am testing with is shown below.  Note that I added the encoding attribute
myself: the document was not created using any Chinese input and does not
need the attribute.  The 'real' XML documents I am parsing are much more
complicated, but the problem is the same.

<?xml version="1.0" encoding="MS950"?>
<vxml version="1.0">
	<form id="hello">
		<block>Hello World!</block>
	</form>
</vxml>

Any help will be much appreciated.

Regards,

Stuart

-----Original Message-----
From: Stuart [mailto:stuart at truetel.com]
Sent: Tuesday, November 23, 2004 12:36 AM
To: jdom-interest at jdom.org
Subject: RE: [jdom-interest] encoding="MS950"


All,

Sorry for the multiple postings but I think I was wrong about MS950 not
being supported in jdk1.4.  I also discovered the following entry in the
jdk1.4 information:

x-windows-950 MS950 Windows Traditional Chinese

Not sure what I am doing wrong.  *8-(

Regards,

Stuart


-----Original Message-----
From: Stuart [mailto:stuart at truetel.com]
Sent: Monday, November 22, 2004 10:51 PM
To: Elliotte Harold
Cc: jdom-interest at jdom.org
Subject: RE: [jdom-interest] encoding="MS950"


All,

I originally posted a question about the SAXBuilder supporting the encoding
format MS950.  I received a reply stating that encoding support is determined
by the JDK (not the parser).  I also found that MS950 no longer appears to be
supported under jdk1.4, BUT in jdk1.3 it seems to be supported
(http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html).  I
downloaded the international JRE for jdk1.3.1_13 but I still get the encoding
not supported error:

STUART$java -version
java version "1.3.1_13"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_13-b03)
Java HotSpot(TM) Client VM (build 1.3.1_13-b03, mixed mode)

Here is the error I am getting:

org.jdom.input.JDOMParseException: Error on line 0: The encoding "MS950" is not supported.
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:468)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:810)
        at org.jdom.input.SAXBuilder.build(SAXBuilder.java:789)
	  ...

Do I need to do something in order to 'enable' the international support?  I
opened the i18n.jar and inside could see a class called
CharToByteMS950.class.

Also is there a way of disabling the encoding check (basically just ignore
this field and parse the rest of the document)?
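
On skipping the encoding check: one approach that may work is to hand the
parser a character stream instead of raw bytes. The SAX InputSource
documentation states that a parser reads a character stream directly and
disregards any encoding declaration found in it. The sketch below uses the
JDK's built-in JAXP parser rather than JDOM so that it runs without extra
jars; the class name ReaderParse and the method rootName are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are illustrative only.
class ReaderParse {

    // Parses XML that is already decoded into characters and returns the
    // name of the root element. Because the parser receives a Reader, it
    // does not need to decode bytes using the declared encoding.
    static String rootName(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // The test document from this thread, held as a Java string.
        String xml =
                "<?xml version=\"1.0\" encoding=\"MS950\"?>\n"
                + "<vxml version=\"1.0\">\n"
                + "  <form id=\"hello\">\n"
                + "    <block>Hello World!</block>\n"
                + "  </form>\n"
                + "</vxml>\n";
        System.out.println(rootName(xml));
    }
}
```

For a file on disk, the same idea would be to wrap a FileInputStream in an
InputStreamReader with the charset named explicitly; JDOM's SAXBuilder also
has a build(Reader) overload. If the underlying parser is Xerces, its
http://apache.org/xml/features/allow-java-encodings feature may also be worth
a look, though that is untested here.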

Regards,

Stuart



