<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=UTF-8" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

OK, now I'm a little confused.   I guess this is an XML question and

not really a JDOM question, but perhaps someone can explain it.<br>

<br>

Angela Amoateng wrote:

<blockquote cite="mid:20070521225023.1852uiurvcccso0k@impmail.kcl.ac.uk"

 type="cite"><br>

This is the code in my XML document (by the way, romaji is romanised

Japanese): <br>

  <br>

&lt;?xml version="1.0" encoding="UTF-8"?&gt; <br>

  <br>

&lt;dictionary&gt; <br>

   &lt;word&gt; <br>

       &lt;noun&gt; <br>

           &lt;english&gt;book&lt;/english&gt; <br>

           &lt;romaji&gt;hon&lt;/romaji&gt; <br>

           &lt;hiraganaSym&gt;ほん&lt;/hiraganaSym&gt; <br>

           &lt;hiraganaNum&gt;&amp;#x307B;&amp;#x3093;&lt;/hiraganaNum&gt; <br>

       &lt;/noun&gt; <br>

</blockquote>

<br>

Where I get lost is in the &lt;hiraganaSym&gt; element.   As far as I can

tell, the characters inside it are not part of any 8-bit code (ASCII,

UTF-8 or whatever).  Java has no problem with them because all String

objects are built on Unicode, but what does <u>encoding="UTF-8"</u> in

the header mean if these symbols can show up in the document?<br>

<br>
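For what it's worth, this can be checked from Java itself. The sketch
below is only an illustration: the class name Utf8Check and its two
helper methods are made up here and are not part of JDOM. It encodes the
two hiragana characters as UTF-8 and prints the resulting bytes. If I
have this right, UTF-8 uses 8-bit code units but spends several of them
on a single character, which would explain how these symbols can appear
in a document declared as UTF-8.<br>
<br>
<pre>
```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of JDOM.
class Utf8Check {

    // Encode a string as UTF-8 and return its bytes as space-separated hex.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() != 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", Byte.toUnsignedInt(b)));
        }
        return sb.toString();
    }

    // Decode the UTF-8 bytes again and check that the original string comes back.
    static boolean roundTrips(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8).equals(s);
    }

    public static void main(String[] args) {
        // The same two characters as the numeric references in hiraganaNum.
        String hon = "\u307B\u3093";
        System.out.println(utf8Hex(hon));     // prints "E3 81 BB E3 82 93"
        System.out.println(roundTrips(hon));  // prints "true"
    }
}
```
</pre>
Each character comes out as three bytes, so the two-character string
occupies six bytes in the file, and decoding those bytes gives back the
original String.<br>
<br>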

<pre class="moz-signature" cols="72">-- 

Alan Deikman

ZNYX Networks</pre>

</body>

</html>