[jdom-interest] special characters problem

manish.sharan at divlogic.com manish.sharan at divlogic.com
Thu Oct 30 10:41:55 PST 2003


I recently solved this kind of problem by enforcing the charset encoding all the
way from the JVM "file.encoding" option to passing the charset encoding name
whenever I use an InputStream to read external data.
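
As a rough illustration of the InputStream part, here is a minimal sketch using
only the JDK. The class and method names (ExplicitCharsetRead, readAll) are made
up for this example and are not from my actual code, and UTF-8 is just an
assumed choice of encoding:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetRead {

    // Hypothetical helper: reads the whole stream as text, decoding with the
    // named charset instead of whatever the platform default happens to be.
    static String readAll(InputStream in, String charsetName) {
        try (Reader reader = new InputStreamReader(in, charsetName)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            // Also covers an unsupported charset name
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Registered mark and copyright symbol, written as Unicode escapes so
        // the source file's own encoding does not matter.
        String original = "<name>Acme\u00AE \u00A9 2003</name>";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        String decoded = readAll(new ByteArrayInputStream(utf8), "UTF-8");
        System.out.println(decoded.equals(original)); // true
    }
}
```

The JVM side of this is the system property set at startup, for example
-Dfile.encoding=UTF-8 on the java command line. A Reader built this way should
also be usable with builder.build(Reader), the same overload the quoted code
calls with a StringReader.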

The behavioral difference between Windows and Unix/Linux with respect to special
characters is due to their differing default charset encodings.
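
To see the effect in isolation, here is a small sketch. It assumes the Unix JVM
was defaulting to an ASCII-like charset, which I have not confirmed for your
machines. The class and method names are invented for the example:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Encodes text to bytes with one charset, then decodes with the same one.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        String text = "Acme\u00AE"; // "Acme" followed by the registered mark

        // What this JVM uses when no charset is named explicitly
        System.out.println("default charset: " + Charset.defaultCharset());

        // US-ASCII has no registered mark, so it is replaced by "?"
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII).equals("Acme?")); // true

        // ISO-8859-1 and UTF-8 can both represent it, so the text survives
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1).equals(text)); // true
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

Any call that does not name a charset, such as xmlString.getBytes() with no
argument, goes through the default charset printed on the first line, so the
same code can behave differently on two machines.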

Hope this helps.
-manish


Quoting Pramodh Peddi <peddip at contextmedia.com>:

> Hi,
> I am using JDOM Beta 8 for XML parsing. We happen to have a lot
> of special characters (like registered marks, copyright symbols, trade
> marks, and many other funky chars). After building the document, the parser
> is converting the characters into "?" characters. This is what I am doing to
> build the document:
> 
> ****************************************************************************
> // Method to return a Document object given an xml String
> public Document getDocumentfromString(String xmlString) throws Exception {
>     Document schemaDoc = null;
>     SAXBuilder builder = new SAXBuilder(false);
>     String resultingXML = null;
>     if (!StringUtils.isEmpty(xmlString)) {
>         try {
>             schemaDoc = builder.build(new StringReader(xmlString));
>         } catch (JDOMException jdomex) {
>             throw new Exception("Document could not be built: " + jdomex);
>         }
>     } else {
>         log.info("xmlString is null");
>     }
>     return schemaDoc;
> }
> ****************************************************************************
> 
> It is working fine on Windows (2000) machine, but spitting "?" symbols in
> place of special chars on UNIX machines.
> 
> I used to use
> 
>     schemaDoc = builder.build(new java.io.ByteArrayInputStream(xmlString.getBytes()));
> 
> to build the document in place of StringReader, but it was changing the
> encoding and throwing an exception saying the special chars don't belong to
> UTF-8. So I changed it to StringReader, which doesn't throw exceptions but
> converts the special chars to "?".
> 
> I also tried using
> 
>     builder.build(new java.io.ByteArrayInputStream(xmlString.getBytes("UTF-8")));
> 
> But that didn't help either.
> 
> 
> 
> Again, the "?" characters occur only on UNIX machines; it works fine on
> Windows machines.
> 
> 
> 
> I would appreciate any help.
> 
> Thank you,
> pramodh.