[jdom-interest] XML escaping and unescaping

John Caron caron at unidata.ucar.edu
Mon Dec 6 17:26:05 PST 2004


David Wall wrote:

>You are trying to put a CTRL-A character in, and that's not a legal value
>for XML data.  Are you trying to store a "1" character?  If so, just use
>Element.setText("1");
>  
>
My problem is that I'm trying to send an arbitrary string across the wire 
with XML. Sometimes it's got weird stuff in it, but I'd like it to show up 
on the other side anyway, just as it is. I'm unclear how to serialize it. 
I thought that was similar to your previous question.
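
For what it's worth, here is a minimal sketch of one way this could be done. 
It is not something suggested in this thread and not part of JDOM: the idea 
is to check the string against the XML 1.0 Char production and, when it 
might contain illegal characters, ship it as Base64 text instead. The class 
and method names (WireText, isLegalXmlText, encode, decode) are hypothetical, 
and java.util.Base64 assumes a JDK that provides it (Java 8 or later).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not a JDOM class.
final class WireText {

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** True if every code point in s may appear in XML 1.0 character data. */
    static boolean isLegalXmlText(String s) {
        return s.codePoints().allMatch(WireText::isLegalXmlChar);
    }

    /** Encode the string's UTF-8 bytes as Base64, which uses only XML-legal characters. */
    static String encode(String raw) {
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    /** Reverse of encode(). */
    static String decode(String wire) {
        return new String(Base64.getDecoder().decode(wire), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = new String(new char[] { (char) 1 });   // the CTRL-A string
        System.out.println(isLegalXmlText(s));            // false
        String wire = encode(s);                          // safe to pass to setText()
        System.out.println(wire);                         // AQ==
        System.out.println(decode(wire).equals(s));       // true
    }
}
```

The receiver has to know that the element text is Base64, for example from 
an attribute both sides agree on. Note that a String containing unpaired 
surrogates does not survive the UTF-8 step unchanged.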


>David
>
>----- Original Message ----- 
>From: "John Caron" <caron at unidata.ucar.edu>
>To: <jdom-interest at jdom.org>
>Sent: Monday, December 06, 2004 4:37 PM
>Subject: Re: [jdom-interest] XML escaping and unescaping
>
>
>  
>
>>I'm unsure if I have a basic misunderstanding, but it's easy enough to
>>create a String in Java like
>>
>>    char[] cdata = new char[] { (char) 1 };
>>    String s = new String( cdata);
>>
>>     Element.setText(s);
>>
>>or
>>
>>    Element.setText( XMLOutputter.escapeElementEntities(s))
>>
>>that gets an exception like:
>>
>>      The data "" is not legal for a JDOM character content: 0x1 is not
>>a legal XML character
>>
>>I guess that means that the String has an illegal Unicode encoding or
>>something? Or maybe I don't know how to extract the Unicode
>>representation of the String?
>>
>>
>>Jason Hunter wrote:
>>
>>    
>>
>>>XMLOutputter has escapeElementEntities() and escapeAttributeEntities()
>>>that do what you want and have a pluggable EscapeStrategy to handle
>>>characters outside the selected output encoding.  We don't have code
>>>to do the reverse as we rely on XML parsers for that.
>>>
>>>-jh-
>>>
>>>d.wall at computer.org wrote:
>>>
>>>      
>>>
>>>>Does JDOM come with any utility routines that will take a String and
>>>>make it XML safe?  And also a routine that takes an XML safe encoding
>>>>and converts it back to a regular String?
>>>>
>>>>i.e.
>>>>
>>>>String -> XML Safe string -> String
>>>>
>>>>"This" -> "This"  -> "This"  (no change needed)
>>>>"4+3<4+4" -> "4+3&lt;4+4" -> "4+3<4+4"
>>>>
>>>>I only ask because I have some basic routines that do this, but they
>>>>only map the following:
>>>>
>>>> >   &gt;
>>>><   &lt;
>>>>&   &amp;
>>>>'     &apos;
>>>>"    &quot;
>>>>
>>>>It currently doesn't deal with escaped character codes like &#039;. It
>>>>seems that putting data into XML and getting it back from XML is so
>>>>common that there must be a general routine to do this rather than
>>>>having to rely on my own implementation.
>>>>
>>>>Thanks,
>>>>David
>>>>
>>>>_______________________________________________
>>>>To control your jdom-interest membership:
>>>>http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com
>>>>
>>>>        
>>>>

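As an illustration of the String -> XML-safe string -> String round trip 
David asks about in the quoted message, here is a rough sketch covering his 
five-entity mapping plus numeric character references such as &#039;. It is 
not JDOM code, and the names (XmlEscapeSketch, escape, unescape) are made up 
for the example. As Jason notes, the reverse direction is normally left to 
an XML parser, so treat unescape() as a convenience only; it is lenient and 
does not validate references strictly.

```java
// Hypothetical helper, not a JDOM class.
final class XmlEscapeSketch {

    /** Replace the five characters from the quoted mapping with entity references. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                case '\'': out.append("&apos;"); break;
                case '"':  out.append("&quot;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Expand the five named entities and &#NNN; / &#xHHH; references; leave anything else as is. */
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            int semi = (c == '&') ? s.indexOf(';', i + 1) : -1;
            String repl = (semi < 0) ? null : decodeReference(s.substring(i + 1, semi));
            if (repl == null) {          // not a reference we recognise: copy one char
                out.append(c);
                i++;
            } else {                     // recognised: emit it and skip past the ';'
                out.append(repl);
                i = semi + 1;
            }
        }
        return out.toString();
    }

    /** Returns the replacement text for a reference name, or null if unrecognised. */
    private static String decodeReference(String name) {
        switch (name) {
            case "lt":   return "<";
            case "gt":   return ">";
            case "amp":  return "&";
            case "apos": return "'";
            case "quot": return "\"";
            default:     break;
        }
        if (name.length() < 2 || name.charAt(0) != '#') {
            return null;
        }
        try {
            boolean hex = name.charAt(1) == 'x';
            int cp = hex ? Integer.parseInt(name.substring(2), 16)
                         : Integer.parseInt(name.substring(1), 10);
            return Character.isValidCodePoint(cp) ? new String(Character.toChars(cp)) : null;
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

With this sketch, escape("4+3<4+4") gives "4+3&lt;4+4" and unescape() of 
that gives back the original. Escaping does not help with characters like 
0x1: they are not legal in XML 1.0 even as character references.
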