[jdom-interest] Escaping entities problem.

Elliotte Rusty Harold elharo at metalab.unc.edu
Tue Aug 29 07:42:45 PDT 2000

At 3:03 PM +0100 8/29/00, Paul Lawton wrote:
>I have the following in my database
>    "testing-&#160;-para!"
>which should be a non-breaking space between the 2 words.
>However when I put this data into a JDOM document - the ampersand is escaped
>to produce the following element content:
>    "testing-&amp;#160;-para!"

When you pass a java.lang.String into a method like setText() or 
setContent(), JDOM assumes it's just that: a string, not a fragment 
of XML. For example, when you call

element.setContent("&#160;")

JDOM assumes you want to set the content to the string containing the 
six characters &, #, 1, 6, 0, and ;. It does not parse the string as 
XML first. XMLOutputter is not at fault here; it correctly escapes the 
literal ampersand on output. The problem occurred when you first moved 
the content into JDOM. What you needed to give it was 
"testing- -para!", where the middle space is the actual non-breaking 
space character (U+00A0, decimal 160).
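To make the difference concrete, here is a minimal sketch in plain Java 
that compares the two strings. The class and constant names are made up 
for illustration, and the JDOM call appears only in a comment so the 
example compiles without JDOM on the classpath.

```java
public class NbspDemo {
    // What came out of the database: the six literal characters & # 1 6 0 ;
    // sit between the two hyphens.
    static final String REFERENCE = "testing-&#160;-para!";

    // What JDOM should be given: one real non-breaking space between the
    // hyphens, written here with a Java Unicode escape.
    static final String DECODED = "testing-\u00A0-para!";

    public static void main(String[] args) {
        // The two strings differ in length because a six-character reference
        // stands in for a single character.
        System.out.println(REFERENCE.length() - DECODED.length());

        // With JDOM available, the decoded form is the one to pass in:
        // element.setText(DECODED);
    }
}
```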

If you have text data that you want to be interpreted as XML, you 
must first pass it through an XML parser before it goes into JDOM. 
This is what the SAXBuilder and DOMBuilder classes do. Off the top of 
my head I can't think of an easy way to parse fragments of an XML 
document rather than entire documents, but if entity references are 
the only issue for you, it shouldn't be too hard to pre-filter them.
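As a rough sketch of such a pre-filter, the following plain Java class 
replaces numeric character references (decimal like &#160; or 
hexadecimal like &#xA0;) with the characters they name before the 
string is handed to JDOM. The class name CharRefFilter and its decode() 
method are hypothetical, not part of JDOM. It assumes numeric 
references are the only markup in the data: it leaves named entities 
such as &amp; alone and does not check that the resulting character is 
legal in XML.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharRefFilter {
    // Group 1 captures decimal digits, group 2 captures hexadecimal digits.
    private static final Pattern CHAR_REF =
        Pattern.compile("&#(?:([0-9]+)|[xX]([0-9a-fA-F]+));");

    // Returns text with each numeric character reference replaced by the
    // character it names. Out-of-range references are left unchanged.
    public static String decode(String text) {
        Matcher m = CHAR_REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint;
            try {
                codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1))
                    : Integer.parseInt(m.group(2), 16);
            } catch (NumberFormatException e) {
                codePoint = -1; // too many digits to be a real code point
            }
            String replacement = Character.isValidCodePoint(codePoint)
                ? new String(Character.toChars(codePoint))
                : m.group();
            // quoteReplacement keeps $ and \ in the replacement literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The result of CharRefFilter.decode(databaseString) could then be passed 
to setText() in the usual way.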

| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
