[jdom-interest] Merging text nodes

Elliotte Rusty Harold elharo at metalab.unc.edu
Sat Feb 16 20:54:55 PST 2002

At 10:45 PM -0600 2/16/02, Bradley S. Huffman wrote:

>In Element.getText() Text and CDATA are concatenated into a single String.
>What about EntityRef?  For example with
>     <title>Cats &amp; Dogs</title>
>the element.getText() yields "Cats  Dogs", not "Cats &amp; Dogs" which I
>would find more useful.

The problem is that the text isn't "Cats &amp; Dogs". It's "Cats & 
Dogs" and we have no way to get that. I agree, getText() does fail 
when faced with <title>Cats &amp; Dogs</title>, but honestly I think 
getText() fails when faced with <title>Cats <em>and</em> Dogs</title> 
too. I don't like the current incarnation of this method at all, but 
I think EntityRef is the least of its troubles.
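
To make the two cases concrete, here is a rough sketch. It uses the W3C DOM that ships with the JDK rather than JDOM, so that it runs with no extra jars. The class TextDemo and the helpers parseRoot and directText are names I made up for illustration; directText only imitates the "concatenate the Text and CDATA children" behavior described above and is not JDOM's actual implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

class TextDemo {

    // Hypothetical helper: parse a small XML string and return its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Hypothetical helper: concatenate only the direct Text and CDATA
    // children of an element, skipping child elements entirely.
    static String directText(Element element) {
        StringBuilder sb = new StringBuilder();
        for (Node child = element.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child instanceof Text) { // CDATASection is a subtype of Text
                sb.append(((Text) child).getData());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The parser resolves the predefined entity, so the text holds a real '&'.
        Element amp = parseRoot("<title>Cats &amp; Dogs</title>");
        System.out.println(directText(amp));          // Cats & Dogs

        // Mixed content: the direct text children skip the <em> element.
        Element mixed = parseRoot("<title>Cats <em>and</em> Dogs</title>");
        System.out.println(directText(mixed));        // Cats  Dogs
        System.out.println(mixed.getTextContent());   // Cats and Dogs
    }
}
```

The second case is the one that bothers me more: any method that only looks at direct text children silently drops "and", with or without entity references in the picture.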

The assumption is that developers won't be using EntityRef for simple 
predefined things like &amp; but for more complex things like &copy;, 
&nbsp;, or &signature-file;. Currently the EntityRef class does not 
provide any means for us to store or retrieve the replacement text of 
an entity, which in the general case may contain large quantities of 
both text and markup.

The way DOM handles this is to allow the EntityRef to have children. 
I'm not sure we want to do that, but maybe we do in the future. It 
needs more thought and discussion.

>Would a method like Element.getText(Map) be useful for concatenating Text,
>CDATA, and EntityRef into a String?

At first hearing, it sounds too complicated and not really extensible 
in the long term.
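
For the sake of discussion, this is roughly what I understand the proposal to mean. It is only a sketch under my own assumptions: EntityTextSketch, TextNode, and EntityRefNode are hypothetical stand-ins and not JDOM classes, and the Map is assumed to go from entity name to replacement string.

```java
import java.util.List;
import java.util.Map;

class EntityTextSketch {

    // Hypothetical stand-in for a Text or CDATA node.
    static final class TextNode {
        final String value;
        TextNode(String value) { this.value = value; }
    }

    // Hypothetical stand-in for an entity reference such as &amp;.
    static final class EntityRefNode {
        final String name;
        EntityRefNode(String name) { this.name = name; }
    }

    // Concatenate text nodes, replacing each entity reference with the
    // string the caller supplied for its name. References with no entry
    // in the map are written back out in &name; form.
    static String getText(List<?> content, Map<String, String> replacements) {
        StringBuilder sb = new StringBuilder();
        for (Object node : content) {
            if (node instanceof TextNode) {
                sb.append(((TextNode) node).value);
            } else if (node instanceof EntityRefNode) {
                String name = ((EntityRefNode) node).name;
                String replacement = replacements.get(name);
                sb.append(replacement != null ? replacement : "&" + name + ";");
            }
            // Any other kind of node (for example a child element) is ignored.
        }
        return sb.toString();
    }
}
```

Note that the map can only hand back a flat string. An entity whose replacement text contains markup has nowhere to go in this scheme, which is a large part of why it does not look extensible to me.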

| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
