[jdom-interest] removing pcdata from jdom-Elements

Per Norrman per.norrman at austers.se
Sat Mar 12 08:37:50 PST 2005


Document.getDescendants returns an iterator that uses
other iterators internally, so if you add or remove content
while iterating over it, I think you'll be getting
ConcurrentModificationExceptions with your approach.
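
The failure mode can be reproduced without JDOM. The sketch below is only a java.util analogy: it assumes JDOM's content lists are fail-fast in the same way as ArrayList, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

public class FailFastDemo {
    // Adds to the list while an iterator over it is live.
    // Returns true if that raises ConcurrentModificationException.
    static boolean modifyWhileIterating() {
        List<String> content = new ArrayList<>(Arrays.asList("someone", "said:"));
        try {
            for (String item : content) {
                content.add("<w>" + item + "</w>"); // structural change during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a detached snapshot and refills the emptied list,
    // which is the same idea as removing the content and rebuilding it.
    static List<String> rebuildFromSnapshot() {
        List<String> content = new ArrayList<>(Arrays.asList("someone", "said:"));
        List<String> snapshot = new ArrayList<>(content);
        content.clear();
        for (String item : snapshot) {
            content.add("<w>" + item + "</w>");
        }
        return content;
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating());
        System.out.println(rebuildFromSnapshot());
    }
}
```

The second method avoids the exception because nothing is iterating over the list that is being changed.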

An approach that works is to 'manually' traverse the tree
and 'rebuild' the content for each element. Something like:

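     import java.io.StringReader;
     import java.util.*;
     import org.jdom.*;
     import org.jdom.input.SAXBuilder;
     import org.jdom.output.XMLOutputter;

     // Turn one Text node into a list of <w> elements, one per token.
     private static List makeList(Text text) {
         List l = new ArrayList();
         StringTokenizer st = new StringTokenizer(text.getText());
         while (st.hasMoreTokens()) {
             Element w = new Element("w");
             w.setText(st.nextToken());
             l.add(w);
         }
         return l;
     }

     // Detach the element's content, then rebuild it recursively.
     private static void process(Element element) {
         List content = new ArrayList();
         for (Iterator i = element.removeContent().iterator(); i.hasNext();) {
             Object o = i.next();
             if (o instanceof Element) {
                 Element e = (Element) o;
                 process(e);          // rebuild the child first
                 content.add(e);
             } else if (o instanceof Text) {
                 content.addAll(makeList((Text) o));
             } else {
                 content.add(o);      // comments etc. are kept unchanged
             }
         }
         element.setContent(content);
     }

     public static void main(String[] args) throws Exception {
         String xml = "<s>someone said: <q>this sucks bigtime</q> and i agreed</s>";
         Document doc = new SAXBuilder().build(new StringReader(xml));
         process(doc.getRootElement());
         new XMLOutputter().output(doc, System.out);
     }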
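
The tokenizing step in makeList relies on StringTokenizer's default delimiters (whitespace only). A small JDOM-free sketch of what that means for the sample text follows; the class TokenSketch and the helper wrapTokens are hypothetical and build plain strings rather than Elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenSketch {
    // Hypothetical helper: wraps each whitespace-separated token in <w> tags.
    static List<String> wrapTokens(String text) {
        List<String> out = new ArrayList<>();
        // Default delimiters are space, tab, newline, carriage return and form feed.
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            out.add("<w>" + st.nextToken() + "</w>");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(wrapTokens("someone said: ")); // [<w>someone</w>, <w>said:</w>]
        System.out.println(wrapTokens(" and i agreed"));  // [<w>and</w>, <w>i</w>, <w>agreed</w>]
        System.out.println(wrapTokens("   "));            // []
    }
}
```

Two consequences: punctuation stays attached to its word ("said:"), and whitespace-only text produces no <w> elements at all, so it simply disappears from the rebuilt content.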

Kai Wörner wrote:
> Hi all,
> I want to do this to a XML-Document:
> (before:)
> <s>someone said: <q>this sucks bigtime</q> and i agreed</s>
> (after:)
> <s><w>someone</w><w>said:</w><q><w>this</w><w>sucks</w><w>bigtime</w></q><w>
> and</w><w>i</w><w>agreed</w></s>
> I thought I'll get all Elements via
> Iterator myI = doc.getDescendants(new ElementFilter());
> iterate through them, look for PCDATA via Element.getText, chop it with a
> StringTokenizer, add the Tokens as new <w>-Elements to the actual Element
> and get rid of the PCDATA itself. But how do I do this? Is there something
> like Element.removeContent(new onlyThePCDATAContentSparingElementsFilter())?
> Thanks
> Kai
