Then what you are referring to is &lt;!ENTITY bob "Bob Weaver"&gt;, and that part of the spec has nothing to do with the JDOM content in a Text or CDATA node.<div><br></div><div>Wilf</div><div><br><div><div><br>

<div class="gmail_quote">On Fri, Sep 7, 2012 at 6:05 AM, Michael Kay <span dir="ltr">&lt;<a href="mailto:mike@saxonica.com" target="_blank">mike@saxonica.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000">

    No, that's all wrong. The contents of an unparsed entity are always
    an external resource; they are never part of a text or attribute
    node. Parsed entities do become part of the content, but they must
    always use the XML character set.<br>
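    <div>As an illustration of that distinction, here is a minimal Java sketch using the SAX parser bundled with the JDK. The sample document, the entity name <code>logo</code>, the notation <code>gif</code> and the class names are all made up for the example. It is only meant to show that a parsed entity's replacement text arrives as ordinary character content, while an unparsed entity is merely announced by name and notation and is never read.</div>

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of JDOM or of the original thread.
class EntityDemo {
    // One parsed entity (bob) and one unparsed entity (logo, notation gif).
    static final String XML =
        "<!DOCTYPE doc [\n"
        + "  <!ELEMENT doc (#PCDATA)>\n"
        + "  <!ATTLIST doc pic ENTITY #IMPLIED>\n"
        + "  <!NOTATION gif SYSTEM \"image/gif\">\n"
        + "  <!ENTITY bob \"Bob Weaver\">\n"
        + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
        + "]>\n"
        + "<doc pic=\"logo\">Hello &bob;</doc>";

    static class Collector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        String unparsedName;
        String unparsedNotation;

        // Character content, including the expanded replacement text of &bob;.
        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        // The parser only reports the identifiers of an unparsed entity;
        // it does not fetch or inspect logo.gif.
        @Override
        public void unparsedEntityDecl(String name, String publicId,
                                       String systemId, String notationName) {
            unparsedName = name;
            unparsedNotation = notationName;
        }
    }

    static Collector parse() {
        try {
            Collector handler = new Collector();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(XML)), handler);
            return handler;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Collector c = parse();
        System.out.println("text     = " + c.text);
        System.out.println("unparsed = " + c.unparsedName + " (" + c.unparsedNotation + ")");
    }
}
```

    <div>Here the text of <code>doc</code> contains the expansion of the parsed entity, whereas <code>logo</code> only surfaces through the DTD callback.</div>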

    <br>

    Michael Kay<br>

    Saxonica<br>

    <br>

    <div>On 07/09/2012 13:10, Canadian Wilf

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div>According to the xml 1.1 spec:</div>

      <div><br>

      </div>

      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><a name="139a0ed3ca62ae5a_sec-physical-struct" style="font-family:arial,helvetica,sans-serif">4 Physical

          Structures ...</a><span style="font-family:arial,helvetica,sans-serif"><br>

        </span><span style="font-family:arial,helvetica,sans-serif">[</span><a name="139a0ed3ca62ae5a_dt-unparsed" title="Unparsed Entity" style="font-family:arial,helvetica,sans-serif">Definition</a><span style="font-family:arial,helvetica,sans-serif">: An </span><b style="font-family:arial,helvetica,sans-serif">unparsed entity</b><span style="font-family:arial,helvetica,sans-serif"> is a resource

          whose contents may or may not be </span><a title="Text" href="http://www.w3.org/TR/xml11/#dt-text" style="font-family:arial,helvetica,sans-serif;color:rgb(102,0,153)" target="_blank">text</a><span style="font-family:arial,helvetica,sans-serif">, and if text,

          may be other than XML. Each unparsed entity has an associated </span><a title="Notation" href="http://www.w3.org/TR/xml11/#dt-notation" style="font-family:arial,helvetica,sans-serif;color:rgb(102,0,153)" target="_blank">notation</a><span style="font-family:arial,helvetica,sans-serif">, identified by

          name. Beyond a requirement that an XML processor make the

          identifiers for the entity and notation available to the

          application, XML places no constraints on the contents of

          unparsed entities.]</span></blockquote>

      <div> </div>

      <div><br>

      </div>

      <div>AND </div>

      <div><br>

      </div>

      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span style="font-family:arial,helvetica,sans-serif">Entities may be

          either parsed or unparsed. [</span><a name="139a0ed3ca62ae5a_dt-parsedent" title="Text Entity" style="font-family:arial,helvetica,sans-serif">Definition</a><span style="font-family:arial,helvetica,sans-serif">: The contents

          of a </span><b style="font-family:arial,helvetica,sans-serif">parsed

          entity</b><span style="font-family:arial,helvetica,sans-serif"> are

          referred to as its </span><a title="Replacement Text" href="http://www.w3.org/TR/xml11/#dt-repltext" style="font-family:arial,helvetica,sans-serif;color:rgb(102,0,153)" target="_blank">replacement

          text</a><span style="font-family:arial,helvetica,sans-serif">;

          this </span><a title="Text" href="http://www.w3.org/TR/xml11/#dt-text" style="font-family:arial,helvetica,sans-serif;color:rgb(102,0,153)" target="_blank">text</a><span style="font-family:arial,helvetica,sans-serif"> is considered

          an integral part of the document.]</span></blockquote>

      <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><font face="arial, helvetica, sans-serif">[<a name="139a0ed3ca62ae5a_dt-unparsed" title="Unparsed Entity">Definition</a>:

          An <b>unparsed entity</b> is a resource whose contents may or

          may not be <a title="Text" href="http://www.w3.org/TR/xml11/#dt-text" style="color:rgb(102,0,153);background-color:transparent;background-repeat:initial initial" target="_blank">text</a>, and if text, may be other than XML. Each

          unparsed entity has an associated <a title="Notation" href="http://www.w3.org/TR/xml11/#dt-notation" style="color:rgb(102,0,153);background-color:transparent;background-repeat:initial initial" target="_blank">notation</a>, identified by name. Beyond a

          requirement that an XML processor make the identifiers for the

          entity and notation available to the application, XML places

          no constraints on the contents of unparsed entities.]<br>

          Parsed entities are invoked by name using entity references;

          unparsed entities by name, given in the value of <b>ENTITY</b> or <b>ENTITIES</b> attributes.</font></blockquote>

      <font face="arial, helvetica, sans-serif">

        <div>

          <font face="arial, helvetica, sans-serif"><br>

          </font></div>

        <div><font face="arial, helvetica, sans-serif"><br>

          </font></div>

        In the current JDOM version, the Element method setText(String) and
        also addContent(CDATA) refuse text that contains illegal
        characters. JDOM is treating the supplied data as 'parsed' when,
        by the spec, it should be treating it as free content.</font>

      <div>

        <font face="arial, helvetica, sans-serif"><br>

        </font></div>

      <div><font face="arial, helvetica, sans-serif">I understand:</font></div>

      <div><font face="arial, helvetica, sans-serif"><br>

        </font></div>

      <div>

        <div class="gmail_quote">

          <font face="arial, helvetica, sans-serif">1) The XML 1.1 spec
            defines the contents of a parsed entity as its 'replacement text'.</font></div>

        <div class="gmail_quote"><font face="arial, helvetica,

            sans-serif"><br>

          </font></div>

        <div class="gmail_quote">

          <font face="arial, helvetica, sans-serif">2) 'R</font>eplacement
          text' would refer to the actual textual makeup of a serialized
          Element, not the data an Element holds in a Text content
          element.</div>

        <div class="gmail_quote">

          <br>

        </div>

        <div class="gmail_quote"><br>

        </div>

        <div class="gmail_quote">Then, if the above is true, the current

          implementation is actually wrong to verify data.</div>

        <div class="gmail_quote"><br>

        </div>

        <div class="gmail_quote">

          I propose that JDOM stop verifying data set as Element text
          and CDATA, and leave it to Xerces (or whatever parser is in use) to make
          sure the document is proper XML 1.1.</div>

        <div class="gmail_quote"><br>

        </div>

        <div class="gmail_quote">Am I understanding everything

          correctly?</div>

        <div class="gmail_quote"><br>

        </div>

        <div class="gmail_quote">Thoughts?</div>

        <div class="gmail_quote"><br>

        </div>

        <div class="gmail_quote">---------- Forwarded message ----------</div>

        <div class="gmail_quote">From: <b class="gmail_sendername">Canadian

            Wilf</b> <span dir="ltr">&lt;<a href="mailto:canwilf@gmail.com" target="_blank">canwilf@gmail.com</a>&gt;</span><br>

          Date: Thu, Sep 6, 2012 at 9:52 PM<br>

          Subject: XML 1.1 -- Please stab me with a dull knife and

          trample my dead body<br>

          To: <a href="mailto:jdom-interest@jdom.org" target="_blank">jdom-interest@jdom.org</a><br>

          <br>

          <br>

          <div>Hi All,</div>

          <div>

            <br>

          </div>

          <div>I just learned that in order to safely use JDOM2, I will
            need to sanitize every string I pass to Element.setText(String)
            so that the data does not contain characters that are verboten
            under the XML 1.1 spec.</div>

          <div><br>

          </div>

          <div>I have an ASCII processor and it needs to be able to use
            XML as a document format. Unfortunately, not all ASCII is
            allowed in an Element text.</div>
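          <div>For reference, a minimal Java sketch of what such sanitizing could look like. It uses the <code>Char</code> production of XML 1.0 (tab, line feed, carriage return, U+0020 to U+D7FF, U+E000 to U+FFFD, U+10000 to U+10FFFF). Whether JDOM2 applies exactly this rule is an assumption here, and the class and method names are hypothetical. Replacing with U+FFFD rather than dropping the character is likewise just one possible choice.</div>

```java
// Hypothetical helper; not part of JDOM.
final class XmlText {
    private XmlText() {}

    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Copies the input, replacing every code point outside Char with U+FFFD. */
    static String sanitize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out.appendCodePoint(isXml10Char(cp) ? cp : 0xFFFD);
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    /** Counts the 7-bit ASCII code points that the Char production rejects. */
    static int countIllegalAscii() {
        int n = 0;
        for (int cp = 0; cp <= 0x7F; cp++) {
            if (!isXml10Char(cp)) {
                n++;
            }
        }
        return n;
    }
}
```

          <div>Under that rule the rejected ASCII characters are the C0 control characters other than tab, line feed and carriage return. A caller would then write something like <code>element.setText(XmlText.sanitize(raw))</code>, where <code>raw</code> is the unchecked input.</div>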

          <div><br>

          </div>

          <div>Stab me with a dull knife and trample my dead body. But

            ..... please please please don't make me sanitize all my

            data before putting it into XML Elements.</div>

          <div><br>

          </div>

          <div>1) It makes my programming task much more cumbersome
            because I must make sure never to store any of the newly
            verboten and doomed ASCII/UTF-8 characters as XML text.</div>

          <br>

          <div>2) No one uses XML 1.1, do they?</div>

          <div><br>

          </div>

          <div>3) It slows down the parsing (a very small amount) with

            all the element text checking.</div>

          <div><br>

          </div>

          <div>Now that JDOM2 is XML 1.1 compatible, is there any
            turning back? Can this be undone? </div>

          <div><br>

          </div>

          <div>Does everyone understand that their software will bust if

            data provided as text is not adhering to the new standard?</div>

          <div><br>

          </div>

          <div>What about you? How do you deal with it when using the

            libraries?</div>

          <span><font color="#888888">

              <div><br>

              </div>

              <div>Wilf</div>

            </font></span></div>

        <br>

      </div>

      <br>


      <br>

      <pre>_______________________________________________

To control your jdom-interest membership:

<a href="http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com" target="_blank">http://www.jdom.org/mailman/options/jdom-interest/youraddr@yourhost.com</a></pre>

    </blockquote>

    <br>

  </div>

</blockquote></div><br></div></div></div>