[jdom-interest] XML Element name Verifier is overly strict and
doesn't match current XML 1.0 REC
Leigh.Klotz at xerox.com
Thu Mar 26 10:30:15 PDT 2009
Thank you both for researching the issue and for getting the list back
From: Jason Hunter [mailto:jhunter at servlets.com]
Sent: Saturday, March 21, 2009 7:55 PM
To: jdom interest
Cc: Klotz, Leigh
Subject: Re: [jdom-interest] XML Element name Verifier is overly strict
and doesn't match current XML 1.0 REC
Note that this had a pretty good debate on xml-dev (while our list was
General consensus seems to be the current behavior is the lesser of two
On Mar 19, 2009, at 2:50 PM, Klotz, Leigh wrote:
> JDOM 1.1 won't create elements whose names contain characters in the
> following ranges:
> Unicode 0xFF41-0xFF5A (FULLWIDTH LATIN SMALL LETTER A to FULLWIDTH
> LATIN SMALL LETTER Z)
> Unicode 0xFF21-0xFF3A (FULLWIDTH LATIN CAPITAL LETTER A to FULLWIDTH
> LATIN CAPITAL LETTER Z)
> The JDOM 1.1 source for org.jdom.Verifier.isXMLLetter cites production
> 84 of the XML 1.0 Recommendation for its table of allowed characters.
> However, according to http://www.w3.org/TR/REC-xml/ the whole of
> Appendix B (which contains Production 84) is obsolete and is not used
> within the recommendation. The XML Rec instead uses production [4]
> for NameStartChar and [4a] for NameChar.
> The productions at [4] and [4a] are considerably smaller than those of
> Appendix B, and are more inclusive, providing for greater utility in
> I18N applications of XML.
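
For anyone who wants to try the newer rules, here is a minimal sketch of a
name check built from the NameStartChar and NameChar ranges as I read them
in XML 1.0 Fifth Edition. It is not JDOM code: the class and method names
(NameProductionSketch, isNameStartChar, isNameChar, isName) are made up for
illustration, and the range tables should be checked against the
Recommendation before anyone relies on them.

```java
// Hypothetical sketch, not part of JDOM. Range tables transcribed from the
// NameStartChar and NameChar productions of XML 1.0 Fifth Edition.
final class NameProductionSketch {
    private NameProductionSketch() {}

    // Inclusive code point ranges allowed as the first character of a name.
    private static final int[][] NAME_START_RANGES = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF}, {0x370, 0x37D},
        {0x37F, 0x1FFF}, {0x200C, 0x200D}, {0x2070, 0x218F},
        {0x2C00, 0x2FEF}, {0x3001, 0xD7FF}, {0xF900, 0xFDCF},
        {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // Additional ranges allowed only after the first character.
    private static final int[][] NAME_EXTRA_RANGES = {
        {'-', '-'}, {'.', '.'}, {'0', '9'}, {0xB7, 0xB7},
        {0x300, 0x36F}, {0x203F, 0x2040}
    };

    private static boolean inRanges(int cp, int[][] ranges) {
        for (int[] r : ranges) {
            if (cp >= r[0] && cp <= r[1]) {
                return true;
            }
        }
        return false;
    }

    static boolean isNameStartChar(int cp) {
        return inRanges(cp, NAME_START_RANGES);
    }

    static boolean isNameChar(int cp) {
        return isNameStartChar(cp) || inRanges(cp, NAME_EXTRA_RANGES);
    }

    // Walks the string by code point so supplementary characters are
    // tested as one unit rather than as two surrogates.
    static boolean isName(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isNameChar(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

The fullwidth letters fall inside the 0xFDF0-0xFFFD range, so a call such
as NameProductionSketch.isName("\uFF41\uFF42") accepts them under this
sketch.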
> Furthermore, according to http://www.w3.org/TR/REC-xml/ Appendix J
> (Non-Normative), the characters I mention above are not only allowed,
> but encouraged for use in XML Names, because the Unicode ID_Start and
> ID_Continue properties of these code points are True.
> The XML REC says:
> 1. The first character of any name should have a Unicode property
> of ID_Start, or else be '_' #x5F.
> 2. Characters other than the first should have a Unicode property
> of ID_Continue, or ...
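
The two suggestions above can be roughly approximated with the JDK alone.
This is only a sketch: Character.isUnicodeIdentifierStart and
Character.isUnicodeIdentifierPart are real java.lang.Character methods, but
they are the JDK's approximation of ID_Start and ID_Continue (the "part"
test also admits ignorable characters), and their exact results depend on
the Unicode version the JDK ships. The class and method names
(IdPropertySketch, followsSuggestion) are hypothetical, and the "or ..."
alternative in item 2 is deliberately left out.

```java
// Hypothetical sketch, not part of JDOM. Uses the JDK's identifier tests
// as a stand-in for the Unicode ID_Start / ID_Continue properties.
final class IdPropertySketch {
    private IdPropertySketch() {}

    // True when the first code point is ID_Start-like or '_' and every
    // later code point is ID_Continue-like. The extra characters allowed
    // by the elided "or ..." clause of item 2 are NOT handled here.
    static boolean followsSuggestion(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (first != '_' && !Character.isUnicodeIdentifierStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!Character.isUnicodeIdentifierPart(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```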
> You can see that ID_Start and ID_Continue are True on the individual
> pages for the small letters here:
> I recommend that org.jdom.Verifier.isXMLLetter be updated to use
> productions [4], [4a], and [5] of XML 1.0 Fifth Edition.
> It's quite likely that some of the other character class verifiers
> need updating as well, but I didn't examine them.