[jdom-interest] UNDECLARED_ATTRIBUTE

Elliotte Rusty Harold elharo at metalab.unc.edu
Fri Apr 19 04:50:08 PDT 2002


At 10:22 AM +0200 4/19/02, Laurent Bihanic wrote:
>Hi, Elliotte,
>
>I do not agree with you: the sentence you are referring to is part of 
>the "Attribute-Value Normalization" section.  Thus (IMHO) this 
>sentence only applies to the way parsers should normalize the value 
>of an undeclared attribute, not the way they report the attribute 
>type to the application.
>

After looking carefully at the XML spec, I now agree with you. There 
is such a thing as an attribute that has no declared type, and it 
thus makes sense for JDOM to have an UNDECLARED_ATTRIBUTE pseudo-type.

However, I'm still bothered that SAX doesn't agree. See 
http://www.saxproject.org/apidoc/org/xml/sax/Attributes.html#getType(int) 
which states:

The attribute type is one of the strings "CDATA", "ID", "IDREF", 
"IDREFS", "NMTOKEN", "NMTOKENS", "ENTITY", "ENTITIES", or "NOTATION" 
(always in upper case).

If the parser has not read a declaration for the attribute, or if the 
parser does not report attribute types, then it must return the value 
"CDATA" as stated in the XML 1.0 Recommendation (clause 3.3.3, 
"Attribute-Value Normalization").

Thus a SAX parser will never report an undeclared attribute to JDOM.
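
As a quick check of that behavior, here is a minimal sketch that 
parses a document with no DTD and returns whatever getType() reports 
for the first attribute. The class and method names 
(UndeclaredTypeDemo, typeOfFirstAttribute) are made up for this 
example, and it assumes the JAXP parser bundled with the JDK. A 
parser that follows the SAX documentation quoted above should report 
"CDATA" here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of JDOM or SAX.
class UndeclaredTypeDemo {

    // Parses the XML text and returns the SAX type string of the first
    // attribute on the first element that has any attributes, or null
    // if no element carries an attribute.
    static String typeOfFirstAttribute(String xml) throws Exception {
        final String[] type = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (type[0] == null && atts.getLength() > 0) {
                    type[0] = atts.getType(0);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return type[0];
    }

    public static void main(String[] args) throws Exception {
        // No DTD, so the attribute "a" has no declaration at all.
        System.out.println(typeOfFirstAttribute("<root a=\"1\"/>"));
    }
}
```

Since the attribute has no declaration, the handler has no way to 
tell "declared as CDATA" apart from "not declared" using getType() 
alone.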

>Also, at the time I added the attribute type support, I did not want 
>to use CDATA as default because of the problems I encountered with 
>enumerated types. Have a look at SAXHandler's getAttributeType: 
>Without the current hack, using CDATA as the default may lead to 
>some ENUMERATED attributes being reported as CDATA if someone uses 
>a parser as weird as Xerces!!!
>

That's certainly messy. The real issue here seems to be that not all 
parsers comply with the SAX2 specification with respect to attribute 
types. I'm not sure that's relevant here, however. Consider this code 
from the private getAttributeType() method:

     private int getAttributeType(String typeName) {
         Integer type = (Integer)(attrNameToTypeMap.get(typeName));
         if (type == null) {
             if (typeName != null && typeName.length() > 0 &&
                 typeName.charAt(0) == '(') {
                 // Xerces 1.4.X reports attributes of enumerated type with
                 // a type string equal to the enumeration definition, i.e.
                 // starting with a parenthesis.
                 return Attribute.ENUMERATED_ATTRIBUTE;
             }
             else {
                 return Attribute.UNDECLARED_ATTRIBUTE;
             }
         } else {
             return type.intValue();
         }
     }


You're not actually returning UNDECLARED_ATTRIBUTE for an undeclared 
attribute, because a conforming SAX parser reports "CDATA" for such 
an attribute rather than null or some other marker. What this code 
does is return UNDECLARED_ATTRIBUTE whenever an unknown, non-standard 
attribute type string is encountered. A better name here would be 
"NONSTANDARD_ATTRIBUTE" or something like that.
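
To make the distinction concrete, here is a rough, self-contained 
sketch of the same mapping with the fallback renamed. Everything in 
it is an assumption for illustration: the class name 
AttributeTypeMapper, the constant values, and the 
NONSTANDARD_ATTRIBUTE name itself, which is only my suggestion and 
does not exist in JDOM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup in SAXHandler; the constants
// below are illustrative and are not JDOM's Attribute constants.
class AttributeTypeMapper {
    static final int NONSTANDARD_ATTRIBUTE = 0; // suggested name only
    static final int CDATA_ATTRIBUTE = 1;
    static final int ID_ATTRIBUTE = 2;
    static final int IDREF_ATTRIBUTE = 3;
    static final int IDREFS_ATTRIBUTE = 4;
    static final int ENTITY_ATTRIBUTE = 5;
    static final int ENTITIES_ATTRIBUTE = 6;
    static final int NMTOKEN_ATTRIBUTE = 7;
    static final int NMTOKENS_ATTRIBUTE = 8;
    static final int NOTATION_ATTRIBUTE = 9;
    static final int ENUMERATED_ATTRIBUTE = 10;

    private static final Map<String, Integer> TYPES = new HashMap<>();
    static {
        TYPES.put("CDATA", CDATA_ATTRIBUTE);
        TYPES.put("ID", ID_ATTRIBUTE);
        TYPES.put("IDREF", IDREF_ATTRIBUTE);
        TYPES.put("IDREFS", IDREFS_ATTRIBUTE);
        TYPES.put("ENTITY", ENTITY_ATTRIBUTE);
        TYPES.put("ENTITIES", ENTITIES_ATTRIBUTE);
        TYPES.put("NMTOKEN", NMTOKEN_ATTRIBUTE);
        TYPES.put("NMTOKENS", NMTOKENS_ATTRIBUTE);
        TYPES.put("NOTATION", NOTATION_ATTRIBUTE);
    }

    // Maps a SAX type string to one of the constants above.
    static int getAttributeType(String typeName) {
        Integer type = TYPES.get(typeName);
        if (type != null) {
            return type;
        }
        // Some parsers report an enumerated type as its definition,
        // e.g. "(yes|no)", instead of one of the standard names.
        if (typeName != null && typeName.startsWith("(")) {
            return ENUMERATED_ATTRIBUTE;
        }
        // Anything else is a type string SAX does not define.
        return NONSTANDARD_ATTRIBUTE;
    }
}
```

With this sketch, getAttributeType("CDATA") yields CDATA_ATTRIBUTE 
both for attributes declared as CDATA and for attributes with no 
declaration, since SAX reports the same string for both. Only 
something like getAttributeType("FOO") reaches the fallback.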

-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+


