[jdom-interest] exception with xml:lang

Stacie Clark zeclarks at sprynet.com
Thu Aug 17 15:50:29 PDT 2000


I took Jason up on his challenge to find out why the special cases for xml:lang and
xml:space were not being used.  I sent my findings from a non-member address, so
that posting is sitting in a moderator's inbox somewhere. ;-) I have added that post to
the end of this message for continuity's sake.


I have built something that works, although I'm still getting used to the code and so
am not sure whether it has broken anything.
I modified Namespace and SAXHandler to get a build that didn't barf on xml:lang =
"en".  I also modified XMLOutputter to print correctly.  One of the modifications is
that instead of SAXHandler calling the Attribute constructor:
 public Attribute(String name, String prefix, String uri, String value) as
element.addAttribute(
                        new Attribute(name,
                                      prefix,
                                      getNamespaceURI(prefix),
                                      atts.getValue(i)));
I changed it to
element.addAttribute(
                        new Attribute(name,
                                      atts.getValue(i),
                                      getNamespace(prefix)));
and used the three-argument constructor. I read in the archives that the four-argument
constructor was slated for removal anyway, so this worked out well. To do this I needed
to replace
String SAXHandler.getNamespaceURI(prefix)
with
Namespace SAXHandler.getNamespace(prefix):

private Namespace getNamespace(String prefix) {
        // Cycle backwards and find URI
        for (int i=namespaces.size() - 1; i >= 0; i--) {
            Namespace ns = (Namespace)namespaces.get(i);
            if (ns.getPrefix().equals(prefix)) {
                return ns;
            }
        }
        // could be an xml namespace.  if so add it and return
        if (Namespace.isXmlNamespace(prefix, "")) {
            Namespace ns = Namespace.getNamespace(Namespace.XML_PREFIX,
                                                  Namespace.XML_NAMESPACE_URI);
            namespaces.add(ns);
            return ns;
        }
        // oops, no URI at all
        return Namespace.getNamespace("");
    }

(I also added Namespace.isXmlNamespace(prefix, uri), shown further down.) I didn't want
to add the xml namespace to every element's namespace list; this way, only the elements
that use it get it in their lists.
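
The lookup described above (search the in-scope declarations from innermost to outermost, and bind xml only when it is first asked for) can be sketched in isolation. This is a minimal illustration, not JDOM code: the class PrefixScopeSketch and its methods are hypothetical, it stores plain strings instead of Namespace objects, and it matches the xml prefix exactly rather than case-insensitively.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the handler's namespace list; not JDOM code. */
final class PrefixScopeSketch {
    static final String XML_PREFIX = "xml";
    static final String XML_URI = "http://www.w3.org/XML/1998/namespace";

    // Each entry is {prefix, uri}; later entries shadow earlier ones.
    private final List<String[]> declared = new ArrayList<>();

    void declare(String prefix, String uri) {
        declared.add(new String[] {prefix, uri});
    }

    int size() {
        return declared.size();
    }

    /** Returns the URI bound to the prefix, or null if nothing is in scope. */
    String resolve(String prefix) {
        // Walk from the innermost declaration outwards.
        for (int i = declared.size() - 1; i >= 0; i--) {
            if (declared.get(i)[0].equals(prefix)) {
                return declared.get(i)[1];
            }
        }
        // "xml" is bound implicitly; register it on first use only.
        if (XML_PREFIX.equals(prefix)) {
            declare(XML_PREFIX, XML_URI);
            return XML_URI;
        }
        return null;
    }

    public static void main(String[] args) {
        PrefixScopeSketch scope = new PrefixScopeSketch();
        scope.declare("gpsi", "urn:example:outer");
        scope.declare("gpsi", "urn:example:inner");
        System.out.println(scope.resolve("gpsi")); // innermost binding wins
        System.out.println(scope.resolve("xml"));  // added lazily
        System.out.println(scope.size());          // 3 after the lazy add
    }
}
```

Returning null for an unknown prefix is a choice made for the sketch only; what the real handler should do in that case is the open question raised below.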

I have a question about the behavior when no URI can be found. For now the function
calls Namespace.getNamespace("") because I wasn't sure what it should do. That does
cause an exception to be thrown, which seems appropriate.

Here are my modifications to the Namespace class:

In the declarations area I added constants and a default Namespace instance for xml:
public final class Namespace {

    /** Factory list of namespaces */
    private static HashMap namespaces;

    /** Factory list of mappings */
    private static HashMap mappings;

    /** Define a <code>Namespace</code> for when <i>not</i> in a namespace */
    public static final Namespace NO_NAMESPACE = new Namespace("", "");

    /** Define a <code>Namespace</code> for the xml namespace */
    public static final String XML_PREFIX = "xml";
    public static final String XML_NAMESPACE_URI =
        "http://www.w3.org/XML/1998/namespace";
    public static final Namespace XML_NAMESPACE =
        new Namespace(XML_PREFIX, XML_NAMESPACE_URI);

    /** The prefix mapped to this namespace */
    private String prefix;

    /** The URI for this namespace */
    private String uri;

    /**
     * <p>
     *  This static initializer acts as a factory constructor.
     *  It sets up storage and required initial values.
     * </p>
     *
     * XXX: Maybe this should be a singleton? The code would be cleaner (brett)
     */
    static {
        namespaces = new HashMap();
        mappings = new HashMap();

        // Add the "empty" namespace
        namespaces.put("", NO_NAMESPACE);
        mappings.put("", "");

        // Add the "xml" namespace
        namespaces.put(XML_NAMESPACE_URI, XML_NAMESPACE);
        mappings.put(XML_PREFIX, XML_NAMESPACE_URI);

    }

I added a clause to getNamespace(String, String) to get around the name verification.
The problem with the xml prefix is that it is only valid together with its own URI
(given either as an empty string or as the correct URI):

    public static Namespace getNamespace(String prefix, String uri) {
        // Ensure proper naming
        String reason;
        if(isXmlNamespace(prefix, uri))
        {
            return (Namespace)namespaces.get(XML_NAMESPACE_URI);
        }
        if ((reason = Verifier.checkNamespacePrefix(prefix)) != null) {
            throw new IllegalNameException(prefix, "Namespace prefix", reason);
        }
        if ((reason = Verifier.checkNamespaceURI(uri)) != null) {
            throw new IllegalNameException(uri, "Namespace URI", reason);
        }

        // Housekeeping
        if ((prefix == null) || (prefix.trim().equals(""))) {
            prefix = "";
        }
        if ((uri == null) || (uri.trim().equals(""))) {
            uri = "";
        }

        // Unless the "empty" Namespace (no prefix and no URI), require a URI
        if ((!prefix.equals("")) && (uri.equals(""))) {
            throw new IllegalNameException("", "namespace",
                "Namespace URIs must be non-null and non-empty Strings.");
        }

        // Return existing namespace if found
        if (namespaces.containsKey(uri)) {
            return (Namespace)namespaces.get(uri);
        }

        // Ensure prefix uniqueness in non-default namespaces
        if (!prefix.equals("")) {
            int i = 0;
            String newPrefix = prefix;

            while (mappings.containsKey(newPrefix)) {
                newPrefix = newPrefix + i++;
            }
            prefix = newPrefix;

            // We really don't care to store all the default namespaces, so
            //   storing mappings here is OK
            mappings.put(prefix, uri);
        }

        // Finally, store and return
        Namespace ns = new Namespace(prefix, uri);
        namespaces.put(uri, ns);
        return ns;
    }
I also added a function (I think this logic is reasonable):
    public static boolean isXmlNamespace(String prefix, String uri)
    {
       // Constant-first comparisons: getNamespace() calls this before its
       // null housekeeping, so prefix or uri may still be null here
       if(XML_PREFIX.equalsIgnoreCase(prefix) && "".equals(uri))
       {
          return true;
       }
       if(XML_NAMESPACE_URI.equals(uri))
       {
          return true;
       }

       return false;
    }
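
To make the effect of the new clause concrete, here is a reduced sketch of a factory that consults the xml check before verifying the prefix. It is an illustration under assumptions, not the JDOM implementation: XmlBypassSketch, Binding and get are hypothetical names, the prefix rule is cut down to the single "starts with xml" test, and nulls are not handled beyond the xml check.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, cut-down namespace factory; not JDOM code. */
final class XmlBypassSketch {
    static final class Binding {
        final String prefix;
        final String uri;
        Binding(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }
    }

    static final String XML_PREFIX = "xml";
    static final String XML_URI = "http://www.w3.org/XML/1998/namespace";
    static final Binding XML = new Binding(XML_PREFIX, XML_URI);

    // One shared Binding per URI, as in the factory shown above.
    private static final Map<String, Binding> byUri = new HashMap<>();

    static boolean isXmlNamespace(String prefix, String uri) {
        return (XML_PREFIX.equalsIgnoreCase(prefix) && "".equals(uri))
                || XML_URI.equals(uri);
    }

    static Binding get(String prefix, String uri) {
        // Special case first, so the prefix check below never sees "xml".
        if (isXmlNamespace(prefix, uri)) {
            return XML;
        }
        if (prefix.toLowerCase().startsWith("xml")) {
            throw new IllegalArgumentException(
                "Namespace prefixes cannot begin with \"xml\"");
        }
        Binding existing = byUri.get(uri);
        if (existing != null) {
            return existing;
        }
        Binding created = new Binding(prefix, uri);
        byUri.put(uri, created);
        return created;
    }

    public static void main(String[] args) {
        System.out.println(get("xml", "") == XML);      // true
        System.out.println(get("xml", XML_URI) == XML); // true
        try {
            get("xmlfoo", "urn:example:x");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Other prefixes that merely start with xml are still rejected; only the exact xml binding slips past the verifier.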

I also added a guard clause to XMLOutputter.printNamespace(..):

protected void printNamespace(Namespace ns, Writer out) throws IOException {
        if (Namespace.isXmlNamespace(ns.getPrefix(), ns.getURI())) {
            return;
        }

        out.write(" xmlns");
        if (!ns.getPrefix().equals("")) {
            out.write(":");
            out.write(ns.getPrefix());
        }
        out.write("=\"");
        out.write(ns.getURI());
        out.write("\"");
    }
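
The effect of that guard on the output can be checked with a small stand-alone sketch. It assumes nothing from JDOM: NamespaceDeclSketch, printNamespace and render are hypothetical names, and the namespace is passed as two plain strings rather than a Namespace object.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical stand-in for the outputter's declaration printing. */
final class NamespaceDeclSketch {
    static final String XML_URI = "http://www.w3.org/XML/1998/namespace";

    /** Writes an xmlns declaration unless the binding is the implicit xml one. */
    static void printNamespace(String prefix, String uri, Writer out)
            throws IOException {
        boolean isXml = ("xml".equalsIgnoreCase(prefix) && uri.isEmpty())
                || XML_URI.equals(uri);
        if (isXml) {
            return; // the xml prefix need not be declared
        }
        out.write(" xmlns");
        if (!prefix.isEmpty()) {
            out.write(":");
            out.write(prefix);
        }
        out.write("=\"");
        out.write(uri);
        out.write("\"");
    }

    /** Convenience wrapper that returns the declaration as a String. */
    static String render(String prefix, String uri) {
        StringWriter out = new StringWriter();
        try {
            printNamespace(prefix, uri, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + render("gpsi", "urn:example:gpsi") + "]");
        System.out.println("[" + render("", "urn:example:default") + "]");
        System.out.println("[" + render("xml", XML_URI) + "]"); // prints []
    }
}
```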


Below is the lost post that explains why I did all this :-)

Ok.  This is the offending line:
<label href="xpointer(//item[@type='gpsi:computation.netIncome'])"
xml:lang="en">Net income</label>
This is the stack:

Breakpoint hit: org.jdom.Verifier.checkNamespacePrefix (Verifier:190)
main[1] locals
Method arguments:
Local variables:
  prefix = xml
  first = x
main[1] where
  [1] org.jdom.Verifier.checkNamespacePrefix (Verifier:190)
  [2] org.jdom.Namespace.getNamespace (Namespace:124)
  [3] org.jdom.Attribute.<init> (Attribute:120)
  [4] org.jdom.input.SAXHandler.startElement (SAXHandler:616)
  [5] org.apache.xerces.parsers.SAXParser.startElement (SAXParser:1371)
  [6] org.apache.xerces.validators.common.XMLValidator.callStartElement (XMLValidator:705)
  [7] org.apache.xerces.framework.XMLDocumentScanner.scanElement (XMLDocumentScanner:1852)
  [8] org.apache.xerces.framework.XMLDocumentScanner$ContentDispatcher.dispatch (XMLDocumentScanner$ContentDispatcher:1233)
  [9] org.apache.xerces.framework.XMLDocumentScanner.parseSome (XMLDocumentScanner:380)
  [10] org.apache.xerces.framework.XMLParser.parse (XMLParser:861)
  [11] org.jdom.input.SAXBuilder.build (SAXBuilder:258)
  [12] org.jdom.input.SAXBuilder.build (SAXBuilder:332)
  [13] org.jdom.input.SAXBuilder.build (SAXBuilder:313)
  [14] SAXBuilderDemo.testBuilder (SAXBuilderDemo:99)
  [15] SAXBuilderDemo.main (SAXBuilderDemo:137)
The actual call from SAXHandler.startElement occurs in this block of code, at the line
marked /*HERE*/:

 if (!attName.startsWith("xmlns")) {
                name = attName;
                int attSplit;
                prefix = "";
                if ((attSplit = name.indexOf(":")) != -1) {
                    prefix = name.substring(0, attSplit);
                    name = name.substring(attSplit + 1);
                }
                // Only put attribute in namespace if there is a prefix
                if (prefix.equals("")) {
                    element.addAttribute(
                        new Attribute(name, atts.getValue(i)));
                } else {
 /*HERE*/        element.addAttribute(
                        new Attribute(name,
                                      prefix,
                                      getNamespaceURI(prefix),
                                      atts.getValue(i)));
                }
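
The prefix/local-name split performed in the block above can be isolated as a small helper, which makes it easy to see what the handler passes on for xml:lang. This is only a sketch: QNameSplitSketch and split are hypothetical names, not part of SAXHandler.

```java
/** Hypothetical helper mirroring the attribute-name split shown above. */
final class QNameSplitSketch {
    /** Returns {prefix, localName}; the prefix is "" when there is no colon. */
    static String[] split(String qName) {
        int colon = qName.indexOf(':');
        if (colon == -1) {
            return new String[] {"", qName};
        }
        return new String[] {
            qName.substring(0, colon),
            qName.substring(colon + 1)
        };
    }

    public static void main(String[] args) {
        String[] parts = split("xml:lang");
        System.out.println(parts[0] + " / " + parts[1]); // xml / lang
        parts = split("href");
        System.out.println("[" + parts[0] + "] / " + parts[1]); // [] / href
    }
}
```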

The prefix, which here is "xml", is used to look up a URI, and both are passed into the
Attribute constructor:

  public Attribute(String name, String prefix, String uri, String value) {
        this(name, value, Namespace.getNamespace(prefix, uri));
    }

Namespace.getNamespace then tries to construct a namespace.  However, the first
function it calls is Verifier.checkNamespacePrefix(prefix), which, as you can see in
the code below, does not allow a prefix of xml to be used.  The program therefore
throws an exception before Verifier.checkAttributeName, where the special-case code
lives, is ever called.



Below, you will find the method from Verifier.  Notice that it always rejects an xml
prefix:
  public static final String checkNamespacePrefix(String prefix) {
        // Manually do rules, since URIs can be null or empty
        if ((prefix == null) || (prefix.equals(""))) {
            return null;
        }

        // Cannot start with a number
        char first = prefix.charAt(0);
        if (isXMLDigit(first)) {
            return "Namespace prefixes cannot begin with a number";
        }
        // Cannot start with a $
        if (first == '$') {
            return "Namespace prefixes cannot begin with a dollar sign ($)";
        }
        // Cannot start with a -
        if (first == '-') {
            return "Namespace prefixes cannot begin with a hyphen (-)";
        }
        // Cannot start with a .
        if (first == '.') {
            return "Namespace prefixes cannot begin with a period (.)";
        }
        // Cannot start with "xml" in any character case
        if (prefix.toLowerCase().startsWith("xml")) {
            return "Namespace prefixes cannot begin with " +
                   "\"xml\" in any combination of case";
        }

        // Ensure valid content
        for (int i=0, len = prefix.length(); i<len; i++) {
            char c = prefix.charAt(i);
            if (!isXMLNameCharacter(c)) {
                return "Namespace prefixes cannot contain the character \"" +
                        c + "\"";
            }
        }

        // No colons allowed
        if (prefix.indexOf(":") != -1) {
            return "Namespace prefixes cannot contain colons";
        }

        // If we got here, everything is OK
        return null;
    }
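
The ordering problem described above can be reproduced without JDOM. In the sketch below, CheckOrderSketch and its methods are hypothetical and heavily simplified: checkPrefix keeps only the "starts with xml" rule, and isSpecialAttribute stands in for the later attribute-name check that knows about xml:lang and xml:space.

```java
/** Hypothetical reduction of the check order; not JDOM code. */
final class CheckOrderSketch {
    /** Simplified prefix rule: anything starting with "xml" is rejected. */
    static String checkPrefix(String prefix) {
        if (prefix.toLowerCase().startsWith("xml")) {
            return "Namespace prefixes cannot begin with \"xml\"";
        }
        return null;
    }

    /** Stand-in for the later check that allows xml:lang and xml:space. */
    static boolean isSpecialAttribute(String qName) {
        return qName.equals("xml:lang") || qName.equals("xml:space");
    }

    /** The prefix is verified first, so the special cases are never consulted. */
    static String build(String prefix, String localName) {
        String reason = checkPrefix(prefix);
        if (reason != null) {
            throw new IllegalArgumentException(reason);
        }
        // Not reached for "xml": the exception above fires first.
        return isSpecialAttribute(prefix + ":" + localName) ? "special" : "ordinary";
    }

    public static void main(String[] args) {
        System.out.println(build("gpsi", "type")); // ordinary
        try {
            build("xml", "lang");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```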

While one could write code in SAXHandler to catch the special cases of xml:lang
and xml:space, the real problem is how the prefix "xml" is handled in
Namespace.

Stacie





