[jdom-interest] Equality semantics

Elliotte Rusty Harold elharo at metalab.unc.edu
Sun Apr 28 07:45:31 PDT 2002


I've been thinking about the recent change to the equals() method for 
DocType, and equality semantics in JDOM, and I think I've now figured 
out why they're justified philosophically. Jason, Brett, and others 
may have already been thinking this way but I wasn't, and now that I 
am I'm a lot happier with how JDOM handles equals(). No changes are 
proposed. I just wanted to put this on the record in case it helped 
anyone else understand what we're doing. It's also very different 
reasoning from what is currently listed in the FAQ, which focuses mostly 
on implementation details. I'm going to argue that JDOM does the 
right thing philosophically.

Consider this XML document:

<root>
   <test>A</test>
   <test>A</test>
</root>

JDOM represents the two test elements as two separate, unequal 
Element objects. And this is correct. They are not equal and the 
reason is that order and position matter in XML documents. Two nodes 
that are otherwise the same but that appear in different places in 
the document are significantly different.
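
To make the distinction concrete, here is a minimal sketch. It does not
use JDOM at all; it assumes only the W3C DOM API bundled with the JDK,
whose Node interface happens to offer both kinds of comparison:
isEqualNode() compares content and isSameNode() compares identity. The
class and method names are mine, chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StructuralVsIdentity {

    // Parses the two-element example above and returns both <test> nodes.
    static Node[] testElements() {
        String xml = "<root><test>A</test><test>A</test></root>";
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            NodeList tests = doc.getElementsByTagName("test");
            return new Node[] { tests.item(0), tests.item(1) };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
    }

    // Content comparison: same name, same attributes, same children.
    static boolean sameContent() {
        Node[] t = testElements();
        return t[0].isEqualNode(t[1]);
    }

    // Identity comparison: is this literally the same node in the tree?
    static boolean sameNode() {
        Node[] t = testElements();
        return t[0].isSameNode(t[1]);
    }

    public static void main(String[] args) {
        System.out.println("same content: " + sameContent());
        System.out.println("same node:    " + sameNode());
    }
}
```

The two nodes have the same content but are not the same node. The
argument in this message is that the second question is the one
equals() should answer.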

Indeed, the semantic meaning of an element may well depend on its 
parent. For example, consider this document:

<root>
   <person><name>Indiana</name></person>
   <state><name>Indiana</name></state>
</root>

The two name elements are different because one is the name of a 
person and the other is the name of a state. JDOM should not report 
that they are the same.

Here's another example adapted from the MathML spec:


<mrow>
   <mrow>
     <msup>
       <mi>x</mi>
       <mn>4</mn>
     </msup>
     <mo>+</mo>
     <mrow>
       <mn>4</mn>
       <mo>&InvisibleTimes;</mo>
       <mi>x</mi>
     </mrow>
     <mo>+</mo>
     <mn>4</mn>
   </mrow>
   <mo>=</mo>
   <mn>0</mn>
</mrow>

The three <mn>4</mn> elements are all different because of their context.

In some cases the order may also affect the semantic meaning, even 
when the parent is the same. For example, consider these elements:

<Person>
   <name>Madonna</name>
   <name>Louise</name>
   <name>Ciccone</name>
</Person>
<Person>
   <name>Humbert</name>
   <name>H.</name>
   <name>Humbert</name>
</Person>

Whether a name element is a first, middle, or last name depends on 
its order within its siblings. So even when we know that two elements 
are internally, byte-for-byte identical, this does not imply that 
they are equal.

JDOM never represents the same element twice. That is, no single node 
in an XML document is ever represented by two different objects. 
Consequently, == is the only test for equality that makes sense in the 
most general case.
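
As a rough sketch of what that looks like in code, here is a
hypothetical stand-in for a tree node whose equals() is defined as ==.
This is not JDOM's actual source; the class, field, and method names
are invented for illustration. It also shows one practical consequence:
looking a node up in its parent's child list finds that exact node
rather than an earlier sibling that merely looks the same.

```java
import java.util.ArrayList;
import java.util.List;

public class IdentityEqualitySketch {

    // Hypothetical stand-in for a tree node; NOT JDOM's Element class.
    static class Node {
        final String name;
        final String text;

        Node(String name, String text) {
            this.name = name;
            this.text = text;
        }

        // Identity semantics: a node is equal only to itself.
        @Override
        public final boolean equals(Object other) {
            return this == other;
        }

        // Kept consistent with equals(): based on identity, not content.
        @Override
        public final int hashCode() {
            return System.identityHashCode(this);
        }
    }

    // Builds the children of <root> from the first example and returns
    // the index at which the second <test> element is found.
    static int indexOfSecondTest() {
        List<Node> children = new ArrayList<>();
        Node first = new Node("test", "A");
        Node second = new Node("test", "A");
        children.add(first);
        children.add(second);
        return children.indexOf(second);
    }

    public static void main(String[] args) {
        System.out.println(indexOfSecondTest()); // prints 1
    }
}
```

Had equals() compared name and text instead, indexOf() would have
returned 0, the position of the first element, because List.indexOf()
relies on equals().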

-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+