[jdom-interest] XPath.selectNodes(doc) via Jaxen: nodes not in document-order

Elliotte Rusty Harold elharo at metalab.unc.edu
Mon Aug 16 13:55:51 PDT 2004


At 10:42 AM -0700 8/16/04, Jason Hunter wrote:

>Reverse axes are evaluated in reverse document order when applying 
>positional predicates.  That's for sure.  But the result of any 
>XPath is always a sequence of nodes in document order, and I can't 
>find anything in the XPath 1.0 spec to contradict this.

First, I think we need to note that we've wandered off from the 
original message, which is fine. But I just want to be clear that 
we're all agreed Jaxen is behaving buggily here. Though now that I 
think about it, I'm not so sure, as I'll explain.

>So given:
>
><a>
>   <b>
>     <c/>
>   </b>
></a>
>
>/a/b/c/ancestor::* returns the same as (/a, /a/b) not (/a/b, /a).

Actually no. I think you're getting confused by thinking in XPath 2 
terms, which is crucially different on these points. What 
/a/b/c/ancestor::* really returns is {/a, /a/b} which is exactly the 
same as {/a/b, /a} but not the same as (/a, /a/b) or (/a/b, /a). In 
other words, in XPath 1.0 the expression returns an unordered set. It 
does *not* return an ordered sequence as it does in XPath 2.0.

As you point out, there's nothing in the XPath 1.0 specification that 
says, one way or the other, what order an implementation should use 
when it presents the unordered result of this XPath expression as an 
ordered Java list. Forward document order, reverse document order, or 
something else entirely would all be legal.

Therefore I change my mind. Jaxen does not have a bug (unless, 
perhaps, something in the Jaxen spec promises to return things in a 
certain order). The bug is in the code that assumes the results will 
be returned in document order. Code that assumed reverse document 
order would be equally buggy.
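
A caller that really needs document order can impose it explicitly 
instead of trusting whatever order the engine hands back. Here is a 
minimal sketch of that idea. It assumes the JDK's built-in 
javax.xml.xpath and W3C DOM classes rather than Jaxen and JDOM, so it 
says nothing about what Jaxen itself returns. The class name 
DocumentOrderDemo and the helper namesInDocumentOrder are made up for 
illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: not part of Jaxen, JDOM, or the JDK.
public class DocumentOrderDemo {

    // Orders two nodes of the same document by their document position.
    static final Comparator<Node> DOCUMENT_ORDER = (x, y) -> {
        if (x.isSameNode(y)) {
            return 0;
        }
        // compareDocumentPosition describes y relative to x. If the
        // FOLLOWING bit is set, y comes after x, so x sorts first.
        short position = x.compareDocumentPosition(y);
        return (position & Node.DOCUMENT_POSITION_FOLLOWING) != 0 ? -1 : 1;
    };

    // Evaluates expr against xml, treats the result as an unordered
    // node-set, sorts it into document order, and returns the node names.
    static List<String> namesInDocumentOrder(String xml, String expr)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);

        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            nodes.add(hits.item(i));
        }
        // Do not rely on the engine's ordering; impose our own.
        nodes.sort(DOCUMENT_ORDER);

        List<String> names = new ArrayList<>();
        for (Node n : nodes) {
            names.add(n.getNodeName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b><c/></b></a>";
        // prints [a, b]
        System.out.println(namesInDocumentOrder(xml, "/a/b/c/ancestor::*"));
    }
}
```

The sort is redundant if the engine already returns document order, 
but it makes the code correct no matter which legal order comes back.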

So what does Jaxen actually say? Here's from the JavaDoc for selectNodes:

In most cases, nodes will be returned in document-order, as defined 
by the XML Canonicalization specification. The exception occurs when 
using XPath expressions involving the union operator (denoted with 
the pipe '|' character).

That's pretty wishy-washy. I guess they're saying they will return 
the nodes in document order here, and that failing to do so is a 
bug, but it isn't obvious.


-- 

   Elliotte Rusty Harold
   elharo at metalab.unc.edu
   Effective XML (Addison-Wesley, 2003)
   http://www.cafeconleche.org/books/effectivexml            
   http://www.amazon.com/exec/obidos/ISBN%3D0321150406/ref%3Dnosim/cafeaulaitA 

