[jdom-interest] META: Children of a lesser spec

Fri Sep 29 17:17:44 PDT 2000

Jason Hunter wrote:
> 
> Little follow-up on the getChild/getChildElement debate.  :-)
> 
> Brett McLaughlin wrote:
> >
> > > >Now, if it was only correct. But it isn't, and a cursory
> > > >reading of any XML specification makes it confusing,
> > > > especially getChildren()...
> 
> > Even on these grounds, you will find
> > that XML Infoset clearly defines content, and children.
> >
> > From that spec:
> >
> > <<<<<<<<<<<<<
> >
> > An element information item has the following properties:
> >
> > ...
> >
> > 3.[children] An ordered list of child information items, in document
> > order. This list contains element, processing instruction,
> > reference to skipped entity, character, and comment information
> > items, one for each element, processing instruction,
> > reference to an unprocessed external entity,
> > data character, and comment appearing immediately within
> > the current element, .....
> 
> I think it's good to ask which specs we want to use for terminology.
> For example, looking at the XML 1.0 spec I see this production rule:
> 
>   [46] contentspec ::= 'EMPTY' | 'ANY' | Mixed | children
>   [47] children ::= (choice | seq) ('?' | '*' | '+')?
> 
> You could see this as an example where the XML spec refers to the
> exclusive set of subelements as "children" not "childElements".
> What does XML call "children"?  See production rule 47.  It's only
> elements.

Yes and no. What you've pointed out is correct; however, you've found
an inconsistency between the XML 1.0 spec and the other specs (and yes,
that is a major problem)! As I've always said, getChildren() is not "just
horrible" - it's perfectly sensible from the Java point of view, and
pretty muddled from the XML point of view. Given that, I would rather
avoid confusion on both sides.

See, the way I look at it, getChild()/getChildren() is clear for Java,
and muddy for XML.
But getChildElement()/getChildElements() is clear for both Java and XML,
albeit a little inconvenient for Java.
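
To make the naming question concrete, here is a rough sketch of the two readings of "children". It uses the JDK's W3C DOM classes rather than JDOM so that it runs without extra jars; the class name ChildNaming and the helpers getChildElements() and parseRoot() are made up for illustration and are not part of any existing API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNaming {

    // Hypothetical helper: returns only the element children, which is
    // what the name getChildElements() states explicitly.
    static List<Element> getChildElements(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList all = parent.getChildNodes();
        for (int i = 0; i < all.getLength(); i++) {
            Node n = all.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Hypothetical helper: parses a string and returns the root element.
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        // <a> contains text, an element, a comment, and another element.
        Element root = parseRoot("<a>text<b/><!-- note --><c/></a>");

        // "children" in the Infoset sense: every child item.
        System.out.println(root.getChildNodes().getLength());   // prints 4

        // "children" in the elements-only sense.
        System.out.println(getChildElements(root).size());      // prints 2
    }
}
```

Someone coming from the Infoset would expect the first number from a method called getChildren(); someone coming from Java would likely expect the second.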

I can't see how it's bad to be a little inconvenient to make it clear
for all; however, it's only fair to say that because of my work and
book, I deal with Java people and XML people equally. Many of you are
exclusively on the Java side, and this may be onerous...

ahhh... tiring... tiring ;-)

-Brett

> 
> Also, in Infoset I see this:
> 
>    2.3.1. Attributes: Core Properties
> 
>    An attribute information item must have the following
>    properties available in some form:
> 
>    1.[namespace URI] The URI part, if any, of the attribute's name.
>    2.[local name] The local part of the attribute's name. This does
>    not include any namespace prefix or following colon.
>    3.[children] An ordered list of references to character
>    information items, one for each character appearing in the
>    normalized attribute value.
> 
> This makes me think that if we went with 100% Infoset terminology then
> instead of Attribute.getValue() it needs to be Attribute.getChildren()
> -- returning the characters which by this terminology are called
> children.  I think we can all agree (hopefully) that that's ridiculous.
> 
> For a final example, let's look at the XPath spec:
> 
>  child::* selects all element children of the context node
>  child::text() selects all text node children of the context node
>  child::node() selects all the children of the context node, whatever
>  their node type
> 
> Pretty clear that child::* returns only *element* children.  You use
> child::node() to return a list of all types.  When XPath says "child" it
> also means element children.
> 
> So the XML 1.0 spec has production rules 46/47 which refer to "children"
> to mean only "direct subelements".  The Infoset spec is clear as Brett
> said, but because of its odd naming conventions regarding children in
> the attribute case, I could easily argue we're not honor bound to follow
> its terminology exactly.  And XPath clearly has child::* refer to
> elements only.
> 
> Bottom line: as I stated earlier, I prefer the elegance of getChild /
> getChildren over getChildElement / getChildElements.  It matches nicely
> with getParent.  It seems less redundant.  It's easier to differentiate
> singular from plural.  The problem has always been spec terminology, and
> looking at three XML specs today I don't think we'd be in gross
> violation using getChild instead of getChildElement, and using the
> simpler name definitely "fits" better with the API and I see that as
> more important than being robotically precise and including the return
> type within the method name.
> 
> OK, back to real work.
> 
> -jh-
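
For anyone who wants to check the three XPath expressions quoted above, here is a small sketch using the JDK's javax.xml.xpath package. The class name XPathChildAxis, the helpers count() and parse(), and the sample document are assumptions made for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathChildAxis {

    // Hypothetical helper: evaluates expr against doc and returns the
    // number of nodes in the resulting node-set.
    static int count(Document doc, String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    // Hypothetical helper: parses a string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // <a> contains text, an element, a comment, and another element.
        Document doc = parse("<a>text<b/><!-- note --><c/></a>");

        System.out.println(count(doc, "/a/child::*"));       // prints 2
        System.out.println(count(doc, "/a/child::text()"));  // prints 1
        System.out.println(count(doc, "/a/child::node()"));  // prints 4
    }
}
```

The child axis is the same in all three expressions; only the node test after the double colon changes which children are selected.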

-- 
Brett McLaughlin, Enhydra Strategist
Lutris Technologies, Inc. 
1200 Pacific Avenue, Suite 300 
Santa Cruz, CA 95060 USA 
http://www.lutris.com
http://www.enhydra.org