[jdom-interest] META: Children of a lesser spec
brett.mclaughlin at lutris.com
Fri Sep 29 17:17:44 PDT 2000
Jason Hunter wrote:
> Little follow-up on the getChild/getChildElement debate. :-)
> Brett McLaughlin wrote:
> > > >Now, if it was only correct. But it isn't, and a cursory
> > > >reading of any XML specification makes it confusing,
> > > > especially getChildren()...
> > Even on these grounds, you will find
> > that XML Infoset clearly defines content, and children.
> > From that spec:
> > <<<<<<<<<<<<<
> > An element information item has the following properties:
> > ...
> > 3.[children] An ordered list of child information items, in document
> > order. This list contains element, processing instruction,
> > reference to skipped entity, character, and comment information
> > items, one for each element, processing instruction,
> > reference to an unprocessed external entity,
> > data character, and comment appearing immediately within
> > the current element, .....
> I think it's good to ask which specs we want to use for terminology.
> For example, looking at the XML 1.0 spec I see this production rule:
>  contentspec ::= 'EMPTY' | 'ANY' | Mixed | children
>  children ::= (choice | seq) ('?' | '*' | '+')?
> You could see this as an example where the XML spec refers to the
> exclusive set of subelements as "children" not "childElements".
> What does XML call "children"? See production rule 47. It's only
Yes and no. What you've pointed out is correct; however, you've found
an inconsistency between this spec and the other specs (yes, this is a
major problem)! As I've always said, getChildren() is not "just
horrible" - it's perfectly sensible from the Java point of view, and
pretty muddled from the XML point of view. Given that, I would rather
pick names that avoid confusion on both sides.
See, the way I look at it, getChild()/getChildren() is clear for Java,
and muddy for XML.
But getChildElement()/getChildElements() is clear for both Java and XML,
albeit a little inconvenient for Java.
I can't see how it's bad to be a little inconvenient to make it clear
for all; however, it's only fair to say that because of my work and
book, I deal with Java people and XML people equally. Many of you are
exclusively on the Java side, and this may be onerous...
ahhh... tiring... tiring ;-)
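
To make the distinction concrete, here is a rough sketch of the two
meanings of "children" being argued over. It uses the JDK's own
org.w3c.dom classes rather than JDOM, and the class and helper names
(ChildNaming, getChildElements, countAllChildren, parseRoot) are made up
for illustration; they are not anybody's actual API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNaming {

    // Hypothetical helper: only the element children, in document order.
    // This is what the thread calls getChildElements() / getChildren().
    static List<Element> getChildElements(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    // Hypothetical helper: every child node (text, comments, PIs, elements),
    // which is closer to what Infoset lists under [children].
    static int countAllChildren(Element parent) {
        return parent.getChildNodes().getLength();
    }

    // Parses a small XML string and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(
                "<root>text<a/><!-- note --><b/><?pi data?></root>");
        System.out.println("element children: " + getChildElements(root).size());
        System.out.println("all children:     " + countAllChildren(root));
    }
}
```

For that sample document the two counts differ (two elements versus
five child nodes), which is exactly the gap a name like getChildren()
leaves open for an XML reader.
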
> Also, in Infoset I see this:
> 2.3.1. Attributes: Core Properties
> An attribute information item must have the following
> properties available in some form:
> 1.[namespace URI] The URI part, if any, of the attribute's name.
> 2.[local name] The local part of the attribute's name. This does
> not include any namespace prefix or following colon.
> 3.[children] An ordered list of references to character
> information items, one for each character appearing in the
> normalized attribute value.
> This makes me think that if we went with 100% Infoset terminology then
> instead of Attribute.getValue() it needs to be Attribute.getChildren()
> -- returning the characters which by this terminology are called
> children. I think we can all agree (hopefully) that that's ridiculous.
> For a final example, let's look at the XPath spec:
> child::* selects all element children of the context node
> child::text() selects all text node children of the context node
> child::node() selects all the children of the context node, whatever
> their node type
> Pretty clear that child::* returns only *element* children. You use
> child::node() to return a list of all types. When XPath says "child" it
> also means element children.
> So the XML 1.0 spec has production rules 46/47 which refer to "children"
> to mean only "direct subelements". The Infoset spec is clear as Brett
> said, but because of its odd naming conventions regarding children in
> the attribute case, I could easily argue we're not honor bound to follow
> its terminology exactly. And XPath clearly has child::* refer to
> elements only.
> Bottom line: as I stated earlier, I prefer the elegance of getChild /
> getChildren over getChildElement / getChildElements. It matches nicely
> with getParent. It seems less redundant. It's easier to differentiate
> singular from plural. The problem has always been spec terminology, and
> looking at three XML specs today I don't think we'd be in gross
> violation using getChild instead of getChildElement, and using the
> simpler name definitely "fits" better with the API and I see that as
> more important than being robotically precise and including the return
> type within the method name.
> OK, back to real work.
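
The three XPath expressions quoted above can be tried out with the
standard javax.xml.xpath API. This is only a sketch: the class name
ChildAxisDemo, the count helper, and the sample document are
assumptions made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildAxisDemo {

    // Hypothetical helper: parse the XML, evaluate the expression against
    // the document node, and return how many nodes were selected.
    static int count(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root>text<a/><!-- note --><b/><?pi data?></root>";
        // Element children only.
        System.out.println("child::*      " + count(xml, "/root/child::*"));
        // Text node children only.
        System.out.println("child::text() " + count(xml, "/root/child::text()"));
        // All children, whatever their node type.
        System.out.println("child::node() " + count(xml, "/root/child::node()"));
    }
}
```

With this sample document the three expressions select different
numbers of nodes, so the node test after child:: does the narrowing.
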
Brett McLaughlin, Enhydra Strategist
Lutris Technologies, Inc.
1200 Pacific Avenue, Suite 300
Santa Cruz, CA 95060 USA