[jdom-interest] Re: First impressions and some suggestions

Sun Jun 11 13:02:52 PDT 2000

> 
> I don't disagree with the spec at all!  The spec writers realized that
> some applications would require the surrounding whitespace, so they made
> it clear that a parser couldn't ignore the whitespace and had to pass it
> on; in other words, had to make it available to applications that needed
> it.  That's exactly what we're doing!  The spec doesn't say,
> "Surrounding whitespace must be available by a method with a shorter
> signature than the method that returns non-surrounding whitespace." 
> That's our decision.
>

First, I'm beginning to agree with you that making the data
available pulls JDOM within the spec for this point.  However,
I still think it is very inefficient to offer the naive user
(who, by definition, doesn't know any better), a method that trims
the string each time that it is accessed.  If the user
does not want the extra white space, remove it at parse time.
Alternatively, if the user does want the white space, include
it at parse time.  Have a flag in the builder that handles
this.  Even if it is off by default (no extra white space),
it is probably still spec compliant (with the above reading of
the spec).  This gives a consistent interface, the results that
both novice and expert would expect, and is much more efficient -- the
naive user (who we are trying to protect) will store the String
content in the JDOM and call the getContent() method over
and over and over again, implicitly calling trim() each time.
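
To make the efficiency point concrete, here is a minimal sketch of
trimming once at build time instead of on every access.  TextHolder and
its keepWhitespace flag are hypothetical stand-ins for illustration; they
are not JDOM's actual Element class or builder API.

```java
// Hypothetical stand-in for an element's text storage; not JDOM's Element.
final class TextHolder {
    private final String text;

    // keepWhitespace plays the role of the proposed builder flag
    // (the name is an assumption made for this sketch).
    TextHolder(String raw, boolean keepWhitespace) {
        // Decide once, at build time, whether surrounding whitespace is kept.
        this.text = keepWhitespace ? raw : raw.trim();
    }

    // Returns the stored text as-is; no trim() happens per call.
    String getContent() {
        return text;
    }
}

final class TrimAtBuildDemo {
    public static void main(String[] args) {
        String raw = "\n   some text   \n";
        TextHolder trimmed = new TextHolder(raw, false);
        TextHolder verbatim = new TextHolder(raw, true);
        // Repeated calls return the same stored String without re-trimming.
        System.out.println("[" + trimmed.getContent() + "]");
        System.out.println("[" + verbatim.getContent() + "]");
    }
}
```

With this arrangement the cost of trim() is paid once per element when
the document is built, no matter how often getContent() is called later.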

I really believe that what the user is going to expect (novice
and expert alike) is to have getContent() return the actual
content of the element.  Whether or not white space has been
provided by the builder is a separate question.  Element.getContent()
should return the data in the Element, period.

If we really want to move JDOM a bit closer to the spec,
and add important functionality that many people would
expect, I would suggest the following:

1) As the default, leave everything the way it is now.  This will leave
   all textual materials about JDOM accurate.

2) *** Offer a builder flag that allows white space to be included in
   the Element (with getContent() returning the extra data). 

3) Add support for CDATA sections on both input and output (with
   a builder flag).

4) *** Add support for an IgnorableWhitespace class that can
   be included in the JDOM (with a builder flag).  This would correspond
   to the SAX ignorableWhitespace() method.

5) Add support for including or excluding comments (use a builder flag
   for this).

6) Add a getElementById() method to the Document class, and recognize
   IDREF attributes while parsing (and check for duplicate IDs during
   validation).

7) For #6, add the ability to assign IDs to an Element (and check for
   duplicate IDs), and check the ID for a given Element object.

8) Have an XMLOutputter that is DTD/Schema aware and outputs
   data in a format that will not have the potential of corrupting
   it. (this one is a little out there, but I thought I would throw
   it out as an idea).

9) Give the ability to determine which attributes were supplied
   by default (from DTD/Schema).

10) A normalize method that would join all adjacent Strings
    (is this needed, or does JDOM take care of all scenarios
    where this could happen?)

(*** denotes portions of the spec that are currently not
implemented in JDOM)
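
As a rough illustration of item 10, the sketch below joins runs of
adjacent Strings in a content list.  It assumes mixed content is held in
a java.util.List whose entries are either Strings or other node objects;
that representation, and the ContentNormalizer name, are assumptions for
this example rather than a description of JDOM's internals.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of JDOM.
final class ContentNormalizer {
    // Returns a new list in which each run of adjacent String entries
    // is replaced by a single concatenated String. Non-String entries
    // are copied through unchanged and in their original order.
    static List<Object> normalize(List<?> content) {
        List<Object> result = new ArrayList<>();
        StringBuilder pending = null; // text collected from the current run
        for (Object item : content) {
            if (item instanceof String) {
                if (pending == null) {
                    pending = new StringBuilder();
                }
                pending.append((String) item);
            } else {
                if (pending != null) {
                    result.add(pending.toString());
                    pending = null;
                }
                result.add(item);
            }
        }
        if (pending != null) {
            result.add(pending.toString()); // flush a trailing run
        }
        return result;
    }
}
```

Whether such a method is needed depends on whether the builder or the
mutator methods can ever leave two Strings side by side, which is the
open question raised in item 10.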

I believe that all of these (except #8) can be implemented
with a minimal amount of code and only a small increase in JAR
footprint.  In addition, I believe that they will make
the product much more robust and usable by the developer
community.  Finally, they are classes/methods/options that can go
unnoticed if the user is not interested in using them.
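
To give a sense of scale for the ID-related items (getElementById() and
duplicate-ID checks), here is a minimal sketch of an ID registry.
IdRegistry is a hypothetical class written for illustration and is
generic over the element type; how it would hook into the Document or
the builder is left open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry mapping ID values to elements; not part of JDOM.
final class IdRegistry<E> {
    private final Map<String, E> byId = new HashMap<>();

    // Registers an element under the given ID and rejects duplicates.
    void register(String id, E element) {
        if (byId.containsKey(id)) {
            throw new IllegalArgumentException("Duplicate ID: " + id);
        }
        byId.put(id, element);
    }

    // Returns the element registered under the ID, or null if none.
    E getElementById(String id) {
        return byId.get(id);
    }
}
```

A user who never calls these methods pays only for an empty map, which
is in line with the point that the additions can go unnoticed.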

I realize that you are shooting to get the core of this
done as soon as possible.  However, I think that the above
items should be given serious consideration.  As a developer,
I know that I would use all of the above.  I'm wondering if
anyone else would find them useful (or crucial) to their
work?

--Kevin