[jdom-interest] detach() [eg]

Sat Apr 28 15:33:14 PDT 2001

> My concern is about the impact on consumers of documents.  (XMLOutputter is
> a trivial example.)
> 
>     class Consumer {
>         void consume(Document doc) {
>         }
>     }

I would argue that if a document without a root is passed to this
method, it's a programmer error, appropriate for an ISE
(IllegalStateException).

> (But we've been down this path before.  By trying to be nice, we're lowering
> the quality of our API and making it harder for JDOM users to write robust
> code.)

Yep, there's no easy answer.  Having RuntimeException be unchecked in
Java was a similar compromise -- that decision made it harder to write
robust code but easier to write code in the first place.  :-)

Here's the bottom line so we can end this thread... 

Doing a "move" without detach() requires handling three cases and a lot
of special casing:

1) attached to a doc -- call setRootElement() on the doc with a placeholder elt
2) attached to an elt -- call getParent().removeContent()
3) attached to nothing -- know to call nothing or risk NPE

Here's code:

if (elt.isRootElement()) {
  elt.getDocument().setRootElement(new Element("Bogus"));
} else if (elt.getParent() != null) {
  elt.getParent().removeContent(elt);
}
newelt.addContent(elt);

It's ugly.  Of course many programmers will just write

newelt.addContent(elt.getParent().removeContent(elt));

...because it's easier, and we even had an experienced JDOM user here
suggest it.  The problem is that although it works most of the time, you
get a nasty NPE whenever elt has no parent element.
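
To make the failure concrete, here is a minimal sketch.  It uses a
made-up Node class, not JDOM's real Element, and it assumes
removeContent() hands back the removed element so the one-liner can be
chained.  NaiveMove is likewise a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of an element tree; not JDOM's real classes.
class Node {
    final String name;
    Node parent;                                   // null when unattached
    final List<Node> children = new ArrayList<>();

    Node(String name) { this.name = name; }

    Node getParent() { return parent; }

    Node addContent(Node child) {
        if (child.parent != null) {
            throw new IllegalStateException(child.name + " already has a parent");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    // Assumption of this sketch: returns the removed child for chaining.
    Node removeContent(Node child) {
        if (children.remove(child)) {
            child.parent = null;
        }
        return child;
    }
}

class NaiveMove {
    // The tempting one-liner: throws NullPointerException when elt has no parent.
    static void move(Node elt, Node newParent) {
        newParent.addContent(elt.getParent().removeContent(elt));
    }
}
```

Moving a nested Node this way works; moving one that was never attached
dereferences a null parent and blows up, which is the surprise described
above.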

So I'm convinced we need detach().

Having detach() throw an exception if the elt being detached is a root
is better than not having detach(), and is a good behavior from an
academic point of view.  But it still leaves some ugly special casing to
the programmer:

1) attached to a doc -- call setRootElement() on the doc with a placeholder elt
2) attached to an elt -- call detach()
3) attached to nothing -- call detach()

The code for a move:

if (elt.isRootElement()) {
  elt.getDocument().setRootElement(new Element("Bogus"));
} else {
  elt.detach();
}
newelt.addContent(elt);

Not much better, and again many programmers will just write:

newelt.addContent(elt.detach());

... because it's easier.  But sometime later elt will be a root and
they'll be surprised by an exception.
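
A rough sketch of that "strict" detach(), for illustration only.
StrictElement and its boolean root flag are invented for this example,
and IllegalStateException is just my assumption for the exception type;
none of this is JDOM's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the "strict" variant; not JDOM's implementation.
class StrictElement {
    final String name;
    StrictElement parent;    // set when nested in another element
    boolean root;            // true while this is some document's root
    final List<StrictElement> children = new ArrayList<>();

    StrictElement(String name) { this.name = name; }

    StrictElement addContent(StrictElement child) {
        if (child.parent != null || child.root) {
            throw new IllegalStateException(child.name + " is still attached");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    // Works for nested and unattached elements, but refuses a root.
    StrictElement detach() {
        if (root) {
            throw new IllegalStateException("cannot detach a root element");
        }
        if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
        return this;
    }
}
```

With this variant newelt.addContent(elt.detach()) is fine for cases 2
and 3, and only the root case throws.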

Now, let me state I believe the vast majority of use cases where you'd
detach a root are that you're harvesting the root and don't care about
the old doc.  A programmer doing the detach understands that if they
take the root it's gone from the original document, and that's OK
because they aren't going to use the old doc.  These programmers just
want to remove the root and use it.  

With this in mind, it seems reasonable to somehow make the detach() of a
root elt succeed.  This behavior gives the programmer no surprise on a
detach() call because there's no exception to be thrown, and it gives
the programmer very little surprise that the old document doesn't have a
root because they were the ones to remove it.

The problem with the original doc not having a root is that XML wants
all docs to have a root element, and we're trying to provide a
well-formedness guarantee as a feature.  So we have three choices:

1) Add a placeholder.  This is what we do now.  It means we can say "all
docs are well formed", and it allows a programmer to detach a root
without jumping through hoops.  But this thread started because people
didn't like it.

2) Return null from getRootElement() and a list without an Element from
getMixedContent().  This suffers from the problem that someone might
output the document and not realize it's not well formed.  It totally
punts on well-formedness and makes the user check.

3) Report an ISE on getRootElement() and getMixedContent() on a doc
whose root has been detached or which has been constructed without a
root.  This allows the "move" FAQ answer to be
newelt.addContent(elt.detach()) with that call working in all cases.  It
removes most surprise from the programmer because only if they try to
access a root element which *they earlier removed* will they hit an
exception.  Even the vast majority of people who don't realize XML
requires each doc to have a root will understand the original rootless
document is pretty useless without data and they won't call on it.  We'd
still enforce well-formedness, but with regard to the root element it's
a "lazy" enforcement.
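
Roughly, choice 3 could look like the sketch below.  LazyElement and
LazyDocument are names I made up for illustration, and the details
(fields, messages) are assumptions rather than a proposed
implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of choice 3; class names and details are assumptions.
class LazyElement {
    final String name;
    LazyElement parent;      // set when nested in another element
    LazyDocument document;   // set when this is a document's root
    final List<LazyElement> children = new ArrayList<>();

    LazyElement(String name) { this.name = name; }

    LazyElement addContent(LazyElement child) {
        if (child.parent != null || child.document != null) {
            throw new IllegalStateException(child.name + " is still attached");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    // Succeeds in all three cases: root, nested, or unattached.
    LazyElement detach() {
        if (document != null) {
            document.root = null;   // leaves the old document rootless
            document = null;
        } else if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
        return this;
    }
}

class LazyDocument {
    LazyElement root;   // null once the root has been detached

    LazyDocument(LazyElement root) {
        if (root.parent != null || root.document != null) {
            throw new IllegalStateException(root.name + " is still attached");
        }
        root.document = this;
        this.root = root;
    }

    // Lazy enforcement: complain only when the missing root is requested.
    LazyElement getRootElement() {
        if (root == null) {
            throw new IllegalStateException("document has no root element");
        }
        return root;
    }
}
```

Here newelt.addContent(elt.detach()) never throws, and the ISE shows up
only if someone later asks the old, rootless document for its root.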

No solution above is perfect, but we have to decide and move on.  The
top two favorites in my mind and that I've heard supported here are (a)
throwing an exception on root.detach() or (b) allowing the detach() but
throwing an ISE upon later rootless document use (choice 3 above).
Since option (a) forces you to create and substitute your own bogus
element before moving a root, as the code above shows, I prefer (b).

So, unless there are any *new* facts on the topic, I'd like to go with
(b).

-jh-