[jdom-interest] XMLOutputter and newlines after declaration/doctype

Alex Rosen arosen at novell.com
Fri Dec 20 09:03:30 PST 2002


Yup, I was talking about text editors.

That is a good point about not handling newlines in Vadim's case (which
is separate from the case that I'm talking about). Although, what if he
used a FilterOutputStream to post-process the output of XMLOutputter,
replacing all newline characters with &#10; or &#13; as appropriate?
Are character references allowed outside of the root element (e.g. right
after the XML declaration)?
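
A minimal sketch of that post-processing idea, assuming the output is
UTF-8 (or another ASCII-compatible encoding) so that newlines can be
matched byte by byte. The class and method names here are hypothetical,
not part of JDOM. As far as the XML 1.0 grammar goes, character
references are permitted in element content and attribute values but not
in the prolog, so the sketch also runs the result through the JDK's
parser to see what happens to a newline right after the XML declaration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class NewlineEscapeSketch {

    /** Hypothetical filter: rewrites every LF or CR byte as a character reference. */
    static class NewlineEscapingStream extends FilterOutputStream {
        NewlineEscapingStream(OutputStream out) {
            super(out);
        }

        // FilterOutputStream routes its array-based write methods through
        // write(int), so overriding this one method covers all of them.
        @Override
        public void write(int b) throws IOException {
            if (b == '\n') {
                out.write("&#10;".getBytes(StandardCharsets.US_ASCII));
            } else if (b == '\r') {
                out.write("&#13;".getBytes(StandardCharsets.US_ASCII));
            } else {
                out.write(b);
            }
        }
    }

    /** Pushes the serialized document through the filter (assumes UTF-8). */
    public static String escape(String xml) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream filter = new NewlineEscapingStream(sink)) {
            filter.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Returns true if the JDK's default parser accepts the text. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            return false; // the parser reported a well-formedness error
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String plain = "<?xml version=\"1.0\"?>\n<root>a\nb</root>";
        String escaped = escape(plain);
        System.out.println(escaped);
        System.out.println("plain well-formed:   " + isWellFormed(plain));
        System.out.println("escaped well-formed: " + isWellFormed(escaped));
    }
}
```

With this input the original parses, but the escaped version is rejected
because of the &#10; that lands between the declaration and the root
element. A blanket filter like this would also rewrite newlines inside
tags, so to be safe it would need to know where it is in the document.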

Alex

>>> Elliotte Rusty Harold <elharo at metalab.unc.edu> 12/20/02 05:03AM
>>>
At 5:21 PM -0700 12/19/02, Alex Rosen wrote:

>Regardless of this particular case, I don't think that being an XML
>Nazi (pardon the expression) is the way to go in general. Plenty of
>people use non-XML tools on their XML documents, if only to look at
>them. This won't change any time soon. So, things outside the spec do
>matter. Maybe in an ideal world they wouldn't, but in the real world
>they do. I don't think there should be any hard and fast rule of XML
>infoset good, any other syntax bad.

The primary non-XML tool used on such data is a text editor. The 
prevalence of that completely swamps all other non-XML use cases. 
That's where JDOM needs to default to when it has a choice.

Beyond text editors, most non-parser based tools fail when presented 
with XML documents sooner or later, normally sooner. Regular 
expressions can't handle markup embedded in CDATA sections, comments 
and processing instructions. Using XML legal characters such as line 
feeds as document delimiters in a single file (as Vadim wants to do) 
fails as soon as some of your data contains that character, even if only 
in a comment or a tag.
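
To make the regular expression point concrete, here is a small sketch
that counts <item> elements two ways. The sample document, class name,
and method names are made up for illustration, and the parser is the
JDK's built-in DOM parser rather than JDOM:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class RegexVsParserSketch {

    // One real <item> element, plus look-alike text inside a comment
    // and inside a CDATA section.
    public static final String XML =
        "<list>"
        + "<!-- <item>commented out</item> -->"
        + "<item>real</item>"
        + "<![CDATA[<item>just text</item>]]>"
        + "</list>";

    /** Naive approach: count every textual occurrence of "<item>". */
    public static int countWithRegex(String xml) {
        Matcher m = Pattern.compile("<item>").matcher(xml);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    /** Parser approach: count actual item elements in the document. */
    public static int countWithParser(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("item").getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("regex:  " + countWithRegex(XML));
        System.out.println("parser: " + countWithParser(XML));
    }
}
```

The regex sees three matches because it cannot tell markup from text
that merely looks like markup; the parser reports the single real
element.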

Sooner or later everyone who tries to do this learns this lesson: you 
need a parser to handle XML. Nothing less will do. If a parser is too 
heavyweight for you (which is rarely true), then you need to use 
something other than XML.
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo at metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+