[jdom-interest] [OT] HTML in XML

Phil Weighill-Smith phil.weighill-smith at volantis.com
Thu Apr 22 03:17:03 PDT 2004


You can use CDATA sections to prevent interpretation of XML special
characters such as "<", ">" and "&" (this avoids having to explicitly
encode these characters). When outputting the text you would have to
write out only the content of each CDATA section, not the CDATA markup
itself. That way the output would contain the required markup.
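
As a rough sketch of the CDATA idea (not JDOM-specific; it uses only the
JDK's built-in DOM parser, and the class and method names here are made
up for illustration), the content can be wrapped in a CDATA section and
read back without the delimiters. One assumption worth flagging: a
literal "]]>" inside the HTML would close the section early, so the
sketch splits it across two adjacent sections.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataSketch {
    // Wrap raw HTML in a CDATA section. A literal "]]>" in the content would
    // end the section early, so it is split across two adjacent sections.
    static String wrapInCdata(String html) {
        return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Parse the XML and return the text of the root element. The parser
    // drops the CDATA delimiters and hands back only the content.
    static String readBack(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String html = "<b>Hello</b> & <br>";
        String xml = "<note>" + wrapInCdata(html) + "</note>";
        System.out.println(xml);           // XML with the CDATA markup
        System.out.println(readBack(xml)); // the original HTML only
    }
}
```

The receiving side sees the original HTML text, unbalanced tags and
all, because the parser never interprets what is inside the section.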

A better approach would be to use "HTML Tidy" to convert "standard HTML"
(SGML) to XHTML (XML) before you use it.

See http://tidy.sourceforge.net/.

Phil :n)

On Thu, 2004-04-22 at 11:09, Eric VERGNAUD wrote:

> On 22/04/04 at 11:41, Phil Weighill-Smith (phil.weighill-smith at volantis.com)
> wrote:
> 
> > Can you not write XHTML instead?
> 
> Believe me, I would if I could. But I'm not the writer. The HTML is written
> by end users, stored in my app, and sent to other apps over which I have no
> control.
> 
> > A common way around browser
> > incompatibility between HTML (SGML) and XHTML (XML) is to ensure that
> > there is a space before the "/>", e.g. "<br>" is written as "<br />".
> > This works in most browsers.
> 
> But I also need to support <b>Hello</b>, where both tags must be part of the
> text value.
> 
> Currently I convert the "<" to "&lt;" and ">" to "&gt;". I copied this
> method from Adobe's XMP format, but I find it rather heavy, and I don't know
> if it's official.
> 
> Eric
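
For reference, the escaping approach described in the quoted message
relies on the entities "&lt;", "&gt;" and "&amp;", which are predefined
by XML itself, so it is standard rather than Adobe-specific. A minimal
hand-rolled sketch follows (the class and method names are hypothetical;
in practice an XML library normally performs this escaping when it
writes out text content, so doing it by hand is rarely necessary).

```java
public class EscapeSketch {
    // Escape the characters that XML treats as markup. "&" must be handled
    // first, otherwise the "&" in "&lt;" would itself be escaped again.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Reverse of escape(). "&amp;" is restored last for the same reason.
    static String unescape(String xmlText) {
        return xmlText.replace("&lt;", "<")
                      .replace("&gt;", ">")
                      .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String html = "<b>Hello</b> & goodbye";
        String escaped = escape(html);
        System.out.println(escaped);           // &lt;b&gt;Hello&lt;/b&gt; &amp; goodbye
        System.out.println(unescape(escaped)); // <b>Hello</b> & goodbye
    }
}
```

Note that "&" has to be escaped as well as "<" and ">", otherwise user
text that already contains something like "&lt;" would not survive the
round trip.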

-- 
Phil Weighill-Smith <phil.weighill-smith at volantis.com>
Volantis Systems