[jdom-interest] Internal DTD subsets

Harry Evans hevans at elite.com
Wed Jun 13 13:27:49 PDT 2001

Sorry about that.  Guess who is behind on the jdom list?

As to the questions you posted:
1. No idea how to solve the character entity issue.  I was having the same
2. I think notations should also be held in the internal subset.  It didn't
look super complicated and I think it would contribute to completeness.
3. Given the way the parsers parse this stuff, whitespace is going to be
arbitrary.  I think we have to go for semantic equivalence for the internal
subset stuff, not character for character replication.  Spacing and stuff
shouldn't matter on this.
4. According to the W3C spec (taken from
http://www.w3.org/TR/2000/REC-xml-20001006), the productions for entities
in general are:
[70]    EntityDecl::=    GEDecl | PEDecl 
[71]    GEDecl    ::=    '<!ENTITY' S Name S EntityDef S? '>' 
[72]    PEDecl    ::=    '<!ENTITY' S '%' S Name S PEDef S? '>' 
[73]    EntityDef ::=    EntityValue | (ExternalID NDataDecl?) 
[74]    PEDef     ::=    EntityValue | ExternalID 

If you look at [72], the spec explicitly requires whitespace (S) after the %,
so I would think that <!ENTITY % e "foo"> != <!ENTITY %e "foo">, but omitting
it might be a common usage thing I am unaware of.

5. I say no parameter entity expansion.  While it might be a little
shortsighted, I think the purpose of this feature is to preserve the information
that JDOM currently doesn't use in the internal subset string, to allow
documents to be parsed, written, and reparsed without problems.  I mean, we
are storing this thing in a String, so I think the less we change it from
the original document representation, the better off we are.
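
On point 4, one way to check the whitespace question is to ask a parser directly. The following is only a minimal sketch, assuming the stock JAXP parser that ships with the JDK; the class name PeDeclWhitespaceCheck and the helper parses() are made up for illustration and are not part of JDOM. It wraps a single declaration in an internal subset and reports whether the document is accepted. Going by production [72], a conforming parser should accept the spaced form and reject the unspaced one.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, for illustration only (not JDOM API).
class PeDeclWhitespaceCheck {

    /** Wraps one declaration in an internal subset and reports whether the document parses. */
    static boolean parses(String declaration) {
        String xml = "<!DOCTYPE root [" + declaration + "]><root/>";
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Keep the parser from printing to stderr; fatal errors are rethrown.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // A well-formedness violation ends up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            // Not expected for an in-memory document; surface it loudly.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with space:    " + parses("<!ENTITY % e \"foo\">"));
        System.out.println("without space: " + parses("<!ENTITY %e \"foo\">"));
    }
}
```

If the parser does reject the unspaced form, then the two declarations are not merely different: the second could never appear in a document that parsed in the first place, so the internal subset String would not need to handle it.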

I hope this helps in even a small way.  Are your changes in CVS?  I did a
checkout yesterday, and didn't see this stuff in the code.  If not, do you
think you could send me a zip or diff of the changed files, so that I can help
you with this, if only on the issues you already posted?

Harry Evans

-----Original Message-----
From: philip.nelson at omniresources.com
[mailto:philip.nelson at omniresources.com]
Sent: Tuesday, June 12, 2001 8:58 PM
To: hevans at elite.com; jdom-interest at jdom.org
Subject: RE: [jdom-interest] Internal DTD subsets

I already have this done!  Perhaps you could answer some of the questions I
posed last week on this based on what you have done so far.

As for nasty complicated test cases, go to OASIS and get the xml conformance
suite.  I went through about 75 of the James Clark tests.

-----Original Message-----
From: Harry Evans
To: jdom-interest at jdom.org
Sent: 6/12/01 7:30 PM
Subject: [jdom-interest] Internal DTD subsets

So I am finally finishing the implementation of the internal DTD subsets
that I started a while ago.  Jason has said, "Read my lips, 'No New
Objects'" in response to my diabolic plans to introduce objects to
the declarations inside an internal DTD.  

Therefore, I am looking at hanging all this wonderful information off of a
lonely String floating around inside of the DocType Object.  Right now, I
plan to store the actual declarations and newlines, but no other
goodies in this String.  Post to the list if you think it should be

If someone could come up with a few nasty complicated internal DTDs that I
can test with, that would speed up this process immensely.  Post those to
the list, or just email me.

Also, the String holding all this stuff is just called subset right now.  If
anyone can think of a better name, let me know, and I will change it.

Anyone wondering what this post is all about, just read the TODO list and
search for my name.

Harry Evans
To control your jdom-interest membership:
