[jdom-interest] fusing xml files

Mike Brenner mikeb at mitre.org
Tue Feb 3 08:16:20 PST 2004

Hi Joachim,

You can find OWL information at several places on the web, including the following.


The OWL format is an XML encoding of semantic information (a la the
"semantic web") in a form that is logically equivalent to "triples".

Each triple is effectively the subject, verb, and
object of a simple sentence. The subjects and verbs are
usually represented as URIs. Each verb represents
a static relationship between the subject and the object.
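
To make the triple idea concrete, here is a minimal sketch in Java. The
Triple and TripleStore classes are hypothetical names invented for this
illustration, not part of JDOM or of any OWL library, and the example.org
URIs are placeholders. Only the owl#equivalentClass URI used in the usage
note is a real OWL term.

```java
import java.util.ArrayList;
import java.util.List;

/** One statement: subject --verb--> object. (Hypothetical helper class.) */
final class Triple {
    final String subject;
    final String verb;
    final String object;

    Triple(String subject, String verb, String object) {
        this.subject = subject;
        this.verb = verb;
        this.object = object;
    }
}

/** A flat list of triples with one simple lookup. (Hypothetical helper class.) */
final class TripleStore {
    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String verb, String object) {
        triples.add(new Triple(subject, verb, object));
    }

    /** Returns every object linked to the given subject by the given verb. */
    List<String> objectsOf(String subject, String verb) {
        List<String> result = new ArrayList<>();
        for (Triple t : triples) {
            if (t.subject.equals(subject) && t.verb.equals(verb)) {
                result.add(t.object);
            }
        }
        return result;
    }
}
```

For example, store.add("http://example.org/a#author",
"http://www.w3.org/2002/07/owl#equivalentClass",
"http://example.org/b#writer") records that two tags from different
files mean the same thing, and objectsOf() with the same subject and verb
gets the second tag back.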

Approximate relationships, probabilistic relationships,
mappings from the literal meaning of semantics to real-world
contexts, multiple contexts, and partial contexts are not
fully representable in OWL format, but simple relationships are.

Step 1 alone is not the answer to your original question; rather, Steps 1
through 8 together answer it, in one of the two ways I know how to do it.

The other way to do it is to use a complex systems approach,
giving each tag a weight (literally) in a gravitational universe,
and orbiting the tags so that concepts that are close to each other
tend to converge. This is a little bit hard
to visualize until you actually see it, but it can help to resolve
the more difficult decisions as to what goes with what.
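
A very rough sketch of that idea in Java follows. It is a simplification
under stated assumptions, not the actual system: each tag is a point in
2D, the pull between two tags is linear in a hypothetical "affinity"
score between 0 and 1 (not true inverse-square gravity), and there is no
orbital motion. The class name TagGravity and its methods are made up
for this illustration.

```java
/** Toy attraction model: related tags drift together. (Hypothetical class.) */
final class TagGravity {

    /**
     * Moves every tag a small step toward each other tag, scaled by affinity.
     * positions[i] = {x, y}; affinity[i][j] is assumed symmetric, in [0, 1].
     */
    static void step(double[][] positions, double[][] affinity, double rate) {
        int n = positions.length;
        double[][] next = new double[n][2];
        for (int i = 0; i < n; i++) {
            double dx = 0.0;
            double dy = 0.0;
            for (int j = 0; j < n; j++) {
                if (i == j) {
                    continue;
                }
                dx += affinity[i][j] * (positions[j][0] - positions[i][0]);
                dy += affinity[i][j] * (positions[j][1] - positions[i][1]);
            }
            next[i][0] = positions[i][0] + rate * dx;
            next[i][1] = positions[i][1] + rate * dy;
        }
        // Apply all moves at once so every tag sees the same starting state.
        for (int i = 0; i < n; i++) {
            positions[i] = next[i];
        }
    }

    /** Straight-line distance between two tag positions. */
    static double distance(double[] a, double[] b) {
        return Math.hypot(a[0] - b[0], a[1] - b[1]);
    }
}
```

After calling step() repeatedly with a small rate (say 0.1), tags with
high affinity end up near each other while unrelated tags stay apart,
which is what makes the clusters visible.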

> "J. Albers" wrote:
> Any idea how this could be done? anyone? I searched the internet for
> information on the OWL format, didn't find very much...

Mike wrote:
> Step 1. Use an ontology in OWL format to determine which tags are related
> and how related they are.
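
Steps 1 through 4 of the quoted message further down could be sketched
roughly like this in Java. This is only an illustration under
assumptions: the ontology is already reduced to a matrix of relationship
weights (infinity where two tags are not directly related), and the path
sums are computed with the Floyd-Warshall all-pairs algorithm, which I am
substituting here for a hand-written tree walk. SemanticDistance and
TagPair are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** A pair of tags and the summed weight of the path between them. */
final class TagPair {
    final String a;
    final String b;
    final double distance;

    TagPair(String a, String b, double distance) {
        this.a = a;
        this.b = b;
        this.distance = distance;
    }
}

final class SemanticDistance {

    /**
     * weights[i][j] is the weight of a direct relationship between tags i and j,
     * or Double.POSITIVE_INFINITY if there is none. Returns the smallest summed
     * weight between every pair of tags (Floyd-Warshall).
     */
    static double[][] allPairs(double[][] weights) {
        int n = weights.length;
        double[][] dist = new double[n][];
        for (int i = 0; i < n; i++) {
            dist[i] = weights[i].clone();
            dist[i][i] = 0.0;
        }
        for (int k = 0; k < n; k++) {
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    if (dist[i][k] + dist[k][j] < dist[i][j]) {
                        dist[i][j] = dist[i][k] + dist[k][j];
                    }
                }
            }
        }
        return dist;
    }

    /** Lists every connected tag pair once, closest pairs first. */
    static List<TagPair> rankedPairs(String[] tags, double[][] dist) {
        List<TagPair> pairs = new ArrayList<>();
        for (int i = 0; i < tags.length; i++) {
            for (int j = i + 1; j < tags.length; j++) {
                if (!Double.isInfinite(dist[i][j])) {
                    pairs.add(new TagPair(tags[i], tags[j], dist[i][j]));
                }
            }
        }
        pairs.sort(Comparator.comparingDouble((TagPair p) -> p.distance));
        return pairs;
    }
}
```

The first entry of rankedPairs() is then the best candidate to offer the
user for merging, and the list is walked from there.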

> ----- Original Message -----
> From: "Mike Brenner" <mikeb at mitre.org>
> To: "J. Albers" <jalbers at cs.uu.nl>; <jdom-interest at jdom.org>
> Sent: Wednesday, January 28, 2004 2:01 PM
> Subject: Re: [jdom-interest] fusing xml files
> > Step 1. Use an ontology in OWL format to determine which tags are related
> > and how related they are.
> > Step 2. Weight each of the relationships (arrows) in the ontology with a
> > number.
> > Step 3. Create a semantic distance metric by walking the OWL tree, adding
> > up the weights of the branches you must travel to get from each tag to
> > each other tag.
> > Step 4. Sort the tag pairs in order of increasing semantic distance
> > (closest pairs first).
> > Step 5. Use JDOM to read the first xml file into a HashMap of HashMaps of
> > HashMaps.
> > Step 6. Same for the second xml file.
> > Step 7. Using the tag pair with the least semantic weight (closest
> > distance to each other), ask the user's permission to merge that pair.
> > If given, carry out a simple recursive merge loop (use the algorithm in
> > Dijkstra's A Discipline of Programming for the merge loop, and
> > recursively walk all the children of the HashMap of HashMaps for the
> > recursive part) to bring together those tags.
> > Step 8. Continue through all the tag pairs.
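
The recursive part of Step 7 might look roughly like the Java sketch
below. It assumes each file has already been read into nested
Map<String, Object> structures (values are either a nested map or a leaf
string), and that the tag pairs the user has approved are given as a
rename table from the second file's tag name to the first file's. It is
a plain recursive map merge, not the merge loop from Dijkstra's book, and
MapFusion is a hypothetical class name.

```java
import java.util.HashMap;
import java.util.Map;

final class MapFusion {

    /**
     * Recursively merges right into a copy of left.
     * Each key of right is first translated through renames
     * (for example "writer" -> "author"), so approved tag pairs
     * end up under one name. Nested maps are merged recursively;
     * when both sides hold a leaf, the left value is kept.
     */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> left,
                                     Map<String, Object> right,
                                     Map<String, String> renames) {
        Map<String, Object> result = new HashMap<>(left);
        for (Map.Entry<String, Object> e : right.entrySet()) {
            String key = renames.getOrDefault(e.getKey(), e.getKey());
            Object ours = result.get(key);
            Object theirs = e.getValue();
            if (theirs instanceof Map && (ours == null || ours instanceof Map)) {
                // Recurse even when left has no such key, so renames
                // are still applied inside the right-hand subtree.
                Map<String, Object> base = (ours == null)
                        ? new HashMap<String, Object>()
                        : (Map<String, Object>) ours;
                result.put(key, merge(base, (Map<String, Object>) theirs, renames));
            } else if (ours == null) {
                result.put(key, theirs);
            }
            // Otherwise there is a conflict: keep the left value.
        }
        return result;
    }
}
```

Merging {book={author=Knuth}} with {book={writer=Knuth, year=1968}} and
the rename writer -> author gives a single book entry holding author and
year. Keeping the left value on a conflict is just one possible policy;
asking the user at that point would fit the semi-automatic goal better.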

> > > "J. Albers" wrote:
> > > What I'm trying to make is an application that takes 2 XML files and
> > > tries to fuse them together semi-automatically. So the elements that
> > > are the same get fused right away, and then the rest of the elements
> > > from the files are listed, and one can select the elements that are
> > > the same but have different names in different files and fuse them.
