[jdom-interest] CDATA data (followup with more info)

Duane Morin dmorin at lear.morinfamily.com
Mon Nov 4 08:24:23 PST 2002


I've attached the actual file that's causing me grief (as "badchar.txt").
The line in question is the Subject: line. I assume the body would cause me
similar grief, but for right now I'm just throwing the body out.

I went through and took characters out of the Subject line a little at a 
time.  At various times I got IllegalData exceptions on: a7, b6, a4, ad, 
a8, a6, bf.  Note that all exceptions occurred during the load, not during 
the initial creation or the save.

I've also included a snippet of the code I was using to read the file and 
stuff it into an Element ("baddatachecker.java").  I added the 
Verifier.checkCDATASection() line, but it doesn't fail!  
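
A possible reason the Verifier check passes, sketched under the assumption
that it tests each character against the XML 1.0 "Char" production: every
value listed above (a7, b6, a4, ad, a8, a6, bf) falls inside the legal range,
so a character-level check would have nothing to object to. The names
XmlCharCheck and isXmlChar are made up for illustration and are not JDOM API.

```java
// Sketch only: a hand-written version of the XML 1.0 "Char" production.
// XmlCharCheck and isXmlChar are hypothetical names, not part of JDOM.
final class XmlCharCheck {

    /** Returns true if the code point is allowed in XML 1.0 character data. */
    static boolean isXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
            || (c >= 0x20 && c <= 0xD7FF)
            || (c >= 0xE000 && c <= 0xFFFD)
            || (c >= 0x10000 && c <= 0x10FFFF);
    }

    public static void main(String[] args) {
        // The values reported in the exceptions during load.
        int[] reported = {0xa7, 0xb6, 0xa4, 0xad, 0xa8, 0xa6, 0xbf};
        for (int c : reported) {
            System.out.println(Integer.toHexString(c) + " legal? " + isXmlChar(c));
        }
    }
}
```

If that assumption holds, the characters themselves are fine as characters,
and the failure on load would have to come from somewhere else, such as how
they are written to and read back from disk.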

I've also included my save() and load() routines.  I've done nothing to 
muck with the encoding, and calls to getEncoding() during both load and 
save tell me ISO8859_1.
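
One possible cause, offered as a guess rather than a confirmed diagnosis:
if the saved file is read back as UTF-8 (the XML default when nothing else is
declared, and assumed here to be what the outputter declares) while the
FileWriter actually wrote ISO8859_1 bytes, then single bytes such as a7 are
not valid UTF-8 and a parser would reject them on load. The sketch below uses
only the JDK to show that mismatch; EncodingMismatchDemo and decodesCleanly
are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Sketch only: shows that bytes written as ISO-8859-1 may not be valid UTF-8.
final class EncodingMismatchDemo {

    /** Returns true if the bytes decode without error in the named charset. */
    static boolean decodesCleanly(byte[] bytes, String charsetName) {
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two of the reported characters, encoded the way a Latin-1 writer would.
        byte[] written = "Subject: \u00a7\u00b6".getBytes(Charset.forName("ISO-8859-1"));
        System.out.println("reads as ISO-8859-1: " + decodesCleanly(written, "ISO-8859-1"));
        System.out.println("reads as UTF-8:      " + decodesCleanly(written, "UTF-8"));
    }
}
```

If this is what is happening, making the bytes on disk and the declared
encoding agree (both ISO-8859-1 or both UTF-8) would be the thing to try.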

The scope for this personal project is NOT to write something that is 
internationalized (just doing a part time thing for fun on the train, and 
no it's not homework :)).  I just want to identify the fact that I'm 
creating an illegal Element, and bail out.  The line in question is 
actually the only one in a 1000 line test file that fails in this way, so 
I can't see going out of my way to test every line if I can help it.
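
For the "identify and bail out" approach, a minimal sketch, assuming it is
acceptable to reject any header line containing a character outside 7-bit
ASCII. AsciiGuard and isPlainAscii are hypothetical names, and this is a
blunt filter rather than a real fix for the encoding question.

```java
// Sketch only: flags lines containing anything outside 7-bit ASCII.
// AsciiGuard and isPlainAscii are hypothetical names for illustration.
final class AsciiGuard {

    /** Returns true when every character in the line is 7-bit ASCII. */
    static boolean isPlainAscii(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPlainAscii("Subject: hello"));
        System.out.println(isPlainAscii("Subject: \u00a7\u00b6"));
    }
}
```

This does look at every header line, but it is a single pass over the
characters and could sit where the Verifier call is now.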

Any help greatly appreciated!

Duane


-------------- next part --------------
From real at h8h.com.tw Mon Oct 21 09:35:49 2002
Date: 21 Oct 2002 21:31:48 +0800
From: real at h8h.com.tw
To: dmorin at morinfamily.com
Subject: [ISO-8859-1] ?i?D?A?@?????\?????????K!!???F?@?w?~?@??~


?o?O?e?U???M?~?s?i???q?N?o???????^?H?L?k???? ?? ~ !

________________________________________________________________________________


                  [IMAGE] ?????O?z?B?~???? ~ ?O?????@?e???@??
                  [IMAGE] ?N???O?????????? ~ ?O???????\??????
                  [IMAGE] ?????W???????H?? ~ ?O??????????????


?b???????q???????A?@?V???O???]???b?_?????p?U?g???A???O?????????O?s?i???P???k???_
             ???s???U???z?????s?A???O?????N?O???X?A???O???i???P?I?C

?K?Q?W?????????A?????O?????M???o?F?????K???A???o?]?I?X?F?????????????R???P??????
?~?F???j???q?c???????i?n?A???????O???b???K?y?????????U?A?h?R?F?@?????O?????n????
                           ?~?A?????????n?????O???O?C

???????A?N?n?H?G?Q?@?@?????l?z???z???A?????????i?H?w???z???O?A???P???~?w?????z?L
?????O???I???o?g???v?????z???A???y?????z?????O???A???????~?P?A?u?????a?????O?A?N
?i?H???o???~?g???v?A?????O???A?z?L?b?????W?????O?A?????v?????A?????A?q?????~?A??
        ?b?q?l?????P?v?t?e?f???A???U?A?????H?b?a?????A?f???e?W?????K?Q?C

?????n???O?A?b???????????p?e?????z???O?H?~?A???i?]?????o?@?????????Q?a?s???W????
???~?g???v?A?b???_?a?N?????W?????z???????X?h???P???A?]???P?????N?F?z???@?f???~?A
???I?z???????H???A?N?O?o?????P?????@???^?X???H?????P?????@?P?g?????????~?z???A??
               ???F?o???H???H???p???????D?D?????N???P???????W????
                -------------------------------------->?u?W???T
                                       ?@

________________________________________________________________________________

?p?????Z???????A???Q?A???????H????  ->  (?????s?i)


-------------- next part --------------
    private Element processMessage(File f) {
	Element x = new Element("Message");
	Element h = new Element("Header");
	//	Element b = new Element("Body");
	x.setAttribute(new Attribute("src", f.getAbsolutePath()));
	x.addContent(h);
	Element line;
	try {
	    BufferedReader r = new BufferedReader(new FileReader(f));
	    boolean reading = true;
	    boolean header=true;
	    long size = 0;
	    while (reading) {
		String l = r.readLine();
		if (l==null) reading = false;
		else {
		    if (l.equals("")) header=false;
		    else {
			if (header) {
			    line = new Element("Line");
			    if (Verifier.checkCDATASection(l)==null) {
				line.addContent(new CDATA(l));
				h.addContent(line);
			    } else { logger.error("Bad data found"); return null;}
			}
			else {
			    size += l.length();
			}
		    }
		}
	    }
	    x.setAttribute("length", Long.toString(size));
	} catch (Exception e) {
	    logger.error(e);
	    e.printStackTrace();
	    x.removeContent(h);
	}
	logger.debug("Must be a good one.");
	return x;
    }
    private boolean load() {
	logger.debug("Loading mailboxrepository...");
	try {
	    File f = new File("/home/dmorin/mailboxrepository.xml");
	    FileReader r = new FileReader(f);
	    logger.debug("loading with encoding="+r.getEncoding());
	    SAXBuilder builder = new SAXBuilder();
	    indexer = builder.build(f);
	    process();
	} catch (Exception e) {
	    logger.error(e);
	    logger.debug("load failed.");
	    return false;
	}
	logger.debug("load succeeded.");
	return true;
    }
	    
    private boolean save() {
	logger.debug("Saving repository...");
	XMLOutputter output = new XMLOutputter("", true);
	try {
	    FileWriter f = new FileWriter("/home/dmorin/mailboxrepository.xml");
	    logger.debug("encoding="+f.getEncoding());
	    output.output(indexer, f);
	} catch (Exception e) {
	    logger.error(e);
	    logger.debug("Save failed.");
	    return false;
	}
	logger.debug("Save succeeded.");
	return true;
    }

