[jdom-interest] Thread questions regarding JDOM SAXBuiler?

Phill_Perryman at Mitel.COM Phill_Perryman at Mitel.COM
Tue Aug 31 09:09:54 PDT 2004


I tried the latest (non-queue) version and got the following results, which
show some quite dramatic savings on large files.

Reuse=true      size=1884       time: 595
Reuse=false     size=1884       time: 187

Reuse=true      size=21363      time: 440
Reuse=false     size=21363      time: 438

Reuse=true      size=42527      time: 703
Reuse=false     size=42527      time: 563

Reuse=true      size=318591     time: 8420
Reuse=false     size=318591     time: 5503

Reuse=true      size=743381     time: 23266
Reuse=false     size=743381     time: 13844

However, I then changed the code to use _builder in both the if and the else
branch (just to see what the natural variation would be like) and got the
results further below. Now I am really confused: with reuse = false the code
does the same work as with reuse = true, and it also instantiates a new
object, yet it still takes less time.

if (_reuse) {
        _builder.build(source);
} else {
        SAXBuilder builder = new SAXBuilder();
        _builder.build(source);
}

Reuse=true      size=1884       time: 513
Reuse=false     size=1884       time: 140

Reuse=true      size=21363      time: 406
Reuse=false     size=21363      time: 297

Reuse=true      size=42527      time: 594
Reuse=false     size=42527      time: 734

Reuse=true      size=318591     time: 7719
Reuse=false     size=318591     time: 5486

Reuse=true      size=743381     time: 22297
Reuse=false     size=743381     time: 14423
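
One way to gauge the natural variation mentioned above is to repeat each measurement several times and look at the spread instead of a single total. The following is only a minimal sketch, assuming Java 8 or later; RunSpread and its methods are hypothetical names that are not part of the test program, and the workload in main() is a placeholder for the build() loop.

```java
import java.util.Arrays;

/** Hypothetical helper: repeats a timed task and reports the spread of the timings. */
final class RunSpread {

    /** Times one execution of the task, in milliseconds. */
    static long timeOnce(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    /** Runs the task the given number of times and returns the timings, sorted ascending. */
    static long[] sample(Runnable task, int runs) {
        long[] times = new long[runs];
        for (int i = 0; i < runs; i++) {
            times[i] = timeOnce(task);
        }
        Arrays.sort(times);
        return times;
    }

    /** Median of an already sorted array (the upper median for even lengths). */
    static long median(long[] sorted) {
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; in the test program this would be the parse loop.
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) {
                sb.append(i);
            }
        };
        long[] t = sample(work, 9);
        System.out.println("min=" + t[0] + " median=" + median(t)
                + " max=" + t[t.length - 1]);
    }
}
```

If the min-to-max range of one configuration overlaps the other's, the difference between the two is probably within the noise.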

/Phill
IS Dept, Software Engineer.
phill_perryman at mitel.com
http://www.mitel.com
Tel: +44 1291 436023




Per Norrman <per.norrman at austers.se>
Sent by: jdom-interest-bounces at servlets.com
31/08/2004 07:39

 
        To:     David Wall <d.wall at computer.org>
        cc:     jdom-interest at jdom.org
        Subject:        Re: [jdom-interest] Thread questions regarding JDOM SAXBuiler?


Hi,

I meant to make the program self-contained but missed the dependency
on the concurrent jar. Here's a new version. You should run the test
in your environment to confirm the results.

Yes, documents are discarded after being built. There are many variations
you can do in a test like this. My guess is that it's String/StringBuffer
handling in SAXBuilder and/or Xerces that accounts for the results.

A typical output in my environment (P3, 850 MHz, Dell Latitude C600):

Reuse=true               size=21731              time: 5720
Reuse=false              size=21731              time: 2215

Reuse=true               size=1918               time: 200
Reuse=false              size=1918               time: 300

Reuse=true               size=21731              time: 1200
Reuse=false              size=21731              time: 2065

Reuse=true               size=43259              time: 3697
Reuse=false              size=43259              time: 2663

Reuse=true               size=324070             time: 25435
Reuse=false              size=324070             time: 22233

Reuse=true               size=756109             time: 66417
Reuse=false              size=756109             time: 53194

The first run should be disregarded; it is only used for warming up.
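
A warm-up phase like the one described above can also be built into the timing code itself, so that the discarded rounds never reach the report. This is a rough sketch, assuming Java 8 or later; WarmupTimer and measure() are hypothetical names and not part of the attached program.

```java
/** Hypothetical harness: runs untimed warm-up rounds, then times the remaining rounds. */
final class WarmupTimer {

    /**
     * Runs the task warmupRounds times without timing it, then runs it
     * timedRounds times and returns the total elapsed time in milliseconds.
     */
    static long measure(Runnable task, int warmupRounds, int timedRounds) {
        for (int i = 0; i < warmupRounds; i++) {
            task.run(); // result and timing are deliberately discarded
        }
        long totalNanos = 0;
        for (int i = 0; i < timedRounds; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000L;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a build() call.
        Runnable work = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        System.out.println("time: " + measure(work, 5, 20));
    }
}
```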

/pmn

David Wall wrote:

> Peter,
> 
> Thanks for your input.  Can you share the results you got?
> 
> Can anybody explain that behavior?  It sounds suspect.  Of course, the cost
> of creating a SAXBuilder should go down relative to the time for parsing as
> the XML file gets bigger, but the cost of construction shouldn't change much
> unless there's a memory leak in the program.  For example, are the Documents
> created from build() being destroyed?  Is it just the garbage collector
> that's entering the picture?  I know that the modern GC does well with lots
> of small objects coming and going because that's the most typical scenario
> (especially String).  But it seems odd that the construction of an object
> would change just because bigger XML files are used in the build() method.
> 
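
Whether the garbage collector is entering the picture, as asked in the quoted message, could be checked by counting collections around a timed section. The sketch below assumes a JVM that provides the java.lang.management API (Java 5 or later; the lambda in main() needs Java 8). GcProbe and its methods are hypothetical names and not part of the attached program.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical probe: reports how much GC activity happened while a task ran. */
final class GcProbe {

    /** Total collections so far across all collectors; undefined (-1) values are skipped. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated GC time in milliseconds across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the task and returns { collections, GC milliseconds } observed meanwhile. */
    static long[] gcDuring(Runnable task) {
        long countBefore = totalGcCount();
        long millisBefore = totalGcMillis();
        task.run();
        return new long[] { totalGcCount() - countBefore, totalGcMillis() - millisBefore };
    }

    public static void main(String[] args) {
        // Placeholder workload that allocates many short-lived strings.
        long[] gc = gcDuring(() -> {
            for (int i = 0; i < 2_000_000; i++) {
                String s = "day-" + i;
                if (s.isEmpty()) {
                    throw new IllegalStateException();
                }
            }
        });
        System.out.println("collections=" + gc[0] + " gcMillis=" + gc[1]);
    }
}
```

The counters are JVM-wide, so with several reader threads running they describe the whole test round and not a single build() call.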

package large;

import java.io.StringReader;
import java.text.DateFormat;
import java.text.DateFormatSymbols;
import java.util.Calendar;
import java.util.Date;

import org.jdom.Comment;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.input.SAXBuilder;
import org.jdom.output.XMLOutputter;
import org.xml.sax.InputSource;

/**
 * @author Per Norrman
 * 
 */
public class ThreadedReader {
    private boolean _reuse = true;

    private String _xml = "";

    private long _time = 0;

    public ThreadedReader(boolean reuse) {
        _reuse = reuse;
    }

    public synchronized void addTime(long elapsed) {
        _time += elapsed;
    }

    public synchronized long getTime() {
        return _time;
    }

    public void reset() {
        _time = 0;
    }

    public void process(String start, String end, int count)
            throws Exception {
        reset();
        generate(start, end);

        // create threads
        int each = count / 5;
        Thread[] thread = new Thread[5];
        for (int i = 0; i < 5; ++i) {
            thread[i] = new ReaderThread(_reuse, each);
            thread[i].start();
        }

        for (int i = 0; i < 5; ++i) {
            thread[i].join();
        }

        // report
        System.out.println("Reuse=" + _reuse + "\tsize=" + _xml.length()
                + "\ttime: " + getTime());
    }

    public void generate(String startDate, String endDate)
            throws Exception {
        DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT);
        DateFormatSymbols dfs = new DateFormatSymbols();
        String[] weekDays = dfs.getWeekdays();

        Element root = new Element("root");
        Document doc = new Document(root);
        doc.getContent().add(0,
                new Comment(" Generated: " + df.format(new Date()) + " "));

        Calendar cal = Calendar.getInstance();
        Date start = df.parse(startDate);
        Date end = df.parse(endDate);

        cal.setTime(start);
        while (cal.getTime().before(end)) {
            Element date = new Element("day");
            date.addContent(new Element("date").setText(df
                    .format(cal.getTime())));
            root.addContent(date);
            String weekDay = weekDays[cal.get(Calendar.DAY_OF_WEEK)];
            Element day = new Element("dayname").setText(weekDay);
            date.addContent(day);
            cal.add(Calendar.DATE, 1);
        }

        XMLOutputter out = new XMLOutputter();

        _xml = out.outputString(doc);

    }

    public static void test(String start, String end) throws Exception {
        System.out.println();
        new ThreadedReader(true).process(start, end, 20);
        new ThreadedReader(false).process(start, end, 20);
    }

    public static void main(String[] args) throws Exception {
        test("2000-01-01", "2001-01-01");
        test("2000-01-01", "2000-02-01");
        test("2000-01-01", "2001-01-01");
        test("2000-01-01", "2001-12-31");
        test("1990-01-01", "2004-12-31");
        test("1970-01-01", "2004-12-31");
    }

    private class ReaderThread extends Thread {
        private boolean _reuse = true;

        private int _count = 0;

        private SAXBuilder _builder = new SAXBuilder();

        public ReaderThread(boolean reuse, int count) {
            _reuse = reuse;
            _count = count;
            _builder.setReuseParser(reuse);
        }

        private void parse(InputSource source) {
            long elapsed = 0;
            try {
                elapsed = System.currentTimeMillis();
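                if (_reuse) {
                    _builder.build(source);
                } else {
                    // Variation described in the message above: a new
                    // SAXBuilder is created but left unused, and this
                    // thread's _builder does the parsing in both branches.
                    // _builder was still configured via setReuseParser(reuse)
                    // in the constructor, so the two modes are not identical.
                    SAXBuilder builder = new SAXBuilder();
                    _builder.build(source);
                }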
                elapsed = System.currentTimeMillis() - elapsed;
                addTime(elapsed);
            } catch (Exception e) {
                System.out.println(getName() + ": " + e.getMessage());
            }
        }

        public void run() {
            while (_count-- > 0) {
                parse(new InputSource(new StringReader(_xml)));
            }
        }
    }

}