[jdom-interest] AttributeList performance

Tatu Saloranta cowtowncoder at yahoo.com
Thu Apr 28 15:19:23 PDT 2005


--- Elias Ross <eross at m-qube.com> wrote:
> On Thu, 2005-04-28 at 13:21 -0700, Tatu Saloranta
> wrote:
> 
> > As to the access time itself, I'm fairly confident
> > that the HashMap solution is pretty much always
...
> > HashMap which is more heavy-weight than a simple
> > array/List.
> 
> It's pretty hard to come up with a solution that
> satisfies the
> degenerate case (Elements with lots of attributes)
> and the common case (one or two attributes.)

True.

> I suggested sorting the attributes and doing a
> binary search would be
> okay, but it seemed not terribly popular.

One problem is that unfortunately sorting and binary
searching are
not very inefficient with Java Strings (compared to
HashMap).
This based on comparison I made at one point; the
default\
HashMap implementation (JDK 1.4+) is pretty damn fast,
compared to about any other generic alternative, when
the main goal is fast
access time. If you have a String handy, and
especially if it has been
intern()ed, it is very hard to get anything faster.
Binary search on Strings
would be faster than linear search, but only
significantly so for larger
sets (larger than what HashMap's minimum).

Sorting + binary search would be a decent choice if
one wants to access
attributes in alphabetic (etc) order. Of course you
could also sort by
hash value, which would allow for bit faster access
(and 'random' ordering
when iterating over it), but still not as fast as
hash-based alternative...

> One solution would be to add a protected constructor
or factory to
> Element that would allow an alternative attribute
> list implementation to
> be specified.  Then, clients could create their own
> DOMBuilder which returns Elements with this new
attribute class. 
> This isn't exactly clean either, but might be good
enough.

And/or the default implementation would choose one
that seems most suitable; at least for documents built
from parser.
In this case it is known how many attributes there
are, although
usage patterns are not known. One could extend JDom to
use simple
"profiles" (read-only trees -> optimize a bit for fast
access don't worry
about mods; heavy modifications -> balance the needs;
read-and-cache -> optimize for long-running
access-have structs, for
long-living objects) that could choose more optimal
structures.

It's not actually all _that_ difficult to create a
specialized List
that has fast by-name access, compact presentation
(doesn't use
more memory than ArrayList with Attribute objects
inside), simple
indexed access, and that's still quick to construct,
given initial
size needed. This because while ArrayList and HashMap
are very good
general-purpose data containers, it's usually possible
to create
even better special-purpose ones for specific need...
and this is
one case where such a "ListMap" could be done.

I have some code available if someone is really
interested in pursuing
this. In its current form its even more highly
specialized (for the
needs of StAX 1.0 XMLStreamReader's attribute access,
and can
count on all names being intern()ed), but it would at
least show
one approach (kind of hybrid flat Map with contiguous
shared spill
area... yeah, I'm sure that pretty much sums it up!
:-) ) that
has performed nicely.

-+ Tatu +-


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


More information about the jdom-interest mailing list