[jdom-commits] CVS update: jdom/src/java/org/jdom

jhunter at cvs.jdom.org
Mon Dec 18 22:58:37 PST 2000

Date:	Tuesday December 19, 2000 @ 6:58
Author:	jhunter

Update of /home/cvspublic/jdom/src/java/org/jdom
In directory www.nmemonix.com:/tmp/cvs-serv1590

Modified Files:
Log Message:
Improved the Verifier.isXML*() methods to operate much faster.  Calling 
isXMLLetter on random chars in the range 0-255 shows about a 4x speedup.  
The basic logic is to change:

if (c >= 0x0041 && c <= 0x005A) return true;
if (c >= 0x0061 && c <= 0x007A) return true;
if (c >= 0x00C0 && c <= 0x00D6) return true;

to:

if (c < 0x0041) return false;  if (c <= 0x005A) return true;
if (c < 0x0061) return false;  if (c <= 0x007A) return true;
if (c < 0x00C0) return false;  if (c <= 0x00D6) return true;

Because the ranges are tested in ascending order, we short-circuit as soon 
as the checks have passed the char's value.  The old logic was fast for 
"true" results but did a slow, exhaustive search for "false" results.  
That was a problem because many calls look like:

return (isXMLLetter(c) || isXMLDigit(c));

so "false" results are not uncommon.
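
To make the difference concrete, here is a minimal sketch of both forms 
side by side.  It covers only the three ranges quoted above, not the full 
set in Verifier.java, and the class and method names (RangeCheckSketch, 
isLetterOld, isLetterNew) are made up for this illustration.  The new form 
depends on the ranges being listed in ascending order.

```java
class RangeCheckSketch {

    // Old form: every range is tested, so a "false" result
    // only comes back after the whole list has been walked.
    static boolean isLetterOld(char c) {
        if (c >= 0x0041 && c <= 0x005A) return true;
        if (c >= 0x0061 && c <= 0x007A) return true;
        if (c >= 0x00C0 && c <= 0x00D6) return true;
        return false;
    }

    // New form: ranges are in ascending order, so once c is below
    // the start of the next range it cannot match any later range.
    static boolean isLetterNew(char c) {
        if (c < 0x0041) return false;  if (c <= 0x005A) return true;
        if (c < 0x0061) return false;  if (c <= 0x007A) return true;
        if (c < 0x00C0) return false;  if (c <= 0x00D6) return true;
        return false;
    }

    public static void main(String[] args) {
        // Check that both forms agree on every possible char value.
        for (int i = 0; i <= 0xFFFF; i++) {
            char c = (char) i;
            if (isLetterOld(c) != isLetterNew(c)) {
                throw new AssertionError("mismatch at 0x" + Integer.toHexString(i));
            }
        }
        System.out.println("old and new forms agree on all chars");
    }
}
```

With only three ranges the gap is small; it grows with the length of the 
range list, since the old form pays for every range on each "false" result.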

We could get REALLY fast results by building a lookup table beforehand and 
accessing lookup[c].  Timings show that to be 2x to 12x faster still (the 
gain is larger for higher Unicode values).  The table can store int values 
whose bits represent whether that char is appropriate for various uses 
(start letter, name letter, etc).  Then every call is a straight lookup.  
To save memory we could start with an int[256] table and populate the full 
table only when a character above that range comes in, so for non-Unicode 
use it'd only eat 256 ints of memory (1K).  Timings show it takes 80ms to 
create a 64K-entry table one bit deep.  The lookup isn't our bottleneck 
right now though; charAt() and indexOf() are, so this is just a cool idea 
for the future.
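
A rough sketch of how such a table might look, under several assumptions: 
the class name CharTableSketch, the flag names LETTER and DIGIT, and the 
helper methods are all invented for this example, and the slow-path checks 
cover only a handful of ranges rather than the real XML character classes.  
Nothing like this is in Verifier.java today.

```java
final class CharTableSketch {

    // Hypothetical flag bits, one per character category.
    static final int LETTER = 1;
    static final int DIGIT  = 1 << 1;

    private static final int SMALL = 256;      // enough for chars 0-255
    private static final int FULL  = 0x10000;  // every possible char value

    // Start small; volatile so a rebuilt table is published safely.
    private static volatile int[] table = build(SMALL);

    // Slow range checks, used only while populating the table.
    // Illustrative subset of ranges, not the full XML definitions.
    private static boolean slowIsLetter(char c) {
        if (c < 0x0041) return false;  if (c <= 0x005A) return true;
        if (c < 0x0061) return false;  if (c <= 0x007A) return true;
        if (c < 0x00C0) return false;  if (c <= 0x00D6) return true;
        return false;
    }

    private static boolean slowIsDigit(char c) {
        return c >= 0x0030 && c <= 0x0039;
    }

    private static int[] build(int size) {
        int[] t = new int[size];
        for (int i = 0; i < size; i++) {
            char c = (char) i;
            if (slowIsLetter(c)) t[i] |= LETTER;
            if (slowIsDigit(c))  t[i] |= DIGIT;
        }
        return t;
    }

    // Returns the flag bits for c, growing to the full table the
    // first time a char beyond the small table is seen.
    private static int flags(char c) {
        int[] t = table;
        if (c >= t.length) {
            t = build(FULL);
            table = t;
        }
        return t[c];
    }

    static boolean isLetter(char c) {
        return (flags(c) & LETTER) != 0;
    }

    // One lookup answers a combined question such as letter-or-digit.
    static boolean isLetterOrDigit(char c) {
        return (flags(c) & (LETTER | DIGIT)) != 0;
    }

    static int tableSize() {
        return table.length;
    }

    public static void main(String[] args) {
        System.out.println("isLetter('A') = " + isLetter('A'));
        System.out.println("isLetterOrDigit('7') = " + isLetterOrDigit('7'));
        System.out.println("table size = " + tableSize());
    }
}
```

Packing the categories into bits means a call like 
isXMLLetter(c) || isXMLDigit(c) collapses into a single masked lookup, and 
the table stays at 256 ints until a higher char actually shows up.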


File: Verifier.java    	Status: Up-to-date

   Working revision:	1.17	Tue Dec 19 06:58:36 2000
   Repository revision:	1.17	/home/cvspublic/jdom/src/java/org/jdom/Verifier.java,v

   Existing Tags:
	start                    	(revision:
	jdom                     	(branch: 1.1.1)
