BigInteger performance improvements

Tue Feb 5 02:16:23 UTC 2008

Joseph D. Darcy wrote:
> Yes, this is the right group :-)  As "Java Floating-Point Czar" I also
> look after BigInteger, although I haven't had much time for proactive
> maintenance there in a while.  I think using faster, more mathematically
> sophisticated algorithms in BigInteger for large values would be a fine
> change to explore, as long as the performance on small values was
> preserved and regression tests appropriate for the new algorithms were
> developed too.

   My last step for this patch is improving performance of pow() for
small numbers, which got slightly slower for some small arguments.  (But
some arguments, like those containing powers of 2 in the base, are
*much* faster.)  The other functions like multiply() are basically
unchanged for small numbers.  The new code, as you might expect,
examines the size of the numbers and runs the old "grade-school"
algorithm on small numbers and the Karatsuba algorithm on larger numbers
(with the threshhold point being determined by benchmark and experiment.)

   As to the matter of regression tests, I would presume that Sun
already has regression tests for BigInteger to make sure it gets correct
results.  Can you provide me with these, or are they in the OpenJDK
distribution already?  I can check these to make sure that they use
numbers big enough to trigger the threshholds, and if not, extend them
in the same style.  Regression tests should hopefully be a no-op,
assuming you have some already.  I'm not adding any new functionality
and no results should change--they should just run faster, of course.
It should of course be impossible to write a regression test that
"succeeds with the patch applied, and fails without the patch" like is
requested on the OpenJDK page, unless it is time-limited.

   Of course, I could extend the regression tests to test *huge* numbers
which may or may not be desired if you want the regression tests to run
in short time.  For example, a single exponentiation that takes
milliseconds in my new version takes on the order of 15 minutes with JDK
1.6.  How long are you willing to let regression tests take?  How many
combinations of arguments do you currently test?  Do you think more are
necessary?

> I'd prefer to process changes in smaller chunks rather a huge one all at
> once; however, I may be a bit slow on reviews in the near future due to
> some other openjdk work I'm leading up

   I will be submitting a patch addressing the three functions:
multiply(), pow() and (the private) square().  These are intertwined and
it would be more work and testing for both of us to separate out the
patches and apply them in 3 phases.  The patch will definitely not be huge.

   My patches are designed to be as readable and simple as possible for
this phase.  They all build on existing functions, and eschew lots of
low-level bit-fiddling, as those types of changes are harder to
understand and debug.  I think it's best to get working algorithms with
better asymptotic efficiency, as those will vastly improve performance
for large numbers, and tune them by doing more low-level bit fiddling
later.  Even without being tuned to the nth degree, the new algorithms
are vastly faster for large numbers, and identical for small numbers.

-- 
  Alan Eliasen              |  "Furious activity is no substitute
  eliasen at mindspring.com    |    for understanding."
  http://futureboy.us/      |           --H.H. Williams