Review Request: BigInteger patch for efficient multiplication and division (#4837946)
Alan Eliasen
eliasen at mindspring.com
Sat May 4 12:13:52 UTC 2013
On 05/02/2013 06:37 PM, Brian Burkhalter wrote:
> On May 2, 2013, at 5:02 AM, Alan Bateman wrote:
>
>> On 02/05/2013 02:34, Tim Buktu wrote:
>>> :
>>>
>>> Alan is working on an improved BigInteger.toString() that should
>>> be dramatically faster for large numbers. What's the deadline for
>>> getting this in? Thanks,
>>>
>>> Tim
>> Here's the latest milestone dates:
>>
>> http://openjdk.java.net/projects/jdk8/milestones
>>
>> Given the size of the patch then it would be great to get it in as
>> early as possible. With the review effort then I assume Feature
>> Complete is too tight although I know Brian Burkhalter is anxious
>> to get it in as soon as possible. I can't comment on the
>> BigInteger.toString work to know how big this is.
>
> I think that's the best answer that can be given: as soon as
> possible. The 4837946 patch is far from simple to comprehend to say
> the least and then there is the testing so it's not going to be fast.
> If the toString() work is ready early enough to review then perhaps
> we can handle it, but I don't want to make anyone hurry to complete
> something and then say we cannot do it.
Brian:
If you're feeling overwhelmed by the magnitude of the changes to
BigInteger, I would very strongly suggest that you review it in stages.
This e-mail proposes stages for the review of this patch.
Since I first posted these patches over 5 years ago, we were asked to
make patches as small as possible so they would get reviewed more
quickly. That's why my patches focused on what was a minimal but
necessary subset: making multiply and pow work *much* faster with very
simple, understandable, well-known, hard-to-get-wrong routines and
simple optimizations that built on existing routines but gave very
significant performance increases. (Implementing Karatsuba and
Toom-Cook multiplication routines in Java is often given as a homework
problem in undergrad Algorithms courses. Understanding their
correctness requires nothing more than first-year Algebra.)
Today, I finished porting my recursive Schoenhage toString routines
that I've been using for years in my programming language "Frink" (
http://futureboy.us/frinkdocs/ ). They are drastically, stunningly
faster than the existing BigInteger.toString algorithms. I have run
these through my enormous regression tests which produce over 232 GB of
output with no differences.
I ram these (very large) regression tests which test multiplication
and pow and toString functions in BigInteger. (They produce about 232
gigabytes of output, which I compare against known good output that was
produced by previous versions of Java and by the Kaffe VM which used the
GMP library. This tells you how long I've been working on this; Kaffe
removed support for GMP over 4 years ago.)
It should be known that I would need to further increase the size of
these tests to make a significant test of the very large numbers that
trigger Schönhage-Strassen (SS) multiplication. Currently, the smallest
numbers that will trigger SS multiplication are both operands >247,000
bits, or about 7718 ints. SS multiplication is used at all times when
both operands are 1,249,000 bits or more. These are biggish numbers, and
my tests currently don't go out that far. I can vouch strongly for the
correctness of Karatsuba and Toom-Cook multiplication, though.
Keep in mind that torture-testing Schönhage-Strassen multiplication
and comparing against the runs of Java 1.7 or earlier might take a
*very* long time because the earlier versions of
BigInteger.multiply(BigInteger) are so slow in multiplications of the
size where SS multiplication is used.
I can personally vouch for having very strongly tested the parts of
multiplication and pow that I wrote, which include:
* Karatsuba multiplication
* 3-way Toom-Cook multiplication
* Karatsuba squaring
* Toom-Cook squaring
* Revised pow() function
While I have been using Tim's Schönhage-Strassen multiplication and
Burnickel-Ziegler division routines in my daily work, and have not
detected failures, I have not written explicit tests for division at
all, and my number of very large integer divisions has been small. I
have implicitly used his division routines in my base conversion
routines for large numbers, and have found no errors.
My multiplication tests were written to test general multiplication,
but also to very strongly test around the regions at which Karatsuba and
Toom-Cook have edge cases. Since I don't know the Schönhage-Strassen
algorithms, I don't know where they might have particular edge cases, so
I haven't written any specific tests for them.
Brian, would it be possible to make BigInteger.square() public? It
would make it possible for end-users to write algorithms that performed
notably faster. If not, a possible (much less optimal) optimization
would be for multiply to detect that its arguments are equal (either
using == which would be fast or .equals or .compareTo which may be much
slower if the numbers are large) and call squaring routines instead of
doing the naive multiply.
If reviewing time is limited, I might suggest performing the reviews
and integration in a few steps:
Step 1) would be integrating the simple, high-benefit, low-risk
changes that I made:
* Karatsuba multiplication
* 3-way Toom-Cook multiplication
* Karatsuba squaring
* Toom-Cook squaring
* Revised pow() function
these require the helper functions:
* getLower(int) and
* getUpper(int) for Karatsuba which do efficient slices of numbers
* getToomSlice() for Toom-Cook slicing to do efficient slicing of
numbers.
* exactDivideBy3: A function to do divisions by modular
multiplication, which is at least 17 times faster on most architectures.
Used in Toom-Cook3 multiplication.
It should be noted that the pow() function performs some arguments
(e.g. when the base is a power of 2) millions or billions of times
faster than current Java and would alone drastically improve the
performance of many programs.
Step 2) incorporating my toString changes. It's useless to work with
these very big numbers if you can't output them in reasonable time. And
right now toString() will be the bottleneck for most programs as it's
approximately O(n^2.0634) in my tests. With the improved division
routines that Tim Buktu wrote, and my recursive base conversion
algorithms, I can make these perform in O(n^1.3596) or better. This
means, as a practical matter, that printing the largest known Mersenne
number in base 10 is reduced from 18 hours to 150 seconds. (I can
provide constants to the asymptotes if desired.)
The bug for improving toString is:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4641897
I considered toString() slightly more controversial because to
perform repeated conversions efficiently it is very useful to allocate a
cache of "helper" values of 10^2^n to speed up calculations. This had
memory usage implications, synchronization implications, and questions
about conversions to different bases (say, to convert to base 5, you
also need a cache of 5^2^n, etc.) I didn't want this to delay the
acceptance of the simple and noncontroversial multiplication and pow
routines which had no external impact and didn't touch other files like
MutableBigInteger, SignedMutableBigInteger, etc.
My complete patch contains very much faster, simple Schoenhage
recursive base conversion routines for toString.
By the way, the synchronization for the new toString() routines could
be rewritten to use Future and other synchronization mechanisms. In any
case. the current mechanism will almost always be faster.
While I have been using Tim's Schönhage-Strassen multiplication and
Burnickel-Ziegler division routines in my daily work, and have not
detected failures, I have not written explicit tests for division at
all, and my number of very large integer divisions has been small. My
multiplication tests were written to test general multiplication, but
also to very strongly test around the regions at which Karatsuba and
Toom-Cook have edge cases. Since I don't know the Schönhage-Strassen
algorithms, I don't know where they might have particular edge cases, so
I haven't written any specific tests for them.
Brian, would it be possible to make BigInteger.square() public? It
would make it possible for end-users to write algorithms that performed
notably faster. If not, a possible (much less optimal) optimization
would be for multiply() to detect that its arguments are equal (either
using == which would be fast or .equals or .compareTo which may be much
slower if the numbers are large) and call squaring routines instead of
doing the naive multiply.
Step 3) would involve faster division: The Burnickel-Ziegler and
Barrett division routines that Tim wrote. These are important for
making base conversion faster, and making division reasonably fast.
(For many programs, division speed is the limiting factor. This limits
the performance of BigDecimal too.) They are complicated.
Step 4) integrating what I believe are the most tricky
routines: Schoenhage-Strassen (SS) multiplication. I respect Tim
Buktu's programming skills, but I also know that SS multiplication is
hard to get right, even for super-geniuses. The GMP team historically
released versions of SS multiplication that were wrong for a small
subset of arguments. This stuff is hard.
My latest best version of all of these routines is located at:
http://futureboy.us/temp/BigInteger.java
This passes all of my regression tests for multiply in the Karatsuba
and Toom-Cook regions, and all base conversion test. It does not test
multiplication in the Schoenhage-Strassen regions, nor does it test
large division.
--
Alan Eliasen
eliasen at mindspring.com
http://futureboy.us/
More information about the core-libs-dev
mailing list