BigInteger Burnikel-Ziegler division algorithm crossover heuristic

Sat Nov 23 12:04:57 UTC 2013

Time complexity of Knuth divide a/b=q approximately the same as
complexity of school multiplication b*q and is
length(b)*length(q) = length(b) * (length(a) - length(b)).
So the new heuristic seems reasonable for me.

On Sat, Nov 23, 2013 at 1:47 AM, Brian Burkhalter <
brian.burkhalter at oracle.com> wrote:

> Hello,
>
> We've been doing fairly extensive micro-benchmark testing of the effect of
> adjusting the various algorithm crossover thresholds in BigInteger (cf.
> https://bugs.openjdk.java.net/browse/JDK-8022181). There'll be another
> post on that particular topic in the near future. This testing did however
> expose another topic for which an issue is not yet (but likely will be) on
> file, i.e., the heuristic for determining when to use Burnikel-Ziegler
> divide and conquer division instead of "standard" division.
>
> At present the test used is
>
>         if divisor_int_length < BZ_THRESHOLD or dividend_int_length <
> BZ_THRESHOLD then
>                 use standard division
>         else
>                 use B-Z division
>
> (where BZ_THRESHOLD is currently 50).
>
> Let {N,D} represent a dividend-divisor pair with the int-length of the
> former being N and that of the latter being D. The following cases were
> among those initially tested: {T,2T}, {T,T+2}, {T,T}, {T+2,T}, and {2T,T}
> where T is the BZ threshold. In all but the last case, where the dividend
> is twice the length of the divisor, a (sometimes serious) performance
> regression was observed.
>
> The implementation was reviewed along with the original paper [1]. The
> implementation is accurate with respect to the paper as had already been
> verified some months ago [2]. It was observed however that something
> interesting happens for the case where the int-length of the dividend A is
> equal to that of the divisor B. For this situation, the algorithm parameter
> t (step 5 in algorithm 3 in the paper) becomes 2. This causes A to be
> virtually padded out to a 2T-length value [0,A]. The net result is that the
> BZ algorithm ends up being applied in effect to a pair {2T,T} which is
> significantly slower than using the standard algorithm on the pair {T,T}.
> This appears to be primarily due to the algorithm overhead rather than a
> length dependent problem. For the {T,T} case, the following recursive call
> sequence of significant operations ensues:
>
> divide
>         divideBZ
>                 divide2n1n
>                         divide3n2n
>                                 divide2n1n
>                                         divideKnuth
>                                 multiply
>                                 subtract
>                         divide3n2n
>                                 divide2n1n
>                                         divideKnuth
>                                 multiply
>                                 subtract
>                 add
>
> Despite the fact that some of the method calls reduce to simple operations
> like dividing zero by non-zero, dividing something by itself, or
> multiplying something by zero or one, the algorithm overhead causes the
> performance to be worse than had this sequence been used:
>
> divide
>         divideKnuth
>
> Through micro-benchmark testing it was discovered that, given the original
> BZ_THRESHOLD criterion is met, for some value C the BZ algorithm will start
> to outperform the standard algorithm for the case {T+C,T}, i.e., the
> dividend is C ints longer than the divisor. Testing supports the value of
> the offset C being independent of the value of the BZ threshold. This
> suggests that the following heuristic would be more appropriate for this
> algorithm:
>
>         if divisor_int_length < BZ_THRESHOLD or dividend_int_length -
> divisor_int_length < BZ_OFFSET_THRESHOLD then
>                 use standard division
>         else
>                 use B-Z division
>
> Any comments on this alternative heuristic would be appreciated as I would
> not be in the least bit surprised to have missed some fine point altogether.
>
> Thanks,
>
> Brian
>
> [1] Christoph Burnikel and Joachim Ziegler, "Fast Recursive Division",
> Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022,
> http://www.mpi-sb.mpg.de/~ziegler/TechRep.ps.gz.
> [2]
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2013-July/018922.html