RFR: 8334755: Asymptotically faster implementation of square root algorithm [v32]

Raffaello Giulietti rgiulietti at openjdk.org
Thu Jul 18 16:09:36 UTC 2024


On Thu, 18 Jul 2024 15:55:08 GMT, fabioromano1 <duke at openjdk.org> wrote:

>> src/java.base/share/classes/java/math/MutableBigInteger.java line 2032:
>> 
>>> 2030:                 s = s1;
>>> 2031:         }
>>> 2032:         return s;
>> 
>> Experimentally, the following seems a bit faster.
>> In some cases, it avoids a full multiplication, some updates, and has one less test.
>> I hope it is correct as well ;-)
>> 
>> Suggestion:
>> 
>>         long s = (long) Math.sqrt(x >= 0 ? x : x + 0x1p64);
>>         long s2 = s * s;  // overflows iff s = 2^32
>>         if (s == 1L << 32
>>                 || Long.compareUnsigned(x, s2) < 0) {
>>             return s - 1;
>>         }
>>         if (Long.compareUnsigned(x, s2 + 2 * s) <= 0) {  // x < (s + 1)^2
>>             return s;
>>         }
>>         return s + 1;
>
>> Experimentally, the following seems a bit faster. In some cases, it avoids a full multiplication, some updates, and has one less test. I hope it is correct as well ;-)
> 
> It's a nice code, but I'm afraid that if `s == LONG_MASK` and `Long.compareUnsigned(x, s * s) >= 0`, the overflow check is unavoidable...

I guess you are concerned about an overflow in `s2 + 2 * s`?

If s = 2^32 - 1, then s2 = 2^64 - 2·2^32 + 1 and 2s = 2·2^32 - 2, without overflows.
Thus, s2 + 2s = 2^64 - 1, without overflows.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19710#discussion_r1683119623


More information about the core-libs-dev mailing list