RFR: 8310813: Simplify and modernize equals, hashCode, and compareTo for BigInteger [v3]

Pavel Rappo prappo at openjdk.org
Thu Jul 20 15:32:09 UTC 2023


On Mon, 10 Jul 2023 17:07:28 GMT, Joe Darcy <darcy at openjdk.org> wrote:

>> LGTM; I assume the comment about aarch64 perf was a 10% improvement.
>
>> Here's some explanation for the recent commits I've added since @RogerRiggs reviewed this PR.
>> 
>>     1. Since `BigInteger.hashCode()` is unspecified, we can change it. Again: I think that the proposed implementation is no worse in hashing quality than the current one; if you disagree, please voice your concerns.
>> 
>> 
>> FWIW, we can keep the existing `BigInteger.hashCode()` values whilst still improving the implementation, using JDK-internal support:
>> 
>> ```
>> @Override
>> public int hashCode() {
>>     return ArraysSupport.vectorizedHashCode(mag, 0, mag.length, 1,
>>             ArraysSupport.T_INT) * signum;
>> }
>> ```
>> 
>> AFAIU, such an implementation would always yield exactly the same values the current (i.e. mainline) implementation does. But that could be a little slower than the original proposal, especially for a smaller BigInteger.
>> 
>> The key thing that allows to keep the current hash-code values in the above implementation is that `1` argument, which is the initial hash-code value, which cannot be specified in `Arrays.hashCode`. Unfortunately, we don't have mid-layer methods in between Arrays.hashCode and ArraysSupport.vectorizedHashCode like that of Arrays.mismatch and ArraysSupport.vectorizedMismatch. It's either all the null check but short-circuits or unconditional vectorization but the initial value. I wonder if we could consider `ArraysSupport.hashCode($type[] array, int fromIndex, int length, int initialValue)` overloads, which could be useful beyond BigInteger, as I've already seen in JDK. Contributors to ArraysSupport, @PaulSandoz, @ChrisHegarty, @cl4es, @stuart-marks; thoughts?
>> 
>>     2. Maybe surprisingly, but we don't have microbenchmarks for BigInteger's equals, hashCode, and compareTo. While I don't know how often people use the former two methods, I reckon, the latter method is essential. Anyway, I added benchmarks to cover all three. Note: benchmark for hashCode shows only its performance, not its hashing quality. Again: if you think the current version is in any way worse than the mainline version, please voice your concerns.
>> 
>> 
>> AFAIK, the biggest consumer of BigInteger in JDK is security area. So, I assume a good way to judge this change is to run security benchmarks to make sure they haven't slipped.
> 
> I would be in favor of keeping the current hash behavior for now, even if it is updated in subsequent work. Users have had more than two decades to become, accidentally, reliant on the hash algorithms.

@jddarcy would you be okay with the change and the test in [d087a59](https://github.com/openjdk/jdk/pull/14630/commits/d087a592c8996e90a5754f023c2eac3a194d007f)?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14630#issuecomment-1644134205


More information about the core-libs-dev mailing list