RFR: 8306075: Micro-optimize Enum.hashCode [v2]
Viktor Klang
duke at openjdk.org
Tue Apr 18 19:26:46 UTC 2023
On Tue, 18 Apr 2023 09:02:34 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>>> > > Why isn't `Enum::hashCode` simply doing `return ordinal;`?
>>> >
>>> >
>>> > See https://bugs.openjdk.org/browse/JDK-8050217
>>>
>>> Thanks! If there are apps where `Enum::hashCode` is performance sensitive, then run-to-run stability may be a stronger argument than @stuart-marks et al. argue it is here, though there might be some unspoken arguments about how it might affect iteration order, etc.
>>
>> Identity hash code might theoretically be stable from run to run, but it's not like it's very stable in the first place. The hash code is generated using some thread-local state, so the value depends on how many times an identity hash code has been generated on the same thread before the hash code of the enum constant is generated (when `hashCode` is first called).
>>
>> So, using ordinal as a hash code would, AFAICS, be no more unstable than the current implementation: if the code is altered (by re-ordering the constants, or by calling `hashCode` for the first time at a different point in the application), the identity hash code can change. However, using the ordinal has the benefit of simplicity as well. (if anything, using the ordinal would be _more_ stable, since it is only affected by the order of the constants, rather than by other code that runs before the identity hash code is generated).
>
>> (if anything, using the ordinal would be _more_ stable, since it is only affected by the order of the constants, rather than by other code that runs before the identity hash code is generated).
>
> Here lies the major original concern -- that I share -- about hooking up `ordinal` as `hashCode`, I think. It opens up enums to the same sort of algorithmic collisions, perhaps even deliberate ones, that we had with `String`s. Arguably, switching the enum hashcode from IHC to ordinal is a step backwards.
>
> Of course, as John says above, this might change when Valhalla arrives, but in that case we would give up the collision-resistant enum hashcodes for the clear benefit of enums being flattened: it would be the "one step back, two steps forward" kind of deal. Today, however, there seems to be no reason to give up this resistance without getting anything in return. The "return" might have been better performance, but this PR eliminates that part as well.
@shipilev @rose00 Yes, ideally the hashCode of an enum should depend only on type + ordinal, as I presume it is desirable that enums of different types bearing the same ordinal values don't collide when mixed in a `HashMap<Enum,Object>`.
Given an "on the wire" representation, the information regarding the type would need to be either out-of-band or transmitted as part of the value.
In that case, would it not be "better" to base the hash code on `getClass().getName().hashCode()`, which is specified and stable, mixed with the ordinal to improve distribution over the 32-bit space?
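For illustration only, a minimal sketch of what such a type + ordinal hash could look like. Everything named here (`EnumHashSketch`, `stableHash`, `Color`, `Shape`) is hypothetical and not part of the JDK or of this PR. The sketch assumes `getDeclaringClass()` rather than `getClass()`, so that constants with bodies hash like their siblings, and it borrows the MurmurHash3 32-bit finalizer as one possible mixing step; neither choice comes from the discussion above.

```java
public class EnumHashSketch {
    // Two hypothetical enum types whose constants share ordinals 0..2.
    enum Color { RED, GREEN, BLUE }
    enum Shape { CIRCLE, SQUARE, TRIANGLE }

    // Hypothetical helper: derives a hash from the declaring type's name
    // and the ordinal only, so it does not depend on identity hash codes.
    static int stableHash(Enum<?> e) {
        // String.hashCode() is specified, so this part is reproducible.
        int h = e.getDeclaringClass().getName().hashCode();
        // Fold in the ordinal.
        h = 31 * h + e.ordinal();
        // Assumed mixing step (MurmurHash3 32-bit finalizer) to spread
        // nearby inputs across the 32-bit space.
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        // Print the derived hash for every constant of both types.
        for (Color c : Color.values()) {
            System.out.println(c + " -> " + stableHash(c));
        }
        for (Shape s : Shape.values()) {
            System.out.println(s + " -> " + stableHash(s));
        }
    }
}
```

With plain `ordinal()` as the hash code, `Color.RED` and `Shape.CIRCLE` would both hash to 0; in this sketch they differ because the type name is part of the input. Whether the extra mixing is worth its cost on a hot path is a separate question.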
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13491#issuecomment-1513682849