RFR: 8306075: Micro-optimize Enum.hashCode

Aleksey Shipilev shade at openjdk.org
Mon Apr 17 13:03:35 UTC 2023


On Mon, 17 Apr 2023 12:14:55 GMT, Pavel Rappo <prappo at openjdk.org> wrote:

>> Improve the speed of Enum.hashCode by caching the identity hashcode on first use. I've seen an application where Enum.hashCode is a hot path, and this is fairly simple speedup. The memory overhead is low; in enums with no extra fields there is already a 4-byte space due to alignment so this new field can slot in 'for free'. In other cases, the singleton nature of enum values means that the number of total instances is typically very low, so a small per-instance overhead is not a concern.
>> 
>> Please see more discussion/explanation in the [original enhancement request](https://bugs.openjdk.org/browse/JDK-8306075).
>> 
>> ### Benchmark
>> 
>> 
>> 
>> Before:
>> 
>> Benchmark              Mode  Cnt  Score   Error  Units
>> # Intel Cascade lake
>> EnumHashCode.constant  avgt   15  1.602 ± 0.011  ns/op
>> EnumHashCode.field     avgt   15  1.681 ± 0.014  ns/op
>> # Arm Neoverse N1
>> EnumHashCode.constant  avgt   15  1.642 ± 0.033  ns/op
>> EnumHashCode.field     avgt   15  1.717 ± 0.059  ns/op
>> 
>> 
>> 
>> After:
>> 
>> Benchmark              Mode  Cnt  Score   Error  Units
>> # Intel Cascade lake
>> EnumHashCode.constant  avgt   15  0.479 ± 0.001  ns/op
>> EnumHashCode.field     avgt   15  0.799 ± 0.002  ns/op
>> # Arm Neoverse N1
>> EnumHashCode.constant  avgt   15  0.802 ± 0.002  ns/op
>> EnumHashCode.field     avgt   15  1.059 ± 0.056  ns/op
>> 
>> 
>> Using `-prof perfasm` on the benchmark, we can compare the generated code for x86_64:
>> 
>> Before:
>> 
>> │ 0x00007fae4868dd17:   lea    (%r12,%r10,8),%rsi           ;*getfield e {reexecute=0 rethrow=0 return_oop=0}
>> │                                                           ; - org.sample.EnumHashCode::field at 1 (line 24)
>> │                                                           ; - org.sample.jmh_generated.EnumHashCode_field_jmhTest::field_avgt_jmhStub at 17 (line 186)
>> │ 0x00007fae4868dd1b:   mov    (%rsi),%r10
>> │ 0x00007fae4868dd1e:   mov    %r10,%r11
>> │ 0x00007fae4868dd21:   and    $0x3,%r11
>> │ 0x00007fae4868dd25:   cmp    $0x1,%r11
>> │ 0x00007fae4868dd29:   jne    0x00007fae4868dcc6
>> │ 0x00007fae4868dd2b:   shr    $0x8,%r10
>> │ 0x00007fae4868dd2f:   mov    %r10d,%eax
>> │ 0x00007fae4868dd32:   and    $0x7fffffff,%eax
>> │ 0x00007fae4868dd37:   test   %eax,%eax
>> │ 0x00007fae4868dd39:   je     0x00007fae4868dcc6           ;*invokespecial hashCode {reexecute=0 rethrow=0 return_oop=0}
>> │                                                           ; - java.lang.Enum::hashCode at 1 (line 175)
>> 
>> 
>> This is the normal Object.hashCode intrinsic, which involves reading the object header, extracting the hash code and handling two slow-path cases (displaced object header, hash not initialized).
>> 
>> After:
>> 
>> 
>>   │  0x00007f550068e3b4:   mov    0x10(%r12,%r10,8),%r8d  <-- read the hash field
>>   │  0x00007f550068e3b9:   test   %r8d,%r8d               <-- if (hash == 0)
>>   │  0x00007f550068e3bc:   je     0x00007f550068e413      <-- slow init path, only taken on first use
>> 
>> 
>> Thanks @shipilev for help with the implementation and interpreting the generated code.
>
> src/java.base/share/classes/java/lang/Enum.java line 175:
> 
>> 173:      *
>> 174:      * @implNote Once initialized, the field value does not change.
>> 175:      * Hotspot's identity hash code generation also never returns zero
> 
>> Hotspot's identity hash code generation also never returns zero
> 
> Isn't that behavior VM-specific? Also, where is it documented?

Yes, it is implementation-specific, that is why it says "Hotspot's identity hash code". The relevant code blocks are https://github.com/openjdk/jdk/blob/cc60f2ff3f16bdb04917e09cb87f09bd544f1f8b/src/hotspot/share/oops/markWord.hpp#L231-L233 (property) and https://github.com/openjdk/jdk/blob/cc60f2ff3f16bdb04917e09cb87f09bd544f1f8b/src/hotspot/share/runtime/synchronizer.cpp#L826-L827 (invariant).

> src/java.base/share/classes/java/lang/Enum.java line 191:
> 
>> 189:         int hc = hash;
>> 190:         if (hc == 0) {
>> 191:             hc = hash = System.identityHashCode(this);
> 
> Why not `hc = hash = super.hashCode()`?

Saves the virtual call, makes for a simpler intrinsic path (no need to handle NPE and fold away, for example), since we know the super-class is already `java.lang.Object`. Unless I miss something else here...

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13491#discussion_r1168654218
PR Review Comment: https://git.openjdk.org/jdk/pull/13491#discussion_r1168656186


More information about the core-libs-dev mailing list