[16] RFR(XS) 8076985: Allocation path: biased locking + compressed oops code quality

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Jul 2 00:23:56 UTC 2020


https://bugs.openjdk.java.net/browse/JDK-8076985
https://cr.openjdk.java.net/~kvn/8076985/webrev.00/

First, this is about how C2 generates code for *constant* class pointers.

A little history here. When we implemented compressed oops and class pointers we had PermGen, and classes were Java
objects. We used the same decoding/encoding code for oops and classes, with the same register holding the heap base
address. It was profitable to decode a constant class once and reuse it [1]. We also benefited greatly on SPARC, since
decoding a 32-bit constant required 4 instructions instead of up to 7 instructions to load a 64-bit constant.

Now compressed class decoding is different and always takes 2 instructions on x86 [2] if either the base or the shift is not 0.

As a result we generated 3 instructions to get the full class pointer from a compressed 32-bit constant (example for
base = 0, shift = 3):

movl $0x200001d5,%r11d
movabs $0x0,%r10
lea (%r10,%r11,8),%r10
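
For illustration, the arithmetic that these three instructions perform can be sketched as below. This is only a
model of the decode step (base + (narrow << shift)), not HotSpot code; the function name decode_klass is hypothetical,
and the base and shift values are the ones from the example above.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep the result within a 64-bit register


def decode_klass(narrow: int, base: int, shift: int) -> int:
    """Model of compressed class pointer decoding: base + (narrow << shift)."""
    return (base + (narrow << shift)) & MASK64


# The 32-bit constant loaded by the movl above, with base = 0 and shift = 3.
narrow_klass = 0x200001D5
full_klass = decode_klass(narrow_klass, base=0, shift=3)
print(hex(full_klass))
```

The decoded value is a compile-time constant, so nothing in this computation actually needs to happen at run time.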

Also, when we store a compressed class pointer into a new object's header we no longer use a register on x86 -
keeping it in a register does not help now:

movl $0x200001d5,0x8(%rax)

Aleksey suggested using a single instruction to load the full 64-bit class pointer:

movq $0x100000EA8,%r10

It frees one register and uses 10 bytes instead of up to 20 bytes of code on x86.
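
A rough tally of where the 20 and 10 bytes come from, as a sketch. The per-instruction lengths are my assumption
based on the standard x86-64 encodings (REX prefix, opcode, ModRM/SIB, immediate); they are not taken from the webrev.

```python
# Encoded lengths in bytes, assumed from the standard x86-64 encodings.
decode_sequence = {
    "movl $0x200001d5,%r11d": 6,   # REX + opcode + 32-bit immediate
    "movabs $0x0,%r10": 10,        # REX.W + opcode + 64-bit immediate
    "lea (%r10,%r11,8),%r10": 4,   # REX.W + opcode + ModRM + SIB
}
single_load = {
    "movq $0x100000EA8,%r10": 10,  # REX.W + opcode + 64-bit immediate
}

decode_bytes = sum(decode_sequence.values())
single_bytes = sum(single_load.values())
print(decode_bytes, single_bytes)
```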

In JDK 9 SAP contributed a nice change [3] that provides a choice of when to use 'compressed class pointer + decoding'
or a full '64-bit constant class pointer'. It significantly simplified the changes for this RFE.


I ran performance testing but did not see a difference - we don't use biased locking now and as a result we don't need
to load the prototype header from the class. But there are other places where we need to load from the class.

Thanks,
Vladimir K

[1] https://bugs.openjdk.java.net/browse/JDK-6709093
http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/44abbb0d4c18
To generate instead of this:
      movl    R11, narrowoop: precise klass jnt/scimark2/Random: 0x000000000083b418:Constant:exact * # compressed ptr
      movq    R10, precise klass jnt/scimark2/Random: 0x000000000083b418:Constant:exact * # ptr
      movq    R10, [R10 + #176 (32-bit)]      # ptr
      movq    [RAX], R10      # ptr
      movl    [RAX + #8 (8-bit)], R11 # compressed ptr

generate this:
      movl    R11, narrowoop: precise klass Point: 0x00000000007ad518:Constant:exact * # compressed ptr
      movq    R10, [R12 + R11 << 3 + #176] (compressed oop addressing) # ptr
      movq    [R8], R10       # ptr
      movl    [R8 + #8 (8-bit)], R11  # compressed ptr

[2] http://hg.openjdk.java.net/jdk/jdk/file/c5ed42533134/src/hotspot/cpu/x86/macroAssembler_x86.cpp#l4609

[3] https://bugs.openjdk.java.net/browse/JDK-8155729

