RFR: 8362564: hotspot/jtreg/compiler/c2/TestLWLockingCodeGen.java fails on static JDK on x86_64 with AVX instruction extensions [v2]
Jiangli Zhou
jiangli at openjdk.org
Wed Jul 30 17:40:01 UTC 2025
On Tue, 29 Jul 2025 22:14:11 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>>> @vnkozlov Do you refer to the PR testing? It's https://github.com/openjdk/jdk/pull/26395/checks, thanks.
>>
>> Yes, that. I did not realize that GitHub removes it from main PR tab after integration.
>>
>> Good, it was tested. I think GitHub actions use AVX512 machines. But I remember that static build testing was excluded.
>>
>> First, I am fine with fix you pushed. But it is strange.
>>
>> There should be no difference in stubs sizes running with regular JDK on avx512 machine and static JDK. I don't see how we don't trigger this issue in our testing - we regularly running on AVX512 machines.
>>
>> Do you have additional code somewhere in macro assembler for static build? Is it possible to compare assembler code?
>> I see you posted sizes in bug report. Can you compare code for `StubRoutines::forward_exception()` which has smallest size to see what is difference.
>
>> Can you compare code for StubRoutines::forward_exception() which has smallest size to see what is difference.
>
> Please run both, regular and static JDK, on AVX512 machine.
@vnkozlov I collected some data on `forward_exception_stub` from both regular and static JDK `fastdebug` binaries. For this stub, the generated instructions are essentially unaffected by the AVX extensions: when running the same JDK binary, the stub size is the same on machines with and without AVX.
The `forward_exception_stub` size does differ between the regular JDK and the static JDK. Looking at the generated instructions in the stub, most of the difference comes from call instructions. On the static JDK, the target address is first moved into a register and the call then goes through that register, e.g.:
9c: 48 b8 00 3b 77 6c b3 movabs rax,0x55b36c773b00
a3: 55 00 00
a6: ff d0 call rax
On the regular JDK, the target can be encoded within the `call` instruction itself as a 32-bit relative displacement, e.g.:
95: e8 46 46 e1 1a call 0x1ae146e0
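The two forms differ because `call rel32` (opcode `e8`) carries only a signed 32-bit displacement measured from the end of the instruction, so it can reach targets within roughly 2 GiB of the call site in either direction. A minimal Python sketch of that reachability check follows. `fits_rel32` is a hypothetical helper, not a HotSpot function, and the call-site addresses are assumed values picked to resemble the address ranges in the dumps.

```python
CALL_REL32_LEN = 5  # e8 + 4-byte displacement


def fits_rel32(call_site: int, target: int) -> bool:
    """Return True if `target` is reachable by a 5-byte `call rel32` at `call_site`."""
    # The displacement is relative to the address of the next instruction.
    disp = target - (call_site + CALL_REL32_LEN)
    return -(1 << 31) <= disp < (1 << 31)


# Assumed call-site addresses (hypothetical, for illustration only).
static_site = 0x7F2466001600
regular_site = 0x7F6C22001600

# Static JDK: the target is the immediate loaded by `movabs rax` in the dump.
static_target = 0x561AE261CCD0

# Regular JDK: reconstruct a target from the rel32 bytes `44 5a 90 1b`.
regular_target = regular_site + CALL_REL32_LEN + 0x1B905A44

static_ok = fits_rel32(static_site, static_target)
regular_ok = fits_rel32(regular_site, regular_target)
print(static_ok, regular_ok)
```

Under these assumptions the static-JDK target lies far outside the reachable window, which would be consistent with the `movabs` plus `call rax` sequence shown above.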
I recall that @rasbold suspected this might be the cause of the differences between the static and regular JDKs when we briefly discussed the issue last week.
Here is the disassembled `forward_exception_stub` for each build:
**static JDK**
0: 49 83 7f 08 00 cmp QWORD PTR [r15+0x8],0x0
5: 0f 85 79 00 00 00 jne 0x84
b: 48 81 ec 80 00 00 00 sub rsp,0x80
12: 48 89 44 24 78 mov QWORD PTR [rsp+0x78],rax
17: 48 89 4c 24 70 mov QWORD PTR [rsp+0x70],rcx
1c: 48 89 54 24 68 mov QWORD PTR [rsp+0x68],rdx
21: 48 89 5c 24 60 mov QWORD PTR [rsp+0x60],rbx
26: 48 89 6c 24 50 mov QWORD PTR [rsp+0x50],rbp
2b: 48 89 74 24 48 mov QWORD PTR [rsp+0x48],rsi
30: 48 89 7c 24 40 mov QWORD PTR [rsp+0x40],rdi
35: 4c 89 44 24 38 mov QWORD PTR [rsp+0x38],r8
3a: 4c 89 4c 24 30 mov QWORD PTR [rsp+0x30],r9
3f: 4c 89 54 24 28 mov QWORD PTR [rsp+0x28],r10
44: 4c 89 5c 24 20 mov QWORD PTR [rsp+0x20],r11
49: 4c 89 64 24 18 mov QWORD PTR [rsp+0x18],r12
4e: 4c 89 6c 24 10 mov QWORD PTR [rsp+0x10],r13
53: 4c 89 74 24 08 mov QWORD PTR [rsp+0x8],r14
58: 4c 89 3c 24 mov QWORD PTR [rsp],r15
5c: 48 be eb 15 00 66 24 movabs rsi,0x7f24660015eb
63: 7f 00 00
66: 48 8b d4 mov rdx,rsp
69: 48 bf 20 95 e3 e0 1a movabs rdi,0x561ae0e39520
70: 56 00 00
73: 48 83 e4 f0 and rsp,0xfffffffffffffff0
77: 48 b8 d0 cc 61 e2 1a movabs rax,0x561ae261ccd0
7e: 56 00 00
81: ff d0 call rax
83: f4 hlt
84: 48 8b 3c 24 mov rdi,QWORD PTR [rsp]
88: 48 8b f7 mov rsi,rdi
8b: 49 8b ff mov rdi,r15
8e: 40 f6 c4 0f test spl,0xf
92: 0f 84 19 00 00 00 je 0xb1
98: 48 83 ec 08 sub rsp,0x8
9c: 48 b8 00 bb 84 e2 1a movabs rax,0x561ae284bb00
a3: 56 00 00
a6: ff d0 call rax
a8: 48 83 c4 08 add rsp,0x8
ac: e9 0c 00 00 00 jmp 0xbd
b1: 48 b8 00 bb 84 e2 1a movabs rax,0x561ae284bb00
b8: 56 00 00
bb: ff d0 call rax
bd: 48 8b d8 mov rbx,rax
c0: 5a pop rdx
c1: 49 8b 47 08 mov rax,QWORD PTR [r15+0x8]
c5: 49 c7 47 08 00 00 00 mov QWORD PTR [r15+0x8],0x0
cc: 00
cd: 48 85 c0 test rax,rax
d0: 0f 85 79 00 00 00 jne 0x14f
d6: 48 81 ec 80 00 00 00 sub rsp,0x80
dd: 48 89 44 24 78 mov QWORD PTR [rsp+0x78],rax
e2: 48 89 4c 24 70 mov QWORD PTR [rsp+0x70],rcx
e7: 48 89 54 24 68 mov QWORD PTR [rsp+0x68],rdx
ec: 48 89 5c 24 60 mov QWORD PTR [rsp+0x60],rbx
f1: 48 89 6c 24 50 mov QWORD PTR [rsp+0x50],rbp
f6: 48 89 74 24 48 mov QWORD PTR [rsp+0x48],rsi
fb: 48 89 7c 24 40 mov QWORD PTR [rsp+0x40],rdi
100: 4c 89 44 24 38 mov QWORD PTR [rsp+0x38],r8
105: 4c 89 4c 24 30 mov QWORD PTR [rsp+0x30],r9
10a: 4c 89 54 24 28 mov QWORD PTR [rsp+0x28],r10
10f: 4c 89 5c 24 20 mov QWORD PTR [rsp+0x20],r11
114: 4c 89 64 24 18 mov QWORD PTR [rsp+0x18],r12
119: 4c 89 6c 24 10 mov QWORD PTR [rsp+0x10],r13
11e: 4c 89 74 24 08 mov QWORD PTR [rsp+0x8],r14
123: 4c 89 3c 24 mov QWORD PTR [rsp],r15
127: 48 be b6 16 00 66 24 movabs rsi,0x7f24660016b6
12e: 7f 00 00
131: 48 8b d4 mov rdx,rsp
134: 48 bf 21 81 ee e0 1a movabs rdi,0x561ae0ee8121
13b: 56 00 00
13e: 48 83 e4 f0 and rsp,0xfffffffffffffff0
142: 48 b8 d0 cc 61 e2 1a movabs rax,0x561ae261ccd0
149: 56 00 00
14c: ff d0 call rax
14e: f4 hlt
14f: ff .byte 0xff
**regular JDK**
0: 49 83 7f 08 00 cmp QWORD PTR [r15+0x8],0x0
5: 0f 85 72 00 00 00 jne 0x7d
b: 48 81 ec 80 00 00 00 sub rsp,0x80
12: 48 89 44 24 78 mov QWORD PTR [rsp+0x78],rax
17: 48 89 4c 24 70 mov QWORD PTR [rsp+0x70],rcx
1c: 48 89 54 24 68 mov QWORD PTR [rsp+0x68],rdx
21: 48 89 5c 24 60 mov QWORD PTR [rsp+0x60],rbx
26: 48 89 6c 24 50 mov QWORD PTR [rsp+0x50],rbp
2b: 48 89 74 24 48 mov QWORD PTR [rsp+0x48],rsi
30: 48 89 7c 24 40 mov QWORD PTR [rsp+0x40],rdi
35: 4c 89 44 24 38 mov QWORD PTR [rsp+0x38],r8
3a: 4c 89 4c 24 30 mov QWORD PTR [rsp+0x30],r9
3f: 4c 89 54 24 28 mov QWORD PTR [rsp+0x28],r10
44: 4c 89 5c 24 20 mov QWORD PTR [rsp+0x20],r11
49: 4c 89 64 24 18 mov QWORD PTR [rsp+0x18],r12
4e: 4c 89 6c 24 10 mov QWORD PTR [rsp+0x10],r13
53: 4c 89 74 24 08 mov QWORD PTR [rsp+0x8],r14
58: 4c 89 3c 24 mov QWORD PTR [rsp],r15
5c: 48 be eb 15 00 22 6c movabs rsi,0x7f6c220015eb
63: 7f 00 00
66: 48 8b d4 mov rdx,rsp
69: 48 bf 0e 71 3c 3c 6c movabs rdi,0x7f6c3c3c710e
70: 7f 00 00
73: 48 83 e4 f0 and rsp,0xfffffffffffffff0
77: e8 44 5a 90 1b call 0x1b905ac0
7c: f4 hlt
7d: 48 8b 3c 24 mov rdi,QWORD PTR [rsp]
81: 48 8b f7 mov rsi,rdi
84: 49 8b ff mov rdi,r15
87: 40 f6 c4 0f test spl,0xf
8b: 0f 84 12 00 00 00 je 0xa3
91: 48 83 ec 08 sub rsp,0x8
95: e8 46 46 c1 1b call 0x1bc146e0
9a: 48 83 c4 08 add rsp,0x8
9e: e9 05 00 00 00 jmp 0xa8
a3: e8 38 46 c1 1b call 0x1bc146e0
a8: 48 8b d8 mov rbx,rax
ab: 5a pop rdx
ac: 49 8b 47 08 mov rax,QWORD PTR [r15+0x8]
b0: 49 c7 47 08 00 00 00 mov QWORD PTR [r15+0x8],0x0
b7: 00
b8: 48 85 c0 test rax,rax
bb: 0f 85 72 00 00 00 jne 0x133
c1: 48 81 ec 80 00 00 00 sub rsp,0x80
c8: 48 89 44 24 78 mov QWORD PTR [rsp+0x78],rax
cd: 48 89 4c 24 70 mov QWORD PTR [rsp+0x70],rcx
d2: 48 89 54 24 68 mov QWORD PTR [rsp+0x68],rdx
d7: 48 89 5c 24 60 mov QWORD PTR [rsp+0x60],rbx
dc: 48 89 6c 24 50 mov QWORD PTR [rsp+0x50],rbp
e1: 48 89 74 24 48 mov QWORD PTR [rsp+0x48],rsi
e6: 48 89 7c 24 40 mov QWORD PTR [rsp+0x40],rdi
eb: 4c 89 44 24 38 mov QWORD PTR [rsp+0x38],r8
f0: 4c 89 4c 24 30 mov QWORD PTR [rsp+0x30],r9
f5: 4c 89 54 24 28 mov QWORD PTR [rsp+0x28],r10
fa: 4c 89 5c 24 20 mov QWORD PTR [rsp+0x20],r11
ff: 4c 89 64 24 18 mov QWORD PTR [rsp+0x18],r12
104: 4c 89 6c 24 10 mov QWORD PTR [rsp+0x10],r13
109: 4c 89 74 24 08 mov QWORD PTR [rsp+0x8],r14
10e: 4c 89 3c 24 mov QWORD PTR [rsp],r15
112: 48 be a1 16 00 22 6c movabs rsi,0x7f6c220016a1
119: 7f 00 00
11c: 48 8b d4 mov rdx,rsp
11f: 48 bf 8b da 46 3c 6c movabs rdi,0x7f6c3c46da8b
126: 7f 00 00
129: 48 83 e4 f0 and rsp,0xfffffffffffffff0
12d: e8 8e 59 90 1b call 0x1b905ac0
132: f4 hlt
133: ff .byte 0xff
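The size gap between the two listings can be checked against the call encodings. The sketch below assumes the listings end at the offsets shown (0x14f and 0x133) and counts the four call sites visible in each; the variable names are illustrative.

```python
# Instruction lengths read off the listings above.
MOVABS_RAX_IMM64 = 10  # 48 b8 + 8-byte immediate
CALL_REG = 2           # ff d0
CALL_REL32 = 5         # e8 + 4-byte displacement

# Final offsets of the two listings and the number of call sites in each.
static_end = 0x14F
regular_end = 0x133
call_sites = 4

# Extra bytes the static JDK spends per call site.
per_call_overhead = MOVABS_RAX_IMM64 + CALL_REG - CALL_REL32
expected_gap = per_call_overhead * call_sites
observed_gap = static_end - regular_end

print(per_call_overhead, expected_gap, observed_gap)
```

In these two listings the 7 extra bytes per call site, multiplied over four call sites, account for the whole 28-byte gap.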
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26395#issuecomment-3137262447
More information about the hotspot-dev
mailing list