Does UseFPUForSpilling intend to spill a GPR to XMM?
Liu, Xin
xxinliu at amazon.com
Tue Oct 19 01:04:30 UTC 2021
Hello, Experts,
We recently encountered an ABI issue involving XMM0. Even though it has
only happened on jdk8u Windows x86 (32-bit) so far, it raises my concern
about 'UseFPUForSpilling' on both x86 and x86_64. Does UseFPUForSpilling
intend to spill GPRs to XMM registers? I came here from JDK-6978249, but
I can't find the original webrev.
I don't think XMM registers are preserved across function calls in any
ABI, with one exception: XMM6-XMM15 are callee-saved in the Microsoft x64
calling convention.
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#callercallee-saved-registers
If nobody saves XMM0~3, why does C2's register allocator use them as
spill destinations? The same seems possible on AMD64, though it should be
rarer than on x86 because AMD64 has more GPRs.
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1388
For example, this is what I have seen on Windows x86:
07c B10: # B26 B11 <- B9 Freq: 0.999992
07c movdl XMM0, EBX # spill (xliu: EBX stores an oop)
080 MOV [ESP + #20],EDI
084 MOV EBX,[ECX + #136] ! Field: java/awt/Component.componentOrientation
08a TEST EBX,EBX
08c Je B26 P=0.000001 C=-1.000000
----
170 B16: # B40 B17 <- B15 B24 Freq: 0.999991
170 movdl EBX, XMM0 # spill
174 MOV EBX,[EBX + #56] ! Field: javax/swing/plaf/basic/BasicSliderUI.thumbRect (xliu: segmentation fault here, EBX=0)
177 MOV EDI,[EBX + #16] # int ! Field: java/awt/Rectangle.width
17a NullCheck EBX
Between B10 and B16, control reaches convD2I_reg_reg, which calls
SharedRuntime::d2l in the slow path. XMM0 is used for the return value,
so it is clobbered.
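To make the sequence concrete, here is a toy model of what I think is
happening. It is only a sketch, not HotSpot code: the register names
mirror the listing above, while the oop value, the d2l_stub helper and
the value it leaves in XMM0 are made up for illustration.

```python
# Toy model of the spill/clobber/reload sequence. Not HotSpot code.
# Assumption: the runtime call may overwrite XMM0 and preserves EBX.

def d2l_stub(regs):
    """Hypothetical stand-in for the slow-path runtime call."""
    regs["XMM0"] = 0  # callee leaves its own value in XMM0 (illustrative)
    return regs


def run(call_runtime):
    """Return the value reloaded into EBX at B16."""
    regs = {"EBX": 0x1234ABCD, "XMM0": 0}  # EBX holds an oop (made-up value)
    regs["XMM0"] = regs["EBX"]             # B10: movdl XMM0, EBX  (spill)
    regs["EBX"] = 0                        # EBX is reused for other values
    if call_runtime:
        d2l_stub(regs)                     # slow path taken between B10 and B16
    regs["EBX"] = regs["XMM0"]             # B16: movdl EBX, XMM0  (reload)
    return regs["EBX"]
```

In this model run(False) returns the original oop, while run(True)
returns 0, which matches the EBX=0 I see at the faulting load.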
So the "FPU" in 'UseFPUForSpilling' doesn't just refer to the Intel x87
unit but also includes the SSE/AVX units, right? If it does intend to use
X/Y/ZMM registers as spill destinations, is there any mechanism to
protect them from runtime calls on x86/x86_64? In both the System V and
Microsoft ABIs, XMM0~3 can be used for both argument passing and return
values, right?
I see other runtime code stubs such as arraycopy and crc32 use XMM
registers.
Btw, there's a hidden mechanism somewhere that prevents C2 from working
on x86_32. Even though I have a server VM build of jdk17, it only uses C1
to compile methods. Could you tell me what it is?
Thanks,
--lx
More information about the hotspot-compiler-dev
mailing list