Does UseFPUForSpilling intend to spill a GPR to XMM?

Tue Oct 19 01:04:30 UTC 2021

Hello, Experts,

We recently encountered an ABI issue involving XMM0. Even though it has
only happened on jdk8u Windows x86 (32-bit) so far, it raises my concern
about 'UseFPUForSpilling' on both x86 and x86_64. Does UseFPUForSpilling
intend to spill GPRs to XMM registers? I came here from JDK-6978249, but
I can't find the original webrev.

As far as I know, no ABI preserves all XMM registers across function
calls. Only XMM6-XMM15 are callee-saved, and only on Microsoft x64.
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#callercallee-saved-registers
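
To make the rule above concrete, here is a minimal C++ sketch of the
callee-saved check for the two 64-bit ABIs. The names Abi and
xmm_is_callee_saved are hypothetical and are not HotSpot code. It only
encodes what the ABI documents state: Microsoft x64 preserves
XMM6-XMM15, and System V AMD64 preserves no XMM register.

```cpp
// Hypothetical helper for illustration only; not part of HotSpot.
enum class Abi { SysVAmd64, MicrosoftX64 };

// Returns true if the callee must preserve XMM<n> under the given ABI.
// Microsoft x64: XMM6-XMM15 are nonvolatile.
// System V AMD64: every XMM register is caller-saved.
constexpr bool xmm_is_callee_saved(Abi abi, int n) {
    return abi == Abi::MicrosoftX64 && n >= 6 && n <= 15;
}

// XMM0 is never preserved, which is the register the spill below uses.
static_assert(!xmm_is_callee_saved(Abi::MicrosoftX64, 0), "XMM0 is volatile");
static_assert(!xmm_is_callee_saved(Abi::SysVAmd64, 0), "XMM0 is volatile");
static_assert(xmm_is_callee_saved(Abi::MicrosoftX64, 6), "XMM6 is preserved");
```

Under this model, a value kept in XMM0~5 across a call has no
protection from the callee on either ABI.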

If nobody saves XMM0~3, how can C2's register allocator use them as
spill destinations? It seems possible on AMD64 as well, though it should
be rarer than on x86 because AMD64 has more GPRs.
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1388

For example, this is what I have seen on Windows x86:
07c   B10: #	B26 B11 <- B9  Freq: 0.999992
07c   	movdl   XMM0, EBX	# spill (xliu: EBX store an OOP)
080   	MOV    [ESP + #20],EDI
084   	MOV    EBX,[ECX + #136] ! Field: java/awt/Component.componentOrientation
08a   	TEST   EBX,EBX
08c   	Je    B26  P=0.000001 C=-1.000000

----

170   B16: #	B40 B17 <- B15 B24  Freq: 0.999991
170   	movdl   EBX, XMM0	# spill
174   	MOV    EBX,[EBX + #56] ! Field: javax/swing/plaf/basic/BasicSliderUI.thumbRect (xliu: segfault here, EBX=0)
177   	MOV    EDI,[EBX + #16]	# int ! Field: java/awt/Rectangle.width
17a   	NullCheck EBX

Between B10 and B16, control goes through convD2I_reg_reg, which calls
SharedRuntime::d2l in the slow path. XMM0 is used for the return value of
that call, so the value spilled there is clobbered.
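
The sequence can be modelled with a toy register file. This is only a
sketch under assumptions: Regs, runtime_call_returning_in_xmm0 and
spill_call_reload are hypothetical names, the stand-in call writes 0 to
XMM0 to mirror the EBX=0 seen at the crash, and it touches only XMM0,
whereas a real callee may clobber any caller-saved register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy register file for illustration; not HotSpot code.
struct Regs {
    std::uint32_t ebx = 0;
    std::array<std::uint64_t, 8> xmm{};  // low 64 bits of XMM0..XMM7
};

// Stand-in for a runtime call that returns its result in XMM0.
// Assumption: the result bits happen to be 0.
inline void runtime_call_returning_in_xmm0(Regs& r) {
    r.xmm[0] = 0;  // previous contents of XMM0 are lost
}

// Spill EBX into XMM<slot>, reuse EBX, make the call, then reload EBX.
inline std::uint32_t spill_call_reload(std::uint32_t oop, std::size_t slot) {
    Regs r;
    r.ebx = oop;
    r.xmm[slot] = r.ebx;                // movdl XMMn, EBX   # spill
    r.ebx = 0xdeadbeef;                 // EBX is reused for another value
    runtime_call_returning_in_xmm0(r);  // slow path between B10 and B16
    r.ebx = static_cast<std::uint32_t>(r.xmm[slot]);  // movdl EBX, XMMn
    return r.ebx;
}
```

In this model spill_call_reload(oop, 0) returns 0 instead of the oop,
which matches the null EBX at offset 174, while a slot the toy call does
not write returns the original value.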

So the 'FPU' in 'UseFPUForSpilling' refers not just to the Intel x87 unit
but also to the SSE/AVX units, right? If it does intend to use X/Y/ZMM
registers as spill destinations, is there any mechanism to protect them
from runtime calls on x86/x86_64? In both the System V and Microsoft
ABIs, XMM0~3 can be used for both argument passing and return values,
right?

I see other runtime code stubs such as arraycopy and crc32 use XMM
registers.

Btw, there seems to be a hidden mechanism somewhere that prevents C2 from
working on x86_32. Even though I have a server VM of jdk17, it only uses
C1 to compile methods. Could you tell me what it is?

Thanks,
--lx