Re: Does UseFPUForSpilling intend to spill a GPR to XMM?

Nils Eliasson nils.eliasson at oracle.com
Wed Oct 20 08:48:27 UTC 2021


Hi Xin,

On 2021-10-20 05:38, Liu, Xin wrote:
> hi, Nils,
>
> Thanks for the explanation. Sorry, I have too many questions. Let me
> focus on XMM registers. Yes, we have a reproducer from a customer, but
> it's a GUI application. I am still trying to reduce it to a single test.
>
> About the crash, I've filed a JBS issue, JDK-8275565, with a
> description of why XMM0 may be clobbered on x86_32. I took a closer look
> at push_FPU_state() today.
>
> void MacroAssembler::push_FPU_state() {
>    subptr(rsp, FPUStateSizeInWords * wordSize);
> #ifndef _LP64
>    fnsave(Address(rsp, 0));
>    fwait();
> #else
>    fxsave(Address(rsp, 0));
> #endif // LP64
> }
>
> On a 32-bit x86 system, it doesn't save the XMM registers, because
> fnsave only saves the x87 state. I see that you save them in
> RegisterSaver::save_live_registers() using extra steps. However,
> generate_d2i_wrapper() only uses push/pop_FPU_state(). Am I right here?
It looks like that's the case - yes.

32 bit x86 isn't an active platform from my perspective - so I don't 
know if this is an old or a new problem.

Regards,
Nils Eliasson


>
>
> thanks,
> --lx
>
>
>
> On 10/19/21 12:52 AM, Nils Eliasson wrote:
>>
>> Hi Liu,
>>
>> You had a lot of questions - I'll try to answer a few of them:
>>
>> Yes, UseFPUForSpilling uses XMM registers in the C2 compiler. On 64-bit
>> x86, SSE2 is the minimum requirement. x87 has never been used for spilling.
>>
>> C2 should work fine on 32-bit x86. Have a look at
>> "os::is_server_class_machine()" - if the machine you are running on
>> doesn't meet some criteria, a quick-only mode (C1) will be used. There
>> is a flag - "NeverActAsServerClassMachine" - you can use to control
>> this behavior.
>>
>> C2 handles spilling to XMM registers as part of normal register
>> allocation - so any clobbering should be handled. I don't recall the
>> Windows 32-bit calling convention - I need to refresh my memory on that.
>> Can you reproduce a failure?
>>
>> Regards,
>> Nils Eliasson
>>
>>
>> On 2021-10-19 03:04, Liu, Xin wrote:
>>> Hello, Experts,
>>>
>>> We recently encountered an ABI issue with XMM0. Even though it only
>>> happens on jdk8u Windows x86 (32-bit) so far, it raises my concern about
>>> 'UseFPUForSpilling' for both x86 and x86_64. Does UseFPUForSpilling
>>> intend to spill GPRs to XMM registers? I came here from JDK-6978249, but
>>> I can't find the original webrev.
>>>
>>> I don't think XMM registers are generally preserved across function
>>> calls in any ABI. Only XMM6-XMM15 are saved by the callee on Microsoft
>>> x64 platforms.
>>> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#callercallee-saved-registers
>>>
>>> If nobody saves XMM0~3, how come C2 register allocation uses them as
>>> spill destinations? It seems possible on AMD64 as well, but it's rarer
>>> than on x86, given that AMD64 has more GPRs.
>>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1388
>>>
>>> E.g., this is what I have seen on Windows x86.
>>> 07c   B10: #  B26 B11 <- B9  Freq: 0.999992
>>> 07c           movdl   XMM0, EBX       # spill (xliu: EBX stores an oop)
>>> 080           MOV    [ESP + #20],EDI
>>> 084           MOV    EBX,[ECX + #136] ! Field: java/awt/Component.componentOrientation
>>> 08a           TEST   EBX,EBX
>>> 08c           Je    B26  P=0.000001 C=-1.000000
>>>
>>> ----
>>>
>>> 170   B16: #  B40 B17 <- B15 B24  Freq: 0.999991
>>> 170           movdl   EBX, XMM0       # spill
>>> 174           MOV    EBX,[EBX + #56] ! Field: javax/swing/plaf/basic/BasicSliderUI.thumbRect (xliu: segmentation fault here, EBX=0)
>>> 177           MOV    EDI,[EBX + #16]  # int ! Field: java/awt/Rectangle.width
>>> 17a           NullCheck EBX
>>>
>>> Between B10 and B16, control goes to convD2I_reg_reg, which calls
>>> SharedRuntime::d2l in the slow path. XMM0 is used for the return value,
>>> so it is clobbered.
>>>
>>> So the FPU of 'UseFPUForSpilling' doesn't just refer to the Intel x87
>>> but also includes the SSE/AVX units, right? If it does intend to use
>>> X/Y/ZMM registers as spill destinations, is there any mechanism to
>>> protect them from runtime calls on x86/x86_64? In both the System V and
>>> Microsoft ABIs, XMM0~3 could be used for both argument passing and
>>> return values, right?
>>>
>>> I see other runtime code stubs such as arraycopy and crc32 use XMM
>>> registers.
>>>
>>> Btw, there's a hidden mechanism somewhere which prevents C2 from working
>>> on x86_32. Even though I have a server VM of jdk17, it only uses C1 to
>>> compile methods. Could you tell me what it is?
>>>
>>> Thanks,
>>> --lx
>>>
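
The failure mode discussed in this thread can be sketched with a toy model. This is not HotSpot code: ToyRegs, toy_runtime_call, call_saving_x87_only, call_saving_xmm and spill_call_reload are all hypothetical names made up for illustration, and the "registers" are plain integers. It only assumes what the thread describes: a GPR is spilled to XMM0, a runtime call in between is free to overwrite XMM0, and the wrapper around that call may or may not save XMM0.

```cpp
#include <cstdint>

// Toy register file (hypothetical): one GPR and the low 32 bits of XMM0.
struct ToyRegs {
    std::uint32_t ebx  = 0;
    std::uint32_t xmm0 = 0;
};

// Stand-in for a runtime call that is free to overwrite XMM0.
inline void toy_runtime_call(ToyRegs& r) { r.xmm0 = 0; }

// Models a wrapper that saves only x87 state (as fnsave does):
// XMM0 is not preserved across the call.
inline void call_saving_x87_only(ToyRegs& r) { toy_runtime_call(r); }

// Models a wrapper that additionally saves and restores XMM0.
inline void call_saving_xmm(ToyRegs& r) {
    const std::uint32_t saved = r.xmm0;
    toy_runtime_call(r);
    r.xmm0 = saved;
}

// Spill EBX to XMM0, reuse EBX, make the call, then reload EBX.
template <typename Call>
std::uint32_t spill_call_reload(std::uint32_t oop, Call call) {
    ToyRegs r;
    r.ebx  = oop;
    r.xmm0 = r.ebx;  // movdl XMM0, EBX  # spill
    r.ebx  = 136;    // EBX is reused for an unrelated value
    call(r);         // slow path goes through the wrapper
    r.ebx  = r.xmm0; // movdl EBX, XMM0  # reload
    return r.ebx;
}
```

In this model, spill_call_reload(0x1234, call_saving_x87_only) hands back 0, which mirrors the EBX=0 crash above, while spill_call_reload(0x1234, call_saving_xmm) hands back the original value.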



More information about the hotspot-compiler-dev mailing list