Re: Does UseFPUForSpilling intend to spill a GPR to XMM?

Nils Eliasson nils.eliasson at oracle.com
Wed Oct 20 09:30:59 UTC 2021



On 2021-10-20 10:48, Nils Eliasson wrote:
> Hi Xin,
>
> On 2021-10-20 05:38, Liu, Xin wrote:
>> hi, Nils,
>>
>> Thanks for the explanation. Sorry, I have too many questions. Let me focus
>> on XMM registers. Yes, we have a reproducer from a customer, but it's a GUI
>> application. I am still trying to reduce it to a single test.
>>
>> About the crash: I've filed a JBS issue, JDK-8275565, with a description
>> of why XMM0 may be clobbered on x86_32. I took a closer look at
>> push_FPU_state() today.
>>
>> void MacroAssembler::push_FPU_state() {
>>    subptr(rsp, FPUStateSizeInWords * wordSize);
>> #ifndef _LP64
>>    fnsave(Address(rsp, 0));
>>    fwait();
>> #else
>>    fxsave(Address(rsp, 0));
>> #endif // LP64
>> }
>>
>> On a 32-bit x86 system, it doesn't save the XMM registers because fnsave
>> only saves the x87 state. I see that you save them in
>> RegisterSaver::save_live_registers() using extra steps. However,
>> generate_d2i_wrapper() only uses push/pop_FPU_state(). Am I right here?

generate_d2i_wrapper is only used on 32-bit x86, and only for d2i and d2l.
They are called as leaf calls, so there will be no safepoint. In that case
XMM registers can only be clobbered if d2i or d2l themselves use XMM
registers. They are native functions, so they might have been compiled
with a different compiler for each release. I suggest you disassemble
them and look for XMM usage.
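
A minimal sketch of that check, assuming you already have a textual
disassembly of d2i/d2l from whatever disassembler you use. The helper name
find_xmm_lines is hypothetical and not part of HotSpot; it only filters
text and does not disassemble anything itself:

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a HotSpot API): returns every line of a textual
// disassembly that mentions an XMM register, in either Intel ("XMM0") or
// AT&T ("%xmm0") spelling, matched case-insensitively.
std::vector<std::string> find_xmm_lines(const std::string& disassembly) {
    static const std::regex xmm_reg(R"(\bxmm[0-9]+\b)", std::regex::icase);
    std::vector<std::string> hits;
    std::istringstream in(disassembly);
    std::string line;
    while (std::getline(in, line)) {
        if (std::regex_search(line, xmm_reg)) {
            hits.push_back(line);  // keep the whole instruction line for review
        }
    }
    return hits;
}
```

An empty result for both functions would suggest the clobbering comes from
somewhere else; any hit is only a starting point for reading the code.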

Regards,
Nils


> It looks like that's the case - yes.
>
> 32-bit x86 isn't an active platform from my perspective, so I don't
> know if this is an old or a new problem.
>
> Regards,
> Nils Eliasson
>
>
>>
>>
>> thanks,
>> --lx
>>
>>
>>
>> On 10/19/21 12:52 AM, Nils Eliasson wrote:
>>>
>>>
>>> Hi Liu,
>>>
>>> You had a lot of questions - I'll try to answer a few of them:
>>>
>>> Yes, UseFPUForSpilling uses XMM registers in the C2 compiler. On 64-bit
>>> x86, SSE2 is the minimum requirement. x87 has never been used for
>>> spilling.
>>>
>>> C2 should work fine on 32-bit x86. Have a look at
>>> "os::is_server_class_machine()": if the machine you are running on
>>> doesn't meet some criteria, a quick-only mode (C1) will be used. There
>>> is a flag, "NeverActAsServerClassMachine", that you can use to control
>>> this behavior.
>>>
>>> C2 handles spilling to XMM registers as part of normal register
>>> allocation, so any clobbering should be handled. I don't recall the
>>> Windows 32-bit calling convention; I need to refresh my memory on that.
>>> Can you reproduce a failure?
>>>
>>> Regards,
>>> Nils Eliasson
>>>
>>>
>>> On 2021-10-19 03:04, Liu, Xin wrote:
>>>> Hello, Experts,
>>>>
>>>> We recently encountered an ABI issue with XMM0. Even though it has only
>>>> happened on jdk8u Windows x86 (32-bit) so far, it raises my concern about
>>>> 'UseFPUForSpilling' for both x86 and x86_64. Does UseFPUForSpilling
>>>> intend to spill GPRs to XMM registers? I came to this from JDK-6978249,
>>>> but I can't find the original webrev.
>>>>
>>>> I don't think XMM registers are generally saved across function calls
>>>> in any ABI. Only XMM6-XMM15 are saved by the callee, and only on
>>>> Microsoft x64 platforms.
>>>> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#callercallee-saved-registers 
>>>>
>>>>
>>>> If nobody saves XMM0~3, how come C2 register allocation uses them as
>>>> spill destinations? It seems possible on AMD64 as well, but it's rarer
>>>> than on x86 given that AMD64 has more GPRs.
>>>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1388 
>>>>
>>>>
>>>> eg. This is what I have seen on Windows x86.
>>>> 07c   B10: #  B26 B11 <- B9  Freq: 0.999992
>>>> 07c           movdl   XMM0, EBX       # spill (xliu: EBX store an OOP)
>>>> 080           MOV    [ESP + #20],EDI
>>>> 084           MOV    EBX,[ECX + #136] ! Field: java/awt/Component.componentOrientation
>>>> 08a           TEST   EBX,EBX
>>>> 08c           Je    B26  P=0.000001 C=-1.000000
>>>>
>>>> ----
>>>>
>>>> 170   B16: #  B40 B17 <- B15 B24  Freq: 0.999991
>>>> 170           movdl   EBX, XMM0       # spill
>>>> 174           MOV    EBX,[EBX + #56] ! Field: javax/swing/plaf/basic/BasicSliderUI.thumbRect (xliu: segmentation fault here, EBX=0)
>>>> 177           MOV    EDI,[EBX + #16]  # int ! Field: java/awt/Rectangle.width
>>>> 17a           NullCheck EBX
>>>>
>>>> Between BB10 and BB16, control goes to convD2I_reg_reg, which calls
>>>> SharedRuntime::d2l in the slow path. XMM0 is used for the return value,
>>>> so it is clobbered.
>>>>
>>>> So the 'FPU' in 'UseFPUForSpilling' doesn't refer just to Intel x87 but
>>>> also includes the SSE/AVX units, right? If it does intend to use X/Y/ZMM
>>>> registers as spill destinations, is there any mechanism to protect
>>>> them from runtime calls on x86/x86_64? In both the System V and Microsoft
>>>> ABIs, XMM0~3 can be used for both argument passing and return values,
>>>> right?
>>>>
>>>> I see other runtime code stubs such as arraycopy and crc32 use XMM
>>>> registers.
>>>>
>>>> Btw, there's a hidden mechanism somewhere that prevents C2 from working
>>>> on x86_32. Even though I have a server VM of JDK 17, it only uses C1 to
>>>> compile methods. Could you tell me what it is?
>>>>
>>>> Thanks,
>>>> --lx
>>>>
>



More information about the hotspot-compiler-dev mailing list