Re: Does UseFPUForSpilling intend to spill a GPR to XMM?

Liu, Xin xxinliu at amazon.com
Wed Oct 20 03:38:59 UTC 2021


hi, Nils,

Thanks for the explanation. Sorry for asking so many questions; let me
focus on XMM registers. Yes, we have a reproducer from a customer, but
it's a GUI application. I am still trying to reduce it to a single test.

About the crash, I've filed a JBS issue, JDK-8275565, with a description
of why XMM0 may be clobbered on x86_32. I took a closer look at
push_FPU_state() today.

void MacroAssembler::push_FPU_state() {
  subptr(rsp, FPUStateSizeInWords * wordSize);
#ifndef _LP64
  fnsave(Address(rsp, 0));
  fwait();
#else
  fxsave(Address(rsp, 0));
#endif // LP64
}

On 32-bit x86, it doesn't save the XMM registers, because fnsave only
stores the x87 state. I see that you save them in
RegisterSaver::save_live_registers() using extra steps. However,
generate_d2i_wrapper() only uses push/pop_FPU_state(). Am I right here?
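
To make the concern concrete, here is a toy model of the sequence I
suspect. It is only a sketch and not HotSpot code: the Cpu struct, the
SaveKind enum, runtime_call() and spill_call_reload() are names I made up
for illustration, and the callee writing 0 into XMM0 is an assumption. The
underlying fact is that fnsave writes only the 108-byte x87 image, while
fxsave writes a 512-byte image that also contains the XMM registers.

```cpp
#include <cassert>
#include <cstdint>

// Toy register file; hypothetical, for illustration only.
struct Cpu {
  uint32_t ebx  = 0;    // GPR that holds an oop
  uint32_t xmm0 = 0;    // low 32 bits of XMM0, used as the spill slot
  double   st0  = 0.0;  // top of the x87 stack
};

// fnsave-like: x87 state only. fxsave-like: x87 state plus XMM registers.
enum class SaveKind { FnsaveLike, FxsaveLike };

struct SaveArea {
  SaveKind kind;
  double   st0;
  uint32_t xmm0;  // only captured for FxsaveLike
};

// Models push_FPU_state(): what gets captured depends on the instruction.
SaveArea push_fpu_state(const Cpu& cpu, SaveKind kind) {
  uint32_t saved_xmm0 = (kind == SaveKind::FxsaveLike) ? cpu.xmm0 : 0;
  return SaveArea{kind, cpu.st0, saved_xmm0};
}

// Models pop_FPU_state(): restores only what was captured.
void pop_fpu_state(Cpu& cpu, const SaveArea& area) {
  cpu.st0 = area.st0;
  if (area.kind == SaveKind::FxsaveLike) {
    cpu.xmm0 = area.xmm0;
  }
}

// Hypothetical runtime callee that overwrites XMM0 (assumed value: 0).
void runtime_call(Cpu& cpu) { cpu.xmm0 = 0; }

// Spill EBX to XMM0, make a runtime call wrapped in push/pop, then reload.
uint32_t spill_call_reload(SaveKind kind) {
  Cpu cpu;
  cpu.ebx  = 0xCAFEBABEu;   // pretend this is the oop
  cpu.xmm0 = cpu.ebx;       // movdl XMM0, EBX   # spill
  cpu.ebx  = 0;             // EBX is reused for something else
  SaveArea area = push_fpu_state(cpu, kind);
  runtime_call(cpu);
  pop_fpu_state(cpu, area);
  cpu.ebx = cpu.xmm0;       // movdl EBX, XMM0   # reload
  return cpu.ebx;
}
```

In this model, spill_call_reload(SaveKind::FnsaveLike) hands back the
clobbered value, while spill_call_reload(SaveKind::FxsaveLike) hands back
the original oop. Whether the real d2i wrapper behaves like the first case
is exactly what I am asking.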


thanks,
--lx



On 10/19/21 12:52 AM, Nils Eliasson wrote:
> 
> Hi Liu,
> 
> You had a lot of questions - I'll try to answer a few of them:
> 
> Yes, UseFPUForSpilling uses XMM registers in the C2 compiler. On 64-bit
> x86, SSE2 is the minimum requirement. x87 has never been used for spilling.
> 
> C2 should work fine on 32-bit x86. Have a look at
> "os::is_server_class_machine()" - if the machine you are running on
> doesn't meet some criteria, a quick-only mode (C1) will be used. There
> is a flag - "NeverActAsServerClassMachine" - that you can use to control
> this behavior.
> 
> C2 handles spilling to XMM registers as a part of normal register
> allocation - so any clobbering should be handled. I don't recall the
> Windows 32-bit calling convention - I need to refresh my memory on that.
> Can you reproduce a failure?
> 
> Regards,
> Nils Eliasson
> 
> 
> On 2021-10-19 03:04, Liu, Xin wrote:
>> Hello, Experts,
>>
>> We recently encountered an ABI issue with XMM0. Even though it has only
>> happened on jdk8u Windows x86 (32-bit) so far, it raises my concern about
>> 'UseFPUForSpilling' for both x86 and x86_64. Does UseFPUForSpilling
>> intend to spill GPRs to XMM registers? I came from JDK-6978249, but I
>> can't find the original webrev.
>>
>> I don't think XMM registers are saved across function calls in any ABIs.
>> Only XMM6-XMM15 are saved by the callee on Microsoft platforms.
>> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-160#callercallee-saved-registers
>>
>> If nobody saves XMM0~3, how come C2 register allocation uses them as
>> spill destinations? It seems possible on AMD64 as well, but it is rarer
>> than on x86 given that AMD64 has more GPRs.
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86_64.ad#L1388
>>
>> E.g., this is what I have seen on Windows x86.
>> 07c   B10: #  B26 B11 <- B9  Freq: 0.999992
>> 07c           movdl   XMM0, EBX       # spill (xliu: EBX store an OOP)
>> 080           MOV    [ESP + #20],EDI
>> 084           MOV    EBX,[ECX + #136] ! Field: java/awt/Component.componentOrientation
>> 08a           TEST   EBX,EBX
>> 08c           Je    B26  P=0.000001 C=-1.000000
>>
>> ----
>>
>> 170   B16: #  B40 B17 <- B15 B24  Freq: 0.999991
>> 170           movdl   EBX, XMM0       # spill
>> 174           MOV    EBX,[EBX + #56] ! Field: javax/swing/plaf/basic/BasicSliderUI.thumbRect (xliu: segment fault here EBX=0)
>> 177           MOV    EDI,[EBX + #16]  # int ! Field: java/awt/Rectangle.width
>> 17a           NullCheck EBX
>>
>> Between BB10 and BB16, the control goes to convD2I_reg_reg, which calls
>> SharedRuntime::d2l in the slow path. XMM0 is used as return value, so it
>> is clobbered.
>>
>> So the FPU in 'UseFPUForSpilling' doesn't just refer to the Intel x87
>> but also includes the SSE/AVX units, right? If it does intend to use
>> X/Y/ZMM registers as spill destinations, is there any mechanism to protect
>> them from runtime calls on x86/x86_64? In both System V and Microsoft
>> ABIs, XMM0~3 could be used for both argument passing and return value,
>> right?
>>
>> I see other runtime code stubs such as arraycopy and crc32 use XMM
>> registers.
>>
>> Btw, there's a hidden mechanism somewhere which prevents C2 from working
>> on x86_32. Even though I have a server VM of jdk17, it only uses C1 to
>> compile methods. Could you tell me what it is?
>>
>> Thanks,
>> --lx
>>
> 


More information about the hotspot-compiler-dev mailing list