RFR(M): 8213528: fp registers should not need to be saved around a CallLeafNoFP
Wilkinson, Hugh
hugh.wilkinson at intel.com
Thu Nov 15 21:01:43 UTC 2018
The expansion of loadBarrierSlow* was optimized for size. Increases in code size resulted in lower performance, even when the slow path code was not being executed. This was likely because of reduced inlining. The loadBarrierSlow* expansion calls a subroutine with a non-standard register save/restore interface. The called subroutine is responsible for saving and restoring the GP registers.
The cost of saving and restoring all of the vector registers on every load barrier slow path appeared to be prohibitive. For now, that responsibility remains with the C2 compiler, at the expense of code size. The C2 compiler will save and restore only those vector registers that must remain live across the call.
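To illustrate the trade-off described above, here is a minimal sketch in Python. It is a toy model only, not HotSpot code: the register names, the assumption of 16 vector registers, and both function names are hypothetical and chosen purely for illustration. It compares how many vector registers a save-everything stub would have to spill with how many a compiler that knows the live set has to spill.

```python
from typing import FrozenSet

# Assumption for this sketch: a target with 16 vector registers, xmm0..xmm15.
ALL_VECTOR_REGS: FrozenSet[str] = frozenset(f"xmm{i}" for i in range(16))


def stub_side_save_set() -> FrozenSet[str]:
    """A stub cannot see the caller's liveness, so it must save every register."""
    return ALL_VECTOR_REGS


def caller_side_save_set(live_across_call: FrozenSet[str]) -> FrozenSet[str]:
    """The compiler saves only registers holding values needed after the call."""
    return live_across_call & ALL_VECTOR_REGS


if __name__ == "__main__":
    live = frozenset({"xmm0", "xmm3"})  # hypothetical live set at one barrier
    print(len(stub_side_save_set()), "registers saved by a save-everything stub")
    print(len(caller_side_save_set(live)), "registers saved by the compiled caller")
```

In this model the caller-side approach pays per call site (more emitted code) but does less work on each slow-path execution, which matches the trade-off stated above.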
Hugh
-----Original Message-----
From: zgc-dev [mailto:zgc-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Ivanov
Sent: Thursday, November 15, 2018 3:30 PM
To: Roman Kennke <rkennke at redhat.com>; Vladimir Kozlov <vladimir.kozlov at oracle.com>; Roland Westrelin <rwestrel at redhat.com>; hotspot-compiler-dev at openjdk.java.net; zgc-dev at openjdk.java.net
Subject: Re: RFR(M): 8213528: fp registers should not need to be saved around a CallLeafNoFP
On 15/11/2018 12:09, Roman Kennke wrote:
> Including zgc-dev because I believe the discussion may be relevant
> there too. Looking at the .ad file, I see Z barrier decls like this:
>
> https://paste.fedoraproject.org/paste/niLRXUpUj81n8ML9MSM0yg
>
> The only purpose of this exercise seems to be to tell C2 that this
> stuff (may) kill the xmm registers, and then call out to runtime. What
> Roland proposed seems an easier way? I.e. instead of generating all
> that stuff, emit a CallLeaf to call out to runtime directly?
>
> Maybe even do what Shenandoah does and call with CallLeafNoFP to a
> stub, which in turn would take care of saving/restoring all those registers.
> This way, the Z barriers wouldn't inhibit XMM spilling.
It looks like loadBarrierSlowRegXmmAndYmm does a slightly different thing: it kills XMM regs but leaves GP registers intact. The stub then needs to care only about GP registers if it decides to call into the VM.
CallLeaf/CallLeafNoFP obey the platform ABI (modulo FP registers) and hence split GP registers into caller-/callee-saved classes.
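A rough sketch of that difference, in Python. This is a toy model and not how C2 represents calls: the function name and the two call-kind labels are hypothetical, and the register split shown is the System V AMD64 one, used here only as an example of a platform ABI.

```python
from typing import FrozenSet

# System V AMD64 general-purpose register classes (rsp omitted).
CALLER_SAVED_GP: FrozenSet[str] = frozenset(
    {"rax", "rcx", "rdx", "rsi", "rdi", "r8", "r9", "r10", "r11"}
)
CALLEE_SAVED_GP: FrozenSet[str] = frozenset(
    {"rbx", "rbp", "r12", "r13", "r14", "r15"}
)


def gp_regs_caller_must_preserve(live: FrozenSet[str], call_kind: str) -> FrozenSet[str]:
    """Return the live GP registers the compiled caller has to spill around a call.

    call_kind is a hypothetical label:
      "abi_leaf"             - the callee follows the platform ABI
      "preserve_all_gp_stub" - the callee saves/restores every GP register itself
    """
    if call_kind == "abi_leaf":
        # Only caller-saved registers may be clobbered by an ABI-conforming callee.
        return live & CALLER_SAVED_GP
    if call_kind == "preserve_all_gp_stub":
        # The stub leaves all GP registers intact, so nothing needs spilling.
        return frozenset()
    raise ValueError(f"unknown call kind: {call_kind}")


if __name__ == "__main__":
    live = frozenset({"rax", "rbx", "r10"})  # hypothetical live set
    print(sorted(gp_regs_caller_must_preserve(live, "abi_leaf")))
    print(sorted(gp_regs_caller_must_preserve(live, "preserve_all_gp_stub")))
```
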
Best regards,
Vladimir Ivanov
>
> It looks to me like ZGC could live without any of those .ad declarations.
>
> Maybe I am missing something though.
>
> Roman
>
>> Yes, like this.
>>
>> callnode.cpp - add a space after the node's name output. Maybe print
>> preserves_fp_registers before the name, as in machnode.cpp, for consistency.
>>
>> Thanks,
>> Vladimir
>>
>> On 11/15/18 2:30 AM, Roland Westrelin wrote:
>>>
>>>> Sounds good. Is lcm.cpp the only place where we do such a check (in
>>>> addition to the code in .ad files)?
>>>
>>> What about this?
>>>
>>> http://cr.openjdk.java.net/~roland/8213528/webrev.01/
>>>
>>> Roland.
>>>
>