Shenandoah WB fastpath and optimizations
Aleksey Shipilev
shade at redhat.com
Tue Dec 19 14:19:33 UTC 2017
On 12/19/2017 03:14 PM, Roland Westrelin wrote:
>> Thoughts?
>
> Could the 18M stores be spills and somewhere in the 77..101M extra loads
> would be their counterpart spill loads? The WB needs at least one extra
> register and there's also the possibility that the WB slow path messes
> up the register allocator heuristics (as we've seen with the XMM
> spills).

Could be, and that was my working theory at some point. But I'd expect the extra loads to manifest more
reliably if spills were the cause. As it stands, we seem to be well within the L1-load budget needed to
account for the WB loads.

In fact, I wanted to ask you what it would take to teach C2 to emit a C1-style check, e.g. instead of:

  movzbl 0x3d8(%rTLS), %rScratch   ; read evac-in-progress
  test   %rScratch, %rScratch
  jne    EVAC-ENABLED-SLOW-PATH
  mov    -0x8(%rObj), %rObj        ; read barrier
...do:
  cmpb   $0x0, 0x3d8(%rTLS)        ; read evac-in-progress
  jne    EVAC-ENABLED-SLOW-PATH
  mov    -0x8(%rObj), %rObj        ; read barrier
...thus freeing up the register?
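
For illustration, the logic both listings implement can be modeled in C roughly as below. This is only a sketch under stated assumptions, not HotSpot code: `thread_state`, `wb_slow_path`, `write_barrier` and `slow_path_calls` are hypothetical names, the flag field stands in for the byte at 0x3d8(%rTLS), and an object reference is modeled as a pointer into a word array whose preceding word holds the forwarding pointer (the -0x8 offset on a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread state; the flag stands in for 0x3d8(%rTLS). */
struct thread_state {
    uint8_t evac_in_progress;
};

/* Counts slow-path entries so the two paths can be told apart. */
static int slow_path_calls = 0;

/* Hypothetical slow path: a real one would evacuate the object; this
   placeholder only records the call and resolves the forwarding word. */
static uintptr_t *wb_slow_path(uintptr_t *obj) {
    slow_path_calls++;
    return (uintptr_t *)obj[-1];
}

/* Fast path: check the flag, then read the forwarding pointer. */
static uintptr_t *write_barrier(const struct thread_state *t, uintptr_t *obj) {
    assert(t != NULL && obj != NULL);
    if (t->evac_in_progress != 0) {   /* the flag check from the listings */
        return wb_slow_path(obj);     /* jne EVAC-ENABLED-SLOW-PATH */
    }
    return (uintptr_t *)obj[-1];      /* mov -0x8(%rObj), %rObj */
}
```

The flag is used only for a comparison against zero, so nothing in the logic requires it to live in a register; whether the check becomes a load plus test or a single compare against memory is purely a code generation decision.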
Thanks,
-Aleksey