RFR: Common TLS access to GC state, where possible
Roman Kennke
rkennke at redhat.com
Mon Jan 15 14:07:51 UTC 2018
On 15.01.2018 at 14:38, Aleksey Shipilev wrote:
> On 01/15/2018 01:27 PM, Roman Kennke wrote:
>> On 15.01.2018 at 13:23, Aleksey Shipilev wrote:
>> I tried the initial Roland patch with traversal GC (against the then evac-in-progress flag), and
>> have seen occurrences of back-to-back evac-load checks that have not been commoned. Roland is
>> looking at it. I suggest we at least hold it back until this is resolved or confirmed to be a
>> separate issue.
>
> This is a separate issue, having nothing to do with barrier moves. This is about commoning the TLS
> access, so that this:
>
> testb $0x2, 0x3d8(TLS)
> jne SLOW
> ...
> testb $0x2, 0x3d8(TLS)
> jne SLOW
> ...
>
> becomes:
>
> mov 0x3d8(TLS), %r11
> and $0x2, %r11
> test %r11, %r11
> jne SLOW
> ...
> test %r11, %r11
> jne SLOW
> ...
>
> ...saving the TLS access on back-to-back barriers, which are dormant anyhow.
Yes, this is what I was talking about, and I have still seen exactly
those patterns after Roland's patch (at least for some cases).
>> Also, I am not sure if the patch already does it: what about also moving up the actual tests? And
>> thus creating longer paths with/without barriers? I suspect it would be slightly trickier now
>> because of the different masks that it needs to check? It might not be very useful with default
>> heuristics because we tend to interleave different barriers (SATB vs. evac), but may be
>> tremendously useful for traversal GC, where we only have one phase and can thus group all the
>> barriers into one path (enqueue, WBs, *hopefully* even RBs and acmp barriers), and remain
>> barrier-free in another?
>
> Let's have some perspective, and not put all our eggs in one basket, okay? This patch helps the
> cases where (multiple) barriers cannot be optimized. It does not move the barriers around --
> instead, it makes their fastpaths faster by not accessing the TLS every time.
>
> The whole machinery actually helps both SATB and WB checks, because after the recent GC state
> changes both SATB and WB are checking against the same flag. It also aids future work, because it brings forward the
> matchers for generic GC state loads, not only evac-in-progress loads. If you want to have the
> barrier-free paths, you have to care about the generic GC state, not just evac-in-progress.
>
> Please note the optimization is disabled by default, but we want the C2 scaffolding anyway.
Ok. This is not so separate though. What I was suggesting in this last
comment was to also common the actual checks, so your above example
could become (assuming same flags):
mov 0x3d8(TLS), %r11
and $0x2, %r11
test %r11, %r11
jne SLOW
...
I am not (yet) suggesting to move any barriers around. All I care about
for now is commoning the loads, and when that works, also commoning the
tests. This alone should lead to nice groups of barriers under one
flag-load-test, and a fast path without barriers. Or not?
Roman
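
For readers following the thread, here is a minimal C++ sketch of the three
shapes discussed above: one TLS load per barrier, a commoned load, and a
commoned load plus commoned test. It is an illustration only, not C2 output
and not the patch itself. The names gc_state, GC_STATE_MASK, Counters and the
three functions are hypothetical; only the 0x2 mask is taken from the
assembly in the thread. The sketch assumes the flag cannot change between the
two barriers (for example, no safepoint in between), and that both barriers
check the same mask.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the GC state byte at 0x3d8(TLS).
thread_local uint8_t gc_state = 0;

// The $0x2 mask used in the assembly from the thread.
constexpr uint8_t GC_STATE_MASK = 0x2;

// Bookkeeping so the cost of each shape is visible.
struct Counters {
    int tls_loads = 0;   // how often the TLS byte was read
    int tests = 0;       // how often the masked value was tested
    int slow_paths = 0;  // how many barriers took the slow path
};

inline uint8_t load_gc_state(Counters& c) { ++c.tls_loads; return gc_state; }
inline bool is_set(uint8_t masked, Counters& c) { ++c.tests; return masked != 0; }

// Shape 1: every barrier reads TLS and tests the flag itself.
inline Counters per_barrier_load() {
    Counters c;
    if (is_set(load_gc_state(c) & GC_STATE_MASK, c)) ++c.slow_paths;
    // ... first heap access ...
    if (is_set(load_gc_state(c) & GC_STATE_MASK, c)) ++c.slow_paths;
    // ... second heap access ...
    return c;
}

// Shape 2: the TLS load is commoned, each barrier still tests.
inline Counters commoned_load() {
    Counters c;
    const uint8_t masked = load_gc_state(c) & GC_STATE_MASK;
    if (is_set(masked, c)) ++c.slow_paths;
    // ... first heap access ...
    if (is_set(masked, c)) ++c.slow_paths;
    // ... second heap access ...
    return c;
}

// Shape 3: load and test are both commoned, so one branch selects
// either a path with both barriers or a barrier-free path.
inline Counters commoned_load_and_test() {
    Counters c;
    const uint8_t masked = load_gc_state(c) & GC_STATE_MASK;
    if (is_set(masked, c)) {
        c.slow_paths += 2;  // both barriers grouped under one test
    } else {
        // barrier-free fast path: both heap accesses go here
    }
    return c;
}
```

With the flag clear, the three shapes perform 2, 1 and 1 TLS loads and 2, 2
and 1 tests respectively; with the flag set, all three run both slow paths.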