RFR: 8272138: ZGC: Adopt release ordering for self-healing
Erik Österlund
eosterlund at openjdk.java.net
Tue Aug 10 09:32:30 UTC 2021
On Tue, 10 Aug 2021 09:13:58 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:
>> A thread that copies the object and self heals will conceptually do the following, assuming relaxed memory ordering:
>>
>> copy();
>> release();
>> cas_forwarding_table();
>> cas_self_heal();
>>
>> The release before CASing in the forwarding table acts as a release for both accesses in the scenario when the copy is being published. So in the scenario you describe, the release in the forwarding table is already enough to ensure that anyone reading the self-healed pointer is guaranteed not to observe bytes from before the copy. In the scenario when one thread performs the copy that gets published to the forwarding table, and another thread self-heals the pointer with the value acquired from the forwarding table, we will indeed not have a release to publish the pointer, only an acquire used to read from the forwarding table. However, this is fine, as the new MCA ARMv8 architecture does not allow causal consistency violations like WRC (cf. https://dl.acm.org/doi/pdf/10.1145/3158107 section 4). So we no longer need to use acquire/release to guarantee causal consistency across threads. This would naturally not hold for PPC, but there is no PPC port for ZGC yet.
>>
>> It is interesting though that when loading a self-healed pointer, we do not perform any acquire. That is fine when dereferencing the loaded pointer, as it is a dependent load, eliding the need for an acquire. And that is indeed fine for the JIT-compiled code, because we know it is always a dependent load (or safe in other ways). However, for the C++ code, we cannot *guarantee* that there will be a dependent load in a spec-conforming way. That might be something to look into. In practice, there isn't any good reason why reading an oop and then dereferencing it wouldn't yield a dependent load, but the spec doesn't promise anything and could in theory allow compilers to mess this up. However, having an acquire for every oop load in the runtime does sound a bit costly. The memory_order_consume semantics were supposed to solve this, but I'm not sure if the compilers have yet become good at doing something useful with that, other than just having it be equivalent to acquire. Might be something to check out in the disassembly to see what it yields. But that is an exercise for another day, as this isn't an issue you are introducing with this patch.
>>
>> Hope this helps explain my thoughts in more detail.
>
>> @fisk Thanks a lot for your detailed explanation. But I’m quite confused about the release in the forwarding table: why is it able to act as a release for both accesses? In my view, since the release is “bonded” to the forwarding table, it only ensures that the copy happens before installing the forwardee, so why does it have anything to do with self healing? From your explanation, I guess the release is not “bonded” to the forwarding table. Instead, it may serve as a membar to block all the CASes afterwards. Thanks again for your patient explanation.
>
> What I wrote kind of assumes that we have an explicit OrderAccess::release(); relaxed_cas(); in the forwarding table. Looks like we currently have a bound releasing CAS instead. The unbound release is guaranteed to provide release semantics for *any* subsequent store operation. However, a bound Store-Release (i.e. stlr) instruction only needs to provide the release semantics to the bound store, as you say. So I suppose what I propose is to use release(); relaxed_cas(); in the forwarding table, and then also relaxed_cas(); in the self healing, letting the release be done in only one place, and only when actual copying happens. The vast majority of self heals I have found are purely remapping the pointer lazily, which can then dodge any need for release when self healing. Hope this makes sense.
> @fisk Hi, Erik. We are wondering if one thread loading a healed pointer can observe that the corresponding copy has not finished yet. Assume relaxed ordering for `cas_self_heal`, and that both Thread A and Thread B are loading the same reference.
>
> **Thread A**: `load obj.fld; // will relocate the object referenced by obj.fld`
> Thread A will do the following:
>
> ```
> 1 copy();
> 2 cas_forwarding_table(); // release
> 3 cas_self_heal(); // relaxed
> ```
>
> **Thread B**: `load obj.fld; // load the same reference`
> Thread B may observe the following reordering of **Thread A**:
>
> ```
> 3 cas_self_heal(); // relaxed
> 1 copy();
> 2 cas_forwarding_table(); // release
> ```
>
> To our knowledge, release ordering in _line 2_ does not prevent _line 3_ from being reordered before _line 1_, which indicates that the release in the forwarding table is not enough. Perhaps we need to add acquire ordering to _line 2_ or add release ordering to _line 3_.
>
> In another way, as @weixlu said,
>
> > Instead, it may serve as a membar to block all the CASes afterwards.
>
> relaxed ordering in _line 2_ along with release ordering in _line 3_ can indeed ensure thread B always observes the object copy.
>
> Looking forward to your advice.
Yeah so I was kind of assuming the forwarding table installation would be an unbound release(); relaxed_cas();
That way, the release serves a dual purpose: releasing for the table and releasing for the self heal. That way the vast majority of self heals (that only do remapping), won't need to release.
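
To make the shape of that concrete, here is a minimal sketch in portable C++ using `std::atomic` rather than HotSpot's own `Atomic`/`OrderAccess` API. Everything in it (`Object`, `forwarding_entry`, `field`, `relocate_and_heal`, `read_through_field`) is a hypothetical stand-in for illustration, not the actual ZGC code. It only shows that a standalone release fence placed before two relaxed CASes orders the copy before both of them, whereas a release bound to the first CAS would cover only that CAS. The reader uses an acquire load here because plain C++ does not promise the dependent-load ordering that the JIT-compiled code relies on.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not HotSpot types.
struct Object { int payload; };

std::atomic<Object*> forwarding_entry{nullptr};  // one forwarding table slot
std::atomic<Object*> field{nullptr};             // the reference being self-healed

// Copying path: copy, one unbound release, then two relaxed CASes.
Object* relocate_and_heal(Object* from, Object* to_space) {
  to_space->payload = from->payload;                    // copy()

  // Unbound release: orders the copy before *every* later store by this
  // thread, so it covers both CASes below.
  std::atomic_thread_fence(std::memory_order_release);  // release()

  Object* expected = nullptr;
  if (!forwarding_entry.compare_exchange_strong(
          expected, to_space,
          std::memory_order_relaxed, std::memory_order_relaxed)) {
    // Lost the race: adopt the winner's copy, and acquire so that the
    // winner's copied bytes are visible to this thread.
    std::atomic_thread_fence(std::memory_order_acquire);
    to_space = expected;
  }

  // Self heal with a relaxed CAS; no second release is issued here.
  Object* old_ref = from;
  field.compare_exchange_strong(
      old_ref, to_space,
      std::memory_order_relaxed, std::memory_order_relaxed);
  return to_space;
}

// Reader: the acquire load pairs with the writer's release fence in the
// C++ memory model, so the copied payload is visible after the load.
int read_through_field() {
  Object* p = field.load(std::memory_order_acquire);
  return p->payload;
}
```

With a bound release (a `memory_order_release` CAS on `forwarding_entry` and no fence), the C++ model would give the same guarantee only to readers that acquire through `forwarding_entry`, not to readers of `field`, which is the distinction being discussed above.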
-------------
PR: https://git.openjdk.java.net/jdk/pull/5046