RFR: 8273122: ZGC: Load forwarding entries without acquire semantics

Tue Aug 31 09:05:25 UTC 2021

On Mon, 30 Aug 2021 09:17:12 GMT, Hao Tang <github.com+7947546+tanghaoth90 at openjdk.org> wrote:

> JDK-8272138 introduced an unbound membar release() between the object copy and the forwarding table insertion.
> 
> Thread A (Relocation):
> copy();
> release();
> cas_forwarding_table();
> cas_self_heal();
> 
> The membar guarantees that the contents of the forwarded object are ready after a forwarding entry is loaded. Since load_object_content(ref) depends on the result of load_forwarding_table(), load_acquire can be safely changed to a simple load.
> 
> Thread B (Remapping/Relocation):
> ref = load_forwarding_table(); // acquire (current version) -> relaxed (our proposal)
> load_object_content(ref);
> 
> Our experiment on heapothesys demonstrates >5% time reduction spent on concurrent mark/relocation on AArch64.
> 
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed

As far as I know, HotSpot does not consistently use C++ atomics, so whether the compiler implements consume-as-acquire, or any other aspect of the C++ memory model, does not really matter. Consequently, the code is full of data races that are undefined behavior as far as the compiler is concerned. Compiler writers are aware of that and generally avoid optimizations that would break this memory-model-in-a-library approach (although there is of course no language specification that actually guarantees this).

I think that historically, the approach has been that when compiler optimizations make things go wrong, some sort of compiler barrier is added to the code. I assume the same thing could be done to implement an approximation to consume semantics.
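
To make the idea concrete, here is a minimal sketch of what such an approximation might look like, written with standard C++ atomics rather than HotSpot's own primitives. All names (ObjectCopy, ForwardingSlot, relocate, load_forwarding) are hypothetical and are not taken from the PR or from ZGC. The reader does a relaxed load followed by a compiler-only barrier, so ordering across threads rests on the hardware honoring the address dependency. The C++ standard does not guarantee that, which is exactly the caveat above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for a relocated object and one forwarding table
// entry. These are illustration-only names, not HotSpot types.
struct ObjectCopy {
  std::uint64_t payload;
};

struct ForwardingSlot {
  std::atomic<ObjectCopy*> entry{nullptr};
};

// Thread A side: copy(); release(); cas_forwarding_table();
// Returns true if this call installed the forwarding entry.
inline bool relocate(ForwardingSlot& slot, ObjectCopy* to,
                     std::uint64_t from_payload) {
  assert(to != nullptr);
  to->payload = from_payload;                            // copy()
  std::atomic_thread_fence(std::memory_order_release);   // release()
  ObjectCopy* expected = nullptr;
  // The CAS itself is relaxed; the fence above orders the copy before it.
  return slot.entry.compare_exchange_strong(expected, to,
                                            std::memory_order_relaxed);
}

// Thread B side: relaxed load plus a compiler-only barrier.
// atomic_signal_fence emits no hardware fence; it only stops the compiler
// from hoisting later loads above the load of the entry. Cross-thread
// ordering of the dependent load is assumed to come from the CPU, not
// from the language.
inline ObjectCopy* load_forwarding(const ForwardingSlot& slot) {
  ObjectCopy* ref = slot.entry.load(std::memory_order_relaxed);
  std::atomic_signal_fence(std::memory_order_acquire);
  return ref;
}
```

A reader would call load_forwarding(slot) and, if the result is non-null, dereference it to get the object contents. Replacing the relaxed load and signal fence with a single load(std::memory_order_acquire) gives the version that the standard does guarantee, at the cost of a stronger load instruction on AArch64.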

-------------

PR: https://git.openjdk.java.net/jdk/pull/5298