RFR: 8273122: ZGC: Load forwarding entries without acquire semantics
Hao Tang
github.com+7947546+tanghaoth90 at openjdk.java.net
Mon Aug 30 09:30:40 UTC 2021
On Mon, 30 Aug 2021 09:17:12 GMT, Hao Tang <github.com+7947546+tanghaoth90 at openjdk.org> wrote:
> JDK-8272138 introduced a standalone release() membar (one not bound to a particular store) between the object copy and the forwarding table insertion.
>
> Thread A (Relocation):
> copy();
> release();
> cas_forwarding_table();
> cas_self_heal();
>
> The membar guarantees that the contents of the forwarded object are visible to any thread that subsequently loads the forwarding entry. Since load_object_content(ref) dereferences ref and therefore depends on the result of load_forwarding_table(), the load_acquire can safely be changed to a plain load.
>
> Thread B (Remapping/Relocation):
> ref = load_forwarding_table(); // acquire (current version) -> relaxed (our proposal)
> load_object_content(ref);
>
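The two thread sketches above can be written out in portable C++ roughly as follows. This is a minimal sketch, not the HotSpot code: `Object`, `from_space`, `to_space`, `forwarding_entry`, `relocate`, `remap` and `run_demo` are hypothetical names, and `std::atomic` stands in for HotSpot's own `Atomic`/`OrderAccess` layer. The reader takes the memory order as a parameter. With `std::memory_order_acquire` (the current version) the program is data-race free under the ISO C++ model. Passing `std::memory_order_relaxed` (the proposal) relies on the hardware ordering a load before a later load whose address depends on it, which ISO C++ does not promise for relaxed loads, so the demo itself only exercises the acquire variant.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-ins for a relocated object and one forwarding-table slot.
struct Object { int payload; };

static Object from_space{42};
static Object to_space{0};
static std::atomic<Object*> forwarding_entry{nullptr};

// Thread A: copy(); release(); cas_forwarding_table();
void relocate() {
  to_space.payload = from_space.payload;                // copy the object
  std::atomic_thread_fence(std::memory_order_release);  // standalone release fence
  Object* expected = nullptr;
  // Publish the new address; the CAS itself is relaxed because the fence
  // above already orders the copy before the publication.
  forwarding_entry.compare_exchange_strong(expected, &to_space,
                                           std::memory_order_relaxed);
}

// Thread B: ref = load_forwarding_table(); load_object_content(ref);
// `order` is acquire in the current version and relaxed in the proposal.
int remap(std::memory_order order) {
  Object* ref = nullptr;
  while ((ref = forwarding_entry.load(order)) == nullptr) {
    std::this_thread::yield();  // entry not inserted yet
  }
  return ref->payload;  // this load's address depends on the loaded ref
}

// Runs both threads once with the acquire reader and returns what B observed.
int run_demo() {
  int seen = 0;
  std::thread b([&seen] { seen = remap(std::memory_order_acquire); });
  std::thread a(relocate);
  a.join();
  b.join();
  return seen;
}
```

Calling `run_demo()` from a `main` function returns the payload that Thread B read through the forwarding entry. Link with thread support (for example `-pthread` with GCC or Clang).
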
> Our experiment with Heapothesys shows a >5% reduction in the time spent in concurrent mark/relocation on AArch64.
>
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed
>
> ### Reviewing
> <details><summary>Using <code>git</code></summary>
>
> Checkout this PR locally: \
> `$ git fetch https://git.openjdk.java.net/jdk pull/5298/head:pull/5298` \
> `$ git checkout pull/5298`
>
> Update a local copy of the PR: \
> `$ git checkout pull/5298` \
> `$ git pull https://git.openjdk.java.net/jdk pull/5298/head`
>
> </details>
> <details><summary>Using Skara CLI tools</summary>
>
> Checkout this PR locally: \
> `$ git pr checkout 5298`
>
> View PR using the GUI difftool: \
> `$ git pr show -t 5298`
>
> </details>
> <details><summary>Using diff file</summary>
>
> Download this PR as a diff file: \
> <a href="https://git.openjdk.java.net/jdk/pull/5298.diff">https://git.openjdk.java.net/jdk/pull/5298.diff</a>
>
> </details>
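Our experiment (gc1.log: the baseline; gc2.log: our proposal). Each line of ZGC statistics shows four Avg / Max pairs, for the last 10s, the last 10m, the last 10h, and the total run:
$ grep "00.*Phase: Concurrent Mark " gc*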
gc1.log:[100.399s][info ][gc,stats ] Phase: Concurrent Mark 789.113 / 789.113 568.992 / 1340.385 568.992 / 1340.385 568.992 / 1340.385 ms
gc1.log:[200.399s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 608.129 / 1340.385 608.129 / 1340.385 608.129 / 1340.385 ms
gc1.log:[300.399s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 658.501 / 1340.385 658.501 / 1340.385 658.501 / 1340.385 ms
gc1.log:[400.399s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 688.056 / 1345.466 688.056 / 1345.466 688.056 / 1345.466 ms
gc1.log:[500.400s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 706.695 / 1345.466 706.695 / 1345.466 706.695 / 1345.466 ms
gc1.log:[600.400s][info ][gc,stats ] Phase: Concurrent Mark 805.184 / 805.184 740.751 / 1405.568 740.751 / 1405.568 740.751 / 1405.568 ms
gc1.log:[700.399s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 769.357 / 1405.568 736.740 / 1405.568 736.740 / 1405.568 ms
gc1.log:[800.399s][info ][gc,stats ] Phase: Concurrent Mark 1190.918 / 1190.918 779.011 / 1405.568 737.999 / 1405.568 737.999 / 1405.568 ms
gc1.log:[900.399s][info ][gc,stats ] Phase: Concurrent Mark 168.838 / 168.838 772.214 / 1460.519 736.305 / 1460.519 736.305 / 1460.519 ms
gc2.log:[100.436s][info ][gc,stats ] Phase: Concurrent Mark 179.593 / 179.593 529.779 / 1263.909 529.779 / 1263.909 529.779 / 1263.909 ms
gc2.log:[200.436s][info ][gc,stats ] Phase: Concurrent Mark 154.201 / 154.201 661.044 / 1270.214 661.044 / 1270.214 661.044 / 1270.214 ms
gc2.log:[300.436s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 654.328 / 1270.214 654.328 / 1270.214 654.328 / 1270.214 ms
gc2.log:[400.436s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 708.091 / 1371.311 708.091 / 1371.311 708.091 / 1371.311 ms
gc2.log:[500.436s][info ][gc,stats ] Phase: Concurrent Mark 133.302 / 133.302 668.686 / 1371.311 668.686 / 1371.311 668.686 / 1371.311 ms
gc2.log:[600.437s][info ][gc,stats ] Phase: Concurrent Mark 137.578 / 137.578 553.064 / 1371.311 553.064 / 1371.311 553.064 / 1371.311 ms
gc2.log:[700.436s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 611.669 / 1505.184 600.242 / 1505.184 600.242 / 1505.184 ms
gc2.log:[800.436s][info ][gc,stats ] Phase: Concurrent Mark 0.000 / 0.000 674.987 / 2126.355 671.641 / 2126.355 671.641 / 2126.355 ms
gc2.log:[900.436s][info ][gc,stats ] Phase: Concurrent Mark 1463.835 / 1463.835 691.205 / 2126.355 680.207 / 2126.355 680.207 / 2126.355 ms
$ grep "00.*Phase: Concurrent Relocate " gc*
gc1.log:[100.399s][info ][gc,stats ] Phase: Concurrent Relocate 142.406 / 142.406 86.449 / 251.164 86.449 / 251.164 86.449 / 251.164 ms
gc1.log:[200.399s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 103.633 / 251.164 103.633 / 251.164 103.633 / 251.164 ms
gc1.log:[300.399s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 119.149 / 251.164 119.149 / 251.164 119.149 / 251.164 ms
gc1.log:[400.399s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 127.896 / 251.164 127.896 / 251.164 127.896 / 251.164 ms
gc1.log:[500.400s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 131.994 / 251.164 131.994 / 251.164 131.994 / 251.164 ms
gc1.log:[600.400s][info ][gc,stats ] Phase: Concurrent Relocate 159.275 / 159.275 137.536 / 251.164 137.536 / 251.164 137.536 / 251.164 ms
gc1.log:[700.399s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 150.155 / 226.174 139.784 / 251.164 139.784 / 251.164 ms
gc1.log:[800.399s][info ][gc,stats ] Phase: Concurrent Relocate 225.072 / 225.072 153.125 / 226.174 141.247 / 251.164 141.247 / 251.164 ms
gc1.log:[900.399s][info ][gc,stats ] Phase: Concurrent Relocate 54.794 / 54.794 151.922 / 228.903 141.573 / 251.164 141.573 / 251.164 ms
gc2.log:[100.436s][info ][gc,stats ] Phase: Concurrent Relocate 49.474 / 49.474 100.802 / 258.853 100.802 / 258.853 100.802 / 258.853 ms
gc2.log:[200.436s][info ][gc,stats ] Phase: Concurrent Relocate 48.347 / 48.347 125.454 / 258.853 125.454 / 258.853 125.454 / 258.853 ms
gc2.log:[300.436s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 129.807 / 258.853 129.807 / 258.853 129.807 / 258.853 ms
gc2.log:[400.436s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 136.403 / 258.853 136.403 / 258.853 136.403 / 258.853 ms
gc2.log:[500.436s][info ][gc,stats ] Phase: Concurrent Relocate 52.439 / 52.439 130.421 / 258.853 130.421 / 258.853 130.421 / 258.853 ms
gc2.log:[600.437s][info ][gc,stats ] Phase: Concurrent Relocate 48.181 / 48.181 112.995 / 258.853 112.995 / 258.853 112.995 / 258.853 ms
gc2.log:[700.436s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 124.456 / 240.284 121.156 / 258.853 121.156 / 258.853 ms
gc2.log:[800.436s][info ][gc,stats ] Phase: Concurrent Relocate 0.000 / 0.000 134.948 / 405.144 132.669 / 405.144 132.669 / 405.144 ms
gc2.log:[900.436s][info ][gc,stats ] Phase: Concurrent Relocate 229.356 / 229.356 137.447 / 405.144 135.168 / 405.144 135.168 / 405.144 ms
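
To read the logs above as a relative change, one option is to compare the Total Avg column of the last sample (900s) in each log. The sketch below does that arithmetic. The helper name `reduction_percent` and the constant names are hypothetical, the four values are copied from the 900s lines above, and a single pair of runs is only indicative.

```cpp
// Total Avg (ms) at the 900s sample, copied from the log lines above.
constexpr double kMarkBaselineMs     = 736.305;  // gc1.log, Concurrent Mark
constexpr double kMarkProposalMs     = 680.207;  // gc2.log, Concurrent Mark
constexpr double kRelocateBaselineMs = 141.573;  // gc1.log, Concurrent Relocate
constexpr double kRelocateProposalMs = 135.168;  // gc2.log, Concurrent Relocate

// Percentage by which `proposal_ms` is lower than `baseline_ms`.
double reduction_percent(double baseline_ms, double proposal_ms) {
  return (baseline_ms - proposal_ms) / baseline_ms * 100.0;
}
```

With these inputs, `reduction_percent(kMarkBaselineMs, kMarkProposalMs)` comes to roughly 7.6 and `reduction_percent(kRelocateBaselineMs, kRelocateProposalMs)` to roughly 4.5.
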
-------------
PR: https://git.openjdk.java.net/jdk/pull/5298