RFR: 8273122: ZGC: Load forwarding entries without acquire semantics

Hao Tang github.com+7947546+tanghaoth90 at openjdk.java.net
Mon Aug 30 09:30:40 UTC 2021


On Mon, 30 Aug 2021 09:17:12 GMT, Hao Tang <github.com+7947546+tanghaoth90 at openjdk.org> wrote:

> JDK-8272138 introduced a standalone (unbound) release() membar between the object copy and the forwarding table insertion.
> 
> Thread A (Relocation):
> copy();
> release();
> cas_forwarding_table();
> cas_self_heal();
> 
> The membar guarantees that the contents of the forwarded object are visible to any thread that has loaded the corresponding forwarding entry. Since load_object_content(ref) depends on the value returned by load_forwarding_table() (a data dependency), the load_acquire can safely be changed to a plain load.
> 
> Thread B (Remapping/Relocation):
> ref = load_forwarding_table(); // acquire (current version) -> relaxed (our proposal)
> load_object_content(ref);
> 
> Our experiment with heapothesys shows a >5% reduction in the time spent in concurrent mark/relocation on AArch64.
> 
> ---------
> ### Progress
> - [x] Change must not contain extraneous whitespace
> - [x] Commit message must refer to an issue
> - [ ] Change must be properly reviewed
> 
> ### Reviewing
> <details><summary>Using <code>git</code></summary>
> 
> Checkout this PR locally: \
> `$ git fetch https://git.openjdk.java.net/jdk pull/5298/head:pull/5298` \
> `$ git checkout pull/5298`
> 
> Update a local copy of the PR: \
> `$ git checkout pull/5298` \
> `$ git pull https://git.openjdk.java.net/jdk pull/5298/head`
> 
> </details>
> <details><summary>Using Skara CLI tools</summary>
> 
> Checkout this PR locally: \
> `$ git pr checkout 5298`
> 
> View PR using the GUI difftool: \
> `$ git pr show -t 5298`
> 
> </details>
> <details><summary>Using diff file</summary>
> 
> Download this PR as a diff file: \
> <a href="https://git.openjdk.java.net/jdk/pull/5298.diff">https://git.openjdk.java.net/jdk/pull/5298.diff</a>
> 
> </details>
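
The ordering argument in the quoted description can be sketched in standalone C++. This is only an illustration under stated assumptions, not the HotSpot code: `Object`, `forwarding_entry`, `relocate` and `try_remap` are hypothetical names, and a single atomic pointer stands in for the forwarding table. Note that ISO C++ does not formally guarantee ordering through a relaxed load (it would ask for acquire or consume); the sketch relies on the hardware ordering of dependent loads, as on AArch64, which is the same assumption the proposal makes.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a relocated object.
struct Object {
  std::uint64_t payload;
};

// Hypothetical stand-in for one forwarding table slot.
std::atomic<Object*> forwarding_entry{nullptr};

// Thread A (relocation): copy, release fence, then publish the new address.
// Returns the address that ended up in the forwarding entry.
Object* relocate(const Object& from, Object* to_space) {
  assert(to_space != nullptr);
  to_space->payload = from.payload;                     // copy()
  std::atomic_thread_fence(std::memory_order_release);  // release()
  Object* expected = nullptr;
  if (forwarding_entry.compare_exchange_strong(         // cas_forwarding_table()
          expected, to_space, std::memory_order_relaxed)) {
    return to_space;
  }
  return expected;  // another thread published first; use its copy
}

// Thread B (remapping/relocation): relaxed load, then a dependent load.
// Returns false if no forwarding entry has been published yet.
bool try_remap(std::uint64_t* out) {
  Object* ref = forwarding_entry.load(std::memory_order_relaxed);
  if (ref == nullptr) {
    return false;
  }
  *out = ref->payload;  // the load address depends on ref
  return true;
}
```

The release fence in `relocate` orders the copy before the CAS, and the load of `ref->payload` cannot be issued before `ref` itself is known, which is why the acquire on the reader side looks redundant on hardware that respects address dependencies.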

Our experiment (gc1.log: the baseline; gc2.log: our proposal):


$ grep "00.*Phase: Concurrent Mark           " gc*
gc1.log:[100.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                             789.113 / 789.113     568.992 / 1340.385    568.992 / 1340.385    568.992 / 1340.385    ms
gc1.log:[200.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       608.129 / 1340.385    608.129 / 1340.385    608.129 / 1340.385    ms
gc1.log:[300.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       658.501 / 1340.385    658.501 / 1340.385    658.501 / 1340.385    ms
gc1.log:[400.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       688.056 / 1345.466    688.056 / 1345.466    688.056 / 1345.466    ms
gc1.log:[500.400s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       706.695 / 1345.466    706.695 / 1345.466    706.695 / 1345.466    ms
gc1.log:[600.400s][info   ][gc,stats    ]       Phase: Concurrent Mark                             805.184 / 805.184     740.751 / 1405.568    740.751 / 1405.568    740.751 / 1405.568    ms
gc1.log:[700.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       769.357 / 1405.568    736.740 / 1405.568    736.740 / 1405.568    ms
gc1.log:[800.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                            1190.918 / 1190.918    779.011 / 1405.568    737.999 / 1405.568    737.999 / 1405.568    ms
gc1.log:[900.399s][info   ][gc,stats    ]       Phase: Concurrent Mark                             168.838 / 168.838     772.214 / 1460.519    736.305 / 1460.519    736.305 / 1460.519    ms
gc2.log:[100.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                             179.593 / 179.593     529.779 / 1263.909    529.779 / 1263.909    529.779 / 1263.909    ms
gc2.log:[200.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                             154.201 / 154.201     661.044 / 1270.214    661.044 / 1270.214    661.044 / 1270.214    ms
gc2.log:[300.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       654.328 / 1270.214    654.328 / 1270.214    654.328 / 1270.214    ms
gc2.log:[400.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       708.091 / 1371.311    708.091 / 1371.311    708.091 / 1371.311    ms
gc2.log:[500.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                             133.302 / 133.302     668.686 / 1371.311    668.686 / 1371.311    668.686 / 1371.311    ms
gc2.log:[600.437s][info   ][gc,stats    ]       Phase: Concurrent Mark                             137.578 / 137.578     553.064 / 1371.311    553.064 / 1371.311    553.064 / 1371.311    ms
gc2.log:[700.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       611.669 / 1505.184    600.242 / 1505.184    600.242 / 1505.184    ms
gc2.log:[800.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                               0.000 / 0.000       674.987 / 2126.355    671.641 / 2126.355    671.641 / 2126.355    ms
gc2.log:[900.436s][info   ][gc,stats    ]       Phase: Concurrent Mark                            1463.835 / 1463.835    691.205 / 2126.355    680.207 / 2126.355    680.207 / 2126.355    ms



$ grep "00.*Phase: Concurrent Relocate           " gc*
gc1.log:[100.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                         142.406 / 142.406      86.449 / 251.164      86.449 / 251.164      86.449 / 251.164     ms
gc1.log:[200.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       103.633 / 251.164     103.633 / 251.164     103.633 / 251.164     ms
gc1.log:[300.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       119.149 / 251.164     119.149 / 251.164     119.149 / 251.164     ms
gc1.log:[400.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       127.896 / 251.164     127.896 / 251.164     127.896 / 251.164     ms
gc1.log:[500.400s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       131.994 / 251.164     131.994 / 251.164     131.994 / 251.164     ms
gc1.log:[600.400s][info   ][gc,stats    ]       Phase: Concurrent Relocate                         159.275 / 159.275     137.536 / 251.164     137.536 / 251.164     137.536 / 251.164     ms
gc1.log:[700.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       150.155 / 226.174     139.784 / 251.164     139.784 / 251.164     ms
gc1.log:[800.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                         225.072 / 225.072     153.125 / 226.174     141.247 / 251.164     141.247 / 251.164     ms
gc1.log:[900.399s][info   ][gc,stats    ]       Phase: Concurrent Relocate                          54.794 / 54.794      151.922 / 228.903     141.573 / 251.164     141.573 / 251.164     ms
gc2.log:[100.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                          49.474 / 49.474      100.802 / 258.853     100.802 / 258.853     100.802 / 258.853     ms
gc2.log:[200.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                          48.347 / 48.347      125.454 / 258.853     125.454 / 258.853     125.454 / 258.853     ms
gc2.log:[300.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       129.807 / 258.853     129.807 / 258.853     129.807 / 258.853     ms
gc2.log:[400.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       136.403 / 258.853     136.403 / 258.853     136.403 / 258.853     ms
gc2.log:[500.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                          52.439 / 52.439      130.421 / 258.853     130.421 / 258.853     130.421 / 258.853     ms
gc2.log:[600.437s][info   ][gc,stats    ]       Phase: Concurrent Relocate                          48.181 / 48.181      112.995 / 258.853     112.995 / 258.853     112.995 / 258.853     ms
gc2.log:[700.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       124.456 / 240.284     121.156 / 258.853     121.156 / 258.853     ms
gc2.log:[800.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                           0.000 / 0.000       134.948 / 405.144     132.669 / 405.144     132.669 / 405.144     ms
gc2.log:[900.436s][info   ][gc,stats    ]       Phase: Concurrent Relocate                         229.356 / 229.356     137.447 / 405.144     135.168 / 405.144     135.168 / 405.144     ms
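
One way to turn these rows into a relative reduction is sketched below. It assumes the four column pairs are ZGC's usual statistics windows (Last 10s, Last 10m, Last 10h, Total), each printed as Avg / Max, and it uses only the averages from the 900 s rows of gc1.log and gc2.log. The names `PhaseAverages` and `reduction_percent` are hypothetical helpers, and which window a comparison should use is a judgment call.

```cpp
#include <cassert>

// Averages in milliseconds, copied from the 900 s rows of the logs above.
// Assumption: columns are Last 10s / Last 10m / Last 10h / Total (Avg / Max).
struct PhaseAverages {
  double last_10m;  // average over the last 10 minutes
  double total;     // average over the whole run
};

constexpr PhaseAverages mark_baseline{772.214, 736.305};      // gc1.log
constexpr PhaseAverages mark_proposal{691.205, 680.207};      // gc2.log
constexpr PhaseAverages relocate_baseline{151.922, 141.573};  // gc1.log
constexpr PhaseAverages relocate_proposal{137.447, 135.168};  // gc2.log

// Relative reduction in percent when going from baseline_ms to proposal_ms.
// A positive result means the proposal spent less time in the phase.
double reduction_percent(double baseline_ms, double proposal_ms) {
  assert(baseline_ms > 0.0);
  return (baseline_ms - proposal_ms) / baseline_ms * 100.0;
}
```

For example, `reduction_percent(mark_baseline.last_10m, mark_proposal.last_10m)` gives the Concurrent Mark reduction over the last 10 minutes, and the `total` fields give the same comparison over the whole run.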

-------------

PR: https://git.openjdk.java.net/jdk/pull/5298



More information about the hotspot-gc-dev mailing list