RFR: 8293416: ZGC: Set mark bit with unconditional atomic ops [v3]
hev
duke at openjdk.org
Fri Sep 9 09:52:51 UTC 2022
On Thu, 8 Sep 2022 08:15:28 GMT, hev <duke at openjdk.org> wrote:
>> **Summary**
>> Support setting the ZGC mark bit with unconditional atomic ops.
>>
>> **Motivation**
>> ZGC currently modifies the mark bitmap with a conditional atomic operation (cmpxchg). This is not optimal, because the loop has to retry whenever the cmpxchg fails.
>>
>> **Description**
>> First, this patch set adds a new unconditional atomic operation, Atomic::fetch_and_or, which is implemented differently on different CPU architectures:
>>
>> * Exclusive access: Non-nested loop
>>
>>
>> retry:
>> ll old_val, addr
>> or new_val, old_val, set_val
>> sc new_val, addr
>> beq retry
>>
>>
>> * Atomic access: One instruction
>>
>>
>> ldset old_val, set_val, addr
>>
>>
>> * Generic: Fall back to cmpxchg, or use the C++ compiler builtin __atomic_fetch_or
>>
>> **Testing**
>> * jtreg tests
>> * benchmark tests
>
> hev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - ZGC: Set mark bit with unconditional atomic ops
> - BitMap: Set bit with unconditional atomic ops
> - Atomic: Add bitset functions
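To make the difference between the two approaches in the quoted description concrete, here is a minimal C sketch. It is not the code from the patch: the names `set_bit_cas` and `set_bit_fetch_or` are hypothetical, a 64-bit bitmap word is assumed, the GCC/Clang `__atomic` builtins are assumed to be available, and sequentially consistent ordering is used only to keep the example simple (the ordering HotSpot actually uses may differ).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 64u   /* assumption: one bitmap word is 64 bits wide */

/* Conditional variant: load, compute, cmpxchg, and retry on failure.
 * Returns true only for the caller that actually set the bit. */
static bool set_bit_cas(uint64_t *word, unsigned bit)
{
    assert(bit < WORD_BITS);
    const uint64_t mask = UINT64_C(1) << bit;
    uint64_t old_val = __atomic_load_n(word, __ATOMIC_RELAXED);

    for (;;) {
        if (old_val & mask)
            return false;   /* bit is already set, nothing to do */
        /* On failure, old_val is refreshed with the current word contents,
         * so the loop re-checks the bit and tries again. */
        if (__atomic_compare_exchange_n(word, &old_val, old_val | mask,
                                        false, __ATOMIC_SEQ_CST,
                                        __ATOMIC_RELAXED))
            return true;
    }
}

/* Unconditional variant: a single fetch-or, no retry loop in the source.
 * The previous value tells the caller whether it was the one to set the bit. */
static bool set_bit_fetch_or(uint64_t *word, unsigned bit)
{
    assert(bit < WORD_BITS);
    const uint64_t mask = UINT64_C(1) << bit;
    const uint64_t old_val = __atomic_fetch_or(word, mask, __ATOMIC_SEQ_CST);
    return (old_val & mask) == 0;
}
```

Both functions return the same result for the same inputs. The difference is that the first one can loop under contention on the same word, while the second leaves it to the compiler and hardware to pick a single atomic instruction or an exclusive-access loop.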
## Micro Benchmark
Benchmark: https://gist.github.com/heiher/6137c8df7038af1a4186994894f2eb2b
### How to run
gcc -O3 -o bench atomic-flip-bit-bench.c -pthread
perf stat ./bench cas
perf stat ./bench amo
### Results
Elapsed real time in seconds (lower is better)
| CPU/Threads/Mode | Score 1 | Score 2 | Score 3 | Score 4 | Score 5 | Score 6 |
|--------|--------|--------|--------|--------|--------|--------|
| Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz | | | | | | |
| 8-Thread CAS | 6.20 | 5.26 | 4.80 | 5.94 | 6.22 | 6.09 |
| 8-Thread AMO | 1.20 | 1.30 | 1.13 | 1.33 | 1.32 | 1.17 |
| HUAWEI Kunpeng 920 @ 2.60GHz | | | | | | |
| 8-Thread CAS | 13.25 | 14.25 | 12.41 | 14.26 | 13.76 | 13.87 |
| 8-Thread AMO | 4.78 | 4.70 | 5.02 | 5.05 | 5.16 | 4.77 |
| 16-Thread CAS | 32.44 | 41.94 | 39.75 | 39.85 | 34.23 | 36.05 |
| 16-Thread AMO | 9.03 | 8.98 | 9.30 | 9.63 | 9.19 | 9.38 |
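To summarize the table as a single ratio per configuration, the arithmetic can be sketched as follows. This is only an illustration: the helper `mean_speedup` and the array names are hypothetical, and the numbers are copied verbatim from the rows above.

```c
#include <assert.h>
#include <stddef.h>

#define RUNS 6   /* six scores per row in the results table */

/* Elapsed seconds copied from the results table. */
static const double intel_8t_cas[RUNS]    = { 6.20,  5.26,  4.80,  5.94,  6.22,  6.09};
static const double intel_8t_amo[RUNS]    = { 1.20,  1.30,  1.13,  1.33,  1.32,  1.17};
static const double kunpeng_8t_cas[RUNS]  = {13.25, 14.25, 12.41, 14.26, 13.76, 13.87};
static const double kunpeng_8t_amo[RUNS]  = { 4.78,  4.70,  5.02,  5.05,  5.16,  4.77};
static const double kunpeng_16t_cas[RUNS] = {32.44, 41.94, 39.75, 39.85, 34.23, 36.05};
static const double kunpeng_16t_amo[RUNS] = { 9.03,  8.98,  9.30,  9.63,  9.19,  9.38};

/* Ratio of mean CAS time to mean AMO time.
 * A value greater than 1 means the AMO variant finished faster. */
static double mean_speedup(const double *cas, const double *amo, size_t n)
{
    assert(n > 0);
    double cas_sum = 0.0;
    double amo_sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        cas_sum += cas[i];
        amo_sum += amo[i];
    }
    /* Both means share the divisor n, so the ratio of sums is enough. */
    return cas_sum / amo_sum;
}
```

By this arithmetic the AMO variant is about 4.6x faster on the i7-7700K with 8 threads, about 2.8x faster on the Kunpeng 920 with 8 threads, and about 4.0x faster on the Kunpeng 920 with 16 threads.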
-------------
PR: https://git.openjdk.org/jdk/pull/10182