RFR: 8293416: ZGC: Set mark bit with unconditional atomic ops [v3]

hev duke at openjdk.org
Fri Sep 9 09:52:51 UTC 2022


On Thu, 8 Sep 2022 08:15:28 GMT, hev <duke at openjdk.org> wrote:

>> **Summary**
>> Support setting the ZGC mark bit with unconditional atomic ops.
>> 
>> **Motivation**
>> ZGC currently modifies the mark bitmap with a conditional atomic operation (cmpxchg). This is not optimal, because the loop has to be retried whenever the cmpxchg fails.
>> 
>> **Description**
>> First, this patch set adds a new unconditional atomic operation, Atomic::fetch_and_or, which is implemented differently depending on the CPU architecture:
>> 
>> * Exclusive access: Non-nested loop
>> 
>> 
>> retry:
>>   ll old_val, addr
>>   or new_val, old_val, set_val
>>   sc new_val, addr
>>   beq retry
>> 
>> 
>> * Atomic access: One instruction
>> 
>> 
>> ldset old_val, set_val, addr
>> 
>> 
>> * Generic: Fall back to cmpxchg or use the C++ builtin __atomic_fetch_or
>> 
>> **Testing**
>> * jtreg tests
>> * benchmark tests
>
> hev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
> 
>  - ZGC: Set mark bit with unconditional atomic ops
>  - BitMap: Set bit with unconditional atomic ops
>  - Atomic: Add bitset functions

## Micro Benchmark

Benchmark: https://gist.github.com/heiher/6137c8df7038af1a4186994894f2eb2b
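
For readers who do not want to open the gist, the two approaches being compared can be sketched roughly as follows. This is not the gist's code and not the HotSpot implementation; the function names `set_bit_cas` and `set_bit_fetch_or` are hypothetical, and the default (sequentially consistent) memory ordering is an assumption made for simplicity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Conditional variant: a compare-and-swap loop.
 * Returns true if this call changed the bit from 0 to 1. */
bool set_bit_cas(_Atomic uint64_t *word, unsigned bit)
{
    assert(bit < 64);
    const uint64_t mask = UINT64_C(1) << bit;
    uint64_t old_val = atomic_load(word);

    for (;;) {
        if (old_val & mask)
            return false; /* bit already set, nothing to do */
        /* On failure old_val is reloaded with the current value,
         * so the loop retries against fresh contents. */
        if (atomic_compare_exchange_weak(word, &old_val, old_val | mask))
            return true;
    }
}

/* Unconditional variant: a single fetch-or, with no retry loop
 * at the source level. Returns true if this call set the bit. */
bool set_bit_fetch_or(_Atomic uint64_t *word, unsigned bit)
{
    assert(bit < 64);
    const uint64_t mask = UINT64_C(1) << bit;
    const uint64_t old_val = atomic_fetch_or(word, mask);
    return (old_val & mask) == 0;
}
```

Both functions report whether the caller was the one to set the bit, which is what a marking thread needs to know. How the fetch-or is lowered (for example to a single `ldset` on a CPU with atomic memory operations, or to an exclusive load/store loop) depends on the compiler and target.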

### How to run


gcc -O3 -o bench atomic-flip-bit-bench.c -pthread
perf stat ./bench cas
perf stat ./bench amo

### Results

Elapsed real time in seconds (lower is better)

| CPU / Threads / Mode | Score 1 | Score 2 | Score 3 | Score 4 | Score 5 | Score 6 |
|--------|--------|--------|--------|--------|--------|--------|
| Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz | | | | | | |
| 8-Thread CAS | 6.20 | 5.26 | 4.80 | 5.94 | 6.22 | 6.09 |
| 8-Thread AMO | 1.20 | 1.30 | 1.13 | 1.33 | 1.32 | 1.17 |
| HUAWEI Kunpeng 920 @ 2.60GHz | | | | | | |
| 8-Thread CAS | 13.25 | 14.25 | 12.41 | 14.26 | 13.76 | 13.87 |
| 8-Thread AMO | 4.78 | 4.70 | 5.02 | 5.05 | 5.16 | 4.77 |
| 16-Thread CAS | 32.44 | 41.94 | 39.75 | 39.85 | 34.23 | 36.05 |
| 16-Thread AMO | 9.03 | 8.98 | 9.30 | 9.63 | 9.19 | 9.38 |
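
To summarize the table, the mean CAS time can be divided by the mean AMO time for each configuration. The helper below is only an illustrative sketch; the names `mean`, `speedup`, and the array names are hypothetical, and the numbers are copied from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n timings. */
double mean(const double *v, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Ratio of mean CAS time to mean AMO time (greater than 1 means AMO is faster). */
double speedup(const double *cas, const double *amo, size_t n)
{
    return mean(cas, n) / mean(amo, n);
}

/* Timings in seconds, copied from the results table. */
const double i7_8t_cas[6]       = { 6.20,  5.26,  4.80,  5.94,  6.22,  6.09};
const double i7_8t_amo[6]       = { 1.20,  1.30,  1.13,  1.33,  1.32,  1.17};
const double kunpeng_8t_cas[6]  = {13.25, 14.25, 12.41, 14.26, 13.76, 13.87};
const double kunpeng_8t_amo[6]  = { 4.78,  4.70,  5.02,  5.05,  5.16,  4.77};
const double kunpeng_16t_cas[6] = {32.44, 41.94, 39.75, 39.85, 34.23, 36.05};
const double kunpeng_16t_amo[6] = { 9.03,  8.98,  9.30,  9.63,  9.19,  9.38};
```

Calling `speedup(i7_8t_cas, i7_8t_amo, 6)` gives roughly 4.6, and the Kunpeng 920 rows give roughly 2.8 at 8 threads and 4.0 at 16 threads.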

-------------

PR: https://git.openjdk.org/jdk/pull/10182

