[crac] RFR: RCU Lock - RW lock with very lightweight read- and heavyweight write-locking [v5]
Radim Vansa
duke at openjdk.org
Thu Apr 13 15:01:30 UTC 2023
On Wed, 12 Apr 2023 15:02:05 GMT, Radim Vansa <duke at openjdk.org> wrote:
>> This implementation is suitable for uses where the write-locking happens very rarely (if at all), as in the case of CRaC checkpoint, and we don't want to slow down regular access to the protected resource.
>
> Radim Vansa has updated the pull request incrementally with one additional commit since the last revision:
>
> Add synchronized context
It follows the same principle but its use is not interchangeable (and cannot be made so) - if you replaced an existing use of `ReadWriteLock` with this one, it would not work.
I have already written a benchmark. It includes a no-op baseline (`unsync`), executes the `quick` method as fast as it can in 8 threads, and runs the `slow` method with 10/100 ms think time in a single thread:
Benchmark                     (impl)   (pause)   Mode  Cnt           Score           Error  Units
SwitchPointBenchmark.g:quick  unsync        10  thrpt    5  2420777624.146 ± 249641306.573  ops/s
SwitchPointBenchmark.g:slow   unsync        10  thrpt    5          99.248 ±         0.264  ops/s
SwitchPointBenchmark.g:quick  unsync       100  thrpt    5  2244724220.494 ± 328435039.061  ops/s
SwitchPointBenchmark.g:slow   unsync       100  thrpt    5           9.992 ±         0.002  ops/s
SwitchPointBenchmark.g:quick  rwlock        10  thrpt    5     4414608.947 ±   1525681.326  ops/s
SwitchPointBenchmark.g:slow   rwlock        10  thrpt    5          99.191 ±         0.160  ops/s
SwitchPointBenchmark.g:quick  rwlock       100  thrpt    5     4541641.249 ±   3166622.432  ops/s
SwitchPointBenchmark.g:slow   rwlock       100  thrpt    5           9.989 ±         0.003  ops/s
SwitchPointBenchmark.g:quick  rculock       10  thrpt    5   196537498.940 ± 305743615.522  ops/s
SwitchPointBenchmark.g:slow   rculock       10  thrpt    5          94.168 ±         2.159  ops/s
SwitchPointBenchmark.g:quick  rculock      100  thrpt    5   772304327.917 ±  28329265.290  ops/s
SwitchPointBenchmark.g:slow   rculock      100  thrpt    5           9.909 ±         0.025  ops/s
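For readers without the benchmark source at hand, here is a minimal sketch of what the `rwlock` rows roughly measure. It is not the actual `SwitchPointBenchmark` code: the class name, the `run` helper, the counters, and the choice to sleep outside the write lock are all assumptions made for illustration, and it uses plain threads rather than JMH, so its numbers are not comparable to the table.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the rwlock variant; names are illustrative only.
class RwLockBaselineSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final LongAdder quickOps = new LongAdder();
    final LongAdder slowOps = new LongAdder();

    // Reader path: readLock().lock() + unlock() around a trivial operation.
    void quick() {
        lock.readLock().lock();
        try {
            quickOps.increment();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Writer path: take the write lock briefly, then "think" for pauseMillis.
    // Assumption: the think time is spent outside the lock.
    void slow(long pauseMillis) throws InterruptedException {
        lock.writeLock().lock();
        try {
            slowOps.increment();
        } finally {
            lock.writeLock().unlock();
        }
        Thread.sleep(pauseMillis);
    }

    // Runs `readers` reader threads plus one writer thread for durationMillis.
    static RwLockBaselineSketch run(int readers, long pauseMillis, long durationMillis)
            throws InterruptedException {
        RwLockBaselineSketch s = new RwLockBaselineSketch();
        long deadline = System.nanoTime() + durationMillis * 1_000_000L;
        Thread[] threads = new Thread[readers + 1];
        for (int i = 0; i < readers; i++) {
            threads[i] = new Thread(() -> {
                while (System.nanoTime() < deadline) {
                    s.quick();
                }
            });
        }
        threads[readers] = new Thread(() -> {
            try {
                while (System.nanoTime() < deadline) {
                    s.slow(pauseMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        for (Thread t : threads) t.start();
        for (Thread t : threads) t.join();
        return s;
    }

    public static void main(String[] args) throws InterruptedException {
        RwLockBaselineSketch s = run(8, 10, 500);
        System.out.println("quick ops: " + s.quickOps.sum() + ", slow ops: " + s.slowOps.sum());
    }
}
```

The point of the sketch is only that every `quick` call pays for a shared atomic update on the lock state, which is the cost the RCU lock tries to avoid on the read side.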
With 10 ms think time (which is an extremely high write frequency) the results show more than 20x speedup compared to the ReentrantReadWriteLock.readLock().lock()+unlock() combo, and roughly a 12x slowdown vs. the no-op baseline. With 100 ms think time it is an order of magnitude better: > 150x speedup vs. < 3x slowdown.
I've also run the benchmark with no pause time to see the maximum synchronization frequency; it shows about 4.5k syncs/s (it would surely be lower with more threads and deeper stacks).
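The ratios can be rechecked directly from the `quick` scores in the table. The snippet below is only arithmetic on those published numbers; the class and method names are made up for this illustration, and the error bars (notably the very wide one on the `rculock`/10 ms row) are ignored.

```java
// Recomputes speedup/slowdown ratios from the quick-method scores in the table.
class SpeedupRatios {
    // Throughput ratio: how many times faster `a` is than `b`.
    static double ratio(double a, double b) {
        return a / b;
    }

    public static void main(String[] args) {
        // Scores in ops/s, copied from the benchmark table.
        double unsync10 = 2420777624.146, rwlock10 = 4414608.947, rcu10 = 196537498.940;
        double unsync100 = 2244724220.494, rwlock100 = 4541641.249, rcu100 = 772304327.917;

        System.out.printf("10 ms:  rculock vs rwlock speedup  = %.1fx%n", ratio(rcu10, rwlock10));
        System.out.printf("10 ms:  rculock vs unsync slowdown = %.1fx%n", ratio(unsync10, rcu10));
        System.out.printf("100 ms: rculock vs rwlock speedup  = %.1fx%n", ratio(rcu100, rwlock100));
        System.out.printf("100 ms: rculock vs unsync slowdown = %.1fx%n", ratio(unsync100, rcu100));
    }
}
```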
Benchmark                     (impl)   (pause)   Mode  Cnt           Score           Error  Units
SwitchPointBenchmark.g:quick  rculock        0  thrpt    5     1417151.441 ±     72183.322  ops/s
SwitchPointBenchmark.g:slow   rculock        0  thrpt    5        4486.629 ±       201.970  ops/s
Note that these results come from a single forked VM and just a few short iterations, but they give some idea of the order of magnitude.
-------------
PR Comment: https://git.openjdk.org/crac/pull/58#issuecomment-1507121658