RFR: 8294775: Shenandoah: reduce contention on _threads_in_evac
Nick Gasson
ngasson at openjdk.org
Tue Oct 11 12:36:31 UTC 2022
On Tue, 11 Oct 2022 12:02:43 GMT, Roman Kennke <rkennke at openjdk.org> wrote:
>> The idea here is to reduce contention on the shared `_threads_in_evac` counter by splitting its state over multiple independent cache lines. Each thread hashes to one particular counter based on its `Thread*`. This helps improve throughput of concurrent evacuation where many Java threads may be attempting to update this counter on the load barrier slow path.
>>
>> See this earlier thread for details and SPECjbb results: https://mail.openjdk.org/pipermail/shenandoah-dev/2022-October/017494.html
>>
>> Also tested `hotspot_gc_shenandoah` on x86 and AArch64.
>
> src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.cpp line 40:
>
>> 38:
>> 39: volatile jint *ShenandoahEvacOOMHandler::threads_in_evac_ptr(Thread* t) {
>> 40: uint64_t key = (uintptr_t)t;
>
> Maybe put that in a separate hash(Thread*) function? Also, is that a particular documented hash-function?(Related: In Lilliput project, I am working on a different identity-hash-code implementation, and part of it will be a hash-implementation to hash arbitrary pointers to 32 or 64 bit hash, currently using murmur3. Maybe this could be reused for here, when it happens?)
It is actually the bit mixing function from MurmurHash3. The particular algorithm doesn't matter too much though - I just couldn't find an existing one in the shared code.
> src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.cpp line 55:
>
>> 53: // *and* the counter is zero.
>> 54: while (Atomic::load_acquire(ptr) != OOM_MARKER_MASK) {
>> 55: os::naked_short_sleep(1);
>
> Not sure if SpinPause() may be better here? @shipilev probably knows more.
I think we'd probably want some back-off here rather than spinning indefinitely? E.g. spin N times and then start sleeping.
> src/hotspot/share/gc/shenandoah/shenandoahEvacOOMHandler.hpp line 88:
>
>> 86: static const jint OOM_MARKER_MASK;
>> 87:
>> 88: static constexpr jint EVAC_COUNTER_BUCKETS = 64;
>
> Maybe it'd be useful to not hardwire this? It could be a runtime option, possibly diagnostic (not sure). Many workloads would not even use so many threads...
If we're going to make it dynamic maybe it should be set to the number of physical CPUs?
-------------
PR: https://git.openjdk.org/jdk/pull/10573
More information about the hotspot-gc-dev
mailing list