[8] RFR: Shenandoah: specialize String Table scans for better pause performance
Aleksey Shipilev
shade at redhat.com
Tue May 19 09:23:24 UTC 2020
This is a sh/jdk8-specific fix. String Table scans dominate some workloads, notably Cassandra, but
they are also visible in simple hello-world scenarios. The underlying reason is that parallel
StringTable scans use a very small chunk size, so on decently-sized tables the workers hit the
contended atomic very often. A larger chunk size is needed to improve performance.
Since that chunk size is hard-coded and entrenched in the 8u upstream code, I chose to copy-paste
the relevant method and specialize it for Shenandoah. 11u and later code handles it better with
OopStorage and concurrent hash tables, so this specialization is only needed for 8u.
8u webrev:
https://cr.openjdk.java.net/~shade/shenandoah/8u-stringtable-scan/webrev.01/
This yields impressive improvements on Cassandra perf tests; see the example GC cycles:
=== Baseline:
Pause Init Mark (N) 3646 us
  Accumulate Stats 18 us
  Make Parsable 4 us
  Update Region States 86 us
  Scan Roots 3478 us
    S: <total> 26681 us
    S: Thread Roots 1521 us, workers (us): 468, 392, 111, 122, 113, 106, 111, 98,
    S: Universe Roots 3 us, workers (us): 3, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Handles Roots 3 us, workers (us): 3, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Weak Roots 25 us, workers (us): ---, ---, ---, 25, ---, ---, ---, ---,
    S: String Table Roots 24644 us, workers (us): 2857, 2935, 3141, 3122, 3148, 3148, 3148, 3145,
    S: Synchronizer Roots 3 us, workers (us): 2, 0, 0, 0, 0, 0, 0, 0,
    S: Flat Profiler Roots 2 us, workers (us): ---, ---, 2, ---, ---, ---, ---, ---,
    S: Management Roots 2 us, workers (us): 2, ---, ---, ---, ---, ---, ---, ---,
    S: System Dict Roots 10 us, workers (us): ---, ---, 10, ---, ---, ---, ---, ---,
    S: CLDG Roots 458 us, workers (us): 12, 13, 75, 72, 74, 75, 65, 72,
    S: JVMTI Roots 10 us, workers (us): ---, 10, ---, ---, ---, ---, ---, ---,
  Resize TLABs 11 us
=== Patched:
Pause Init Mark (N) 999 us
  Accumulate Stats 17 us
  Make Parsable 4 us
  Update Region States 92 us
  Scan Roots 824 us
    S: <total> 5494 us
    S: Thread Roots 1584 us, workers (us): 121, 613, 114, 131, 301, 110, 102, 93,
    S: Universe Roots 2 us, workers (us): 2, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Handles Roots 4 us, workers (us): ---, 4, ---, ---, ---, ---, ---, ---,
    S: JNI Weak Roots 20 us, workers (us): ---, ---, ---, ---, ---, 20, ---, ---,
    S: String Table Roots 3374 us, workers (us): 506, 81, 479, 487, 372, 482, 473, 494,
    S: Synchronizer Roots 4 us, workers (us): 0, 0, 3, 0, 0, 0, 0, 0,
    S: Flat Profiler Roots 5 us, workers (us): ---, 5, ---, ---, ---, ---, ---, ---,
    S: Management Roots 1 us, workers (us): 1, ---, ---, ---, ---, ---, ---, ---,
    S: System Dict Roots 10 us, workers (us): ---, ---, 10, ---, ---, ---, ---, ---,
    S: CLDG Roots 478 us, workers (us): 79, 12, 77, 69, 13, 75, 75, 78,
    S: JVMTI Roots 10 us, workers (us): 10, ---, ---, ---, ---, ---, ---, ---,
  Resize TLABs 10 us
Testing: hotspot_gc_shenandoah, Cassandra benches, eyeballing gc logs
--
Thanks,
-Aleksey