[8] RFR: Shenandoah: specialize String Table scans for better pause performance
rkennke at redhat.com
Tue May 19 09:36:47 UTC 2020
Looks good to me!
Thanks,
Roman
> This is a sh/jdk8-specific fix. String Table scans dominate pause times
> in some workloads, notably Cassandra, but they are also visible in simple
> hello-world scenarios. The underlying reason is that parallel StringTable
> scans use a very small chunk size, so on decently sized tables we hit the
> contended atomic very often. A larger chunk size is needed to improve
> performance.
>
> Since that chunk size is hard-coded and entrenched in the 8u upstream
> code, I chose to copy-paste the relevant method and specialize it for
> Shenandoah. 11u and later code handles it better with OopStorage and
> concurrent hash tables, so this specialization is only needed for 8u.
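
To illustrate the chunk-size effect described above, here is a minimal
standalone C++ sketch. It is not the HotSpot code: simulate_scan, ScanResult,
and the table and chunk sizes used with it are hypothetical names and values
chosen for illustration. Workers claim ranges of buckets by bumping a shared
atomic cursor, and the sketch counts how many atomic claims a full scan needs.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Result of one simulated parallel scan (hypothetical type).
struct ScanResult {
  std::size_t buckets_visited;  // buckets covered by all workers together
  std::size_t claims;           // atomic fetch_add calls on the shared cursor
};

// Hypothetical stand-in for a parallel table scan: each worker repeatedly
// claims the range [start, start + chunk_size) from a shared atomic cursor
// until the table is exhausted.
ScanResult simulate_scan(std::size_t table_size, std::size_t chunk_size,
                         unsigned workers) {
  assert(chunk_size > 0 && workers > 0);
  std::atomic<std::size_t> cursor{0};
  std::atomic<std::size_t> visited{0};
  std::atomic<std::size_t> claims{0};

  auto worker = [&]() {
    std::size_t local_visited = 0;
    std::size_t local_claims = 0;
    for (;;) {
      // This is the contended operation: every claim touches the same word.
      std::size_t start = cursor.fetch_add(chunk_size);
      ++local_claims;
      if (start >= table_size) break;  // nothing left to claim
      std::size_t end = std::min(start + chunk_size, table_size);
      local_visited += end - start;    // a real scan would walk the buckets here
    }
    visited.fetch_add(local_visited);
    claims.fetch_add(local_claims);
  };

  std::vector<std::thread> pool;
  for (unsigned i = 0; i < workers; ++i) pool.emplace_back(worker);
  for (auto& t : pool) t.join();
  return {visited.load(), claims.load()};
}
```

Under these assumptions, scanning 1,000,000 buckets with 8 workers takes
31,258 atomic claims at a chunk size of 32, but only 985 at a chunk size of
1024. Each worker issues one final claim that finds the table exhausted; the
rest is ceil(table_size / chunk_size).
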
>
> 8u webrev:
>
> https://cr.openjdk.java.net/~shade/shenandoah/8u-stringtable-scan/webrev.01/
>
> This yields impressive improvements on Cassandra perf tests; see the
> example GC cycles:
>
> === Baseline:
>
> Pause Init Mark (N)          3646 us
> Accumulate Stats               18 us
> Make Parsable                   4 us
> Update Region States           86 us
> Scan Roots                   3478 us
> S: <total>                  26681 us
> S: Thread Roots              1521 us, workers (us): 468, 392, 111, 122, 113, 106, 111, 98,
> S: Universe Roots               3 us, workers (us): 3, ---, ---, ---, ---, ---, ---, ---,
> S: JNI Handles Roots            3 us, workers (us): 3, ---, ---, ---, ---, ---, ---, ---,
> S: JNI Weak Roots              25 us, workers (us): ---, ---, ---, 25, ---, ---, ---, ---,
> S: String Table Roots       24644 us, workers (us): 2857, 2935, 3141, 3122, 3148, 3148, 3148, 3145,
> S: Synchronizer Roots           3 us, workers (us): 2, 0, 0, 0, 0, 0, 0, 0,
> S: Flat Profiler Roots          2 us, workers (us): ---, ---, 2, ---, ---, ---, ---, ---,
> S: Management Roots             2 us, workers (us): 2, ---, ---, ---, ---, ---, ---, ---,
> S: System Dict Roots           10 us, workers (us): ---, ---, 10, ---, ---, ---, ---, ---,
> S: CLDG Roots                 458 us, workers (us): 12, 13, 75, 72, 74, 75, 65, 72,
> S: JVMTI Roots                 10 us, workers (us): ---, 10, ---, ---, ---, ---, ---, ---,
> Resize TLABs                   11 us
>
> === Patched:
> Pause Init Mark (N)           999 us
> Accumulate Stats               17 us
> Make Parsable                   4 us
> Update Region States           92 us
> Scan Roots                    824 us
> S: <total>                   5494 us
> S: Thread Roots              1584 us, workers (us): 121, 613, 114, 131, 301, 110, 102, 93,
> S: Universe Roots               2 us, workers (us): 2, ---, ---, ---, ---, ---, ---, ---,
> S: JNI Handles Roots            4 us, workers (us): ---, 4, ---, ---, ---, ---, ---, ---,
> S: JNI Weak Roots              20 us, workers (us): ---, ---, ---, ---, ---, 20, ---, ---,
> S: String Table Roots        3374 us, workers (us): 506, 81, 479, 487, 372, 482, 473, 494,
> S: Synchronizer Roots           4 us, workers (us): 0, 0, 3, 0, 0, 0, 0, 0,
> S: Flat Profiler Roots          5 us, workers (us): ---, 5, ---, ---, ---, ---, ---, ---,
> S: Management Roots             1 us, workers (us): 1, ---, ---, ---, ---, ---, ---, ---,
> S: System Dict Roots           10 us, workers (us): ---, ---, 10, ---, ---, ---, ---, ---,
> S: CLDG Roots                 478 us, workers (us): 79, 12, 77, 69, 13, 75, 75, 78,
> S: JVMTI Roots                 10 us, workers (us): 10, ---, ---, ---, ---, ---, ---, ---,
> Resize TLABs                   10 us
>
>
> Testing: hotspot_gc_shenandoah, Cassandra benches, eyeballing gc logs
>