[8] RFR: Shenandoah: specialize String Table scans for better pause performance

Aleksey Shipilev shade at redhat.com
Tue May 19 09:23:24 UTC 2020


This is a sh/jdk8-specific fix. String Table scans dominate pause times in some workloads, notably
Cassandra, but they are also visible in simple hello-world scenarios. The underlying reason is that
parallel StringTable scans use a very small chunk size, so on decently-sized tables we hit the
contended atomic very often. A larger chunk size is needed to improve performance.
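
For illustration only, here is a minimal Java sketch (not the HotSpot code) of why the chunk size
matters: workers claim ranges of buckets by bumping a shared atomic counter, so the number of
contended updates is roughly the table size divided by the chunk size. The class name, method
names, table size and both chunk sizes are hypothetical, picked only to make the effect visible.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ChunkedClaimSketch {
    /** Totals from one simulated scan. */
    static final class Result {
        final long bucketsVisited; // buckets walked by all workers together
        final long claims;         // updates of the shared claim counter
        Result(long bucketsVisited, long claims) {
            this.bucketsVisited = bucketsVisited;
            this.claims = claims;
        }
    }

    /** Simulates `workers` threads scanning `tableSize` buckets in chunks of `chunkSize`. */
    static Result scan(int tableSize, int chunkSize, int workers) {
        AtomicInteger next = new AtomicInteger(0); // shared, contended claim index
        AtomicLong visited = new AtomicLong();
        AtomicLong claims = new AtomicLong();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                long localVisited = 0;
                long localClaims = 0;
                while (true) {
                    // One contended atomic update per claimed chunk.
                    int start = next.getAndAdd(chunkSize);
                    localClaims++;
                    if (start >= tableSize) {
                        break; // table exhausted; this last claim found no work
                    }
                    int end = Math.min(start + chunkSize, tableSize);
                    for (int i = start; i < end; i++) {
                        localVisited++; // stand-in for scanning bucket i
                    }
                }
                visited.addAndGet(localVisited);
                claims.addAndGet(localClaims);
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new Result(visited.get(), claims.get());
    }

    public static void main(String[] args) {
        // Hypothetical sizes, not the values used by the 8u code or the patch.
        Result small = scan(65536, 32, 8);
        Result large = scan(65536, 1024, 8);
        System.out.println("chunk=32:   claims=" + small.claims);
        System.out.println("chunk=1024: claims=" + large.claims);
    }
}
```

With these hypothetical numbers, a 65536-bucket table costs 2048 successful claims at chunk size 32
and 64 at chunk size 1024, plus one final failed claim per worker in both cases.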

Since that chunk size is hard-coded and entrenched in the 8u upstream code, I chose to copy-paste
the relevant method and specialize it for Shenandoah. 11u and later code handles it better with
OopStorage and concurrent hash tables, so this specialization is only needed for 8u.

8u webrev:
  https://cr.openjdk.java.net/~shade/shenandoah/8u-stringtable-scan/webrev.01/

This yields impressive improvements on Cassandra perf tests; see the example GC cycles:

=== Baseline:

Pause Init Mark (N)           3646 us
  Accumulate Stats              18 us
  Make Parsable                  4 us
  Update Region States          86 us
  Scan Roots                  3478 us
    S: <total>               26681 us
    S: Thread Roots           1521 us, workers (us): 468, 392, 111, 122, 113, 106, 111,  98,
    S: Universe Roots            3 us, workers (us):   3, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Handles Roots         3 us, workers (us):   3, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Weak Roots           25 us, workers (us): ---, ---, ---,  25, ---, ---, ---, ---,
    S: String Table Roots    24644 us, workers (us): 2857, 2935, 3141, 3122, 3148, 3148, 3148, 3145,
    S: Synchronizer Roots        3 us, workers (us):   2,   0,   0,   0,   0,   0,   0,   0,
    S: Flat Profiler Roots       2 us, workers (us): ---, ---,   2, ---, ---, ---, ---, ---,
    S: Management Roots          2 us, workers (us):   2, ---, ---, ---, ---, ---, ---, ---,
    S: System Dict Roots        10 us, workers (us): ---, ---,  10, ---, ---, ---, ---, ---,
    S: CLDG Roots              458 us, workers (us):  12,  13,  75,  72,  74,  75,  65,  72,
    S: JVMTI Roots              10 us, workers (us): ---,  10, ---, ---, ---, ---, ---, ---,
  Resize TLABs                  11 us

=== Patched:

Pause Init Mark (N)            999 us
  Accumulate Stats              17 us
  Make Parsable                  4 us
  Update Region States          92 us
  Scan Roots                   824 us
    S: <total>                5494 us
    S: Thread Roots           1584 us, workers (us): 121, 613, 114, 131, 301, 110, 102,  93,
    S: Universe Roots            2 us, workers (us):   2, ---, ---, ---, ---, ---, ---, ---,
    S: JNI Handles Roots         4 us, workers (us): ---,   4, ---, ---, ---, ---, ---, ---,
    S: JNI Weak Roots           20 us, workers (us): ---, ---, ---, ---, ---,  20, ---, ---,
    S: String Table Roots     3374 us, workers (us): 506,  81, 479, 487, 372, 482, 473, 494,
    S: Synchronizer Roots        4 us, workers (us):   0,   0,   3,   0,   0,   0,   0,   0,
    S: Flat Profiler Roots       5 us, workers (us): ---,   5, ---, ---, ---, ---, ---, ---,
    S: Management Roots          1 us, workers (us):   1, ---, ---, ---, ---, ---, ---, ---,
    S: System Dict Roots        10 us, workers (us): ---, ---,  10, ---, ---, ---, ---, ---,
    S: CLDG Roots              478 us, workers (us):  79,  12,  77,  69,  13,  75,  75,  78,
    S: JVMTI Roots              10 us, workers (us):  10, ---, ---, ---, ---, ---, ---, ---,
  Resize TLABs                  10 us


Testing: hotspot_gc_shenandoah, Cassandra benches, eyeballing gc logs

-- 
Thanks,
-Aleksey


