RFR: 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts
Neethu Prasad
nprasad at openjdk.org
Wed Jul 24 21:17:39 UTC 2024
**Notes**
This PR adds the following
1. info logging on number of SATB flush attempts
2. total time spend on handshaking all threads requesting them to flush their SATB buffers.
As suggested by William in [JDK-8336742 ](https://bugs.openjdk.org/browse/JDK-83367420), we can use handshake logging to get time spend and other stats for each handshake.
[4.515s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1035, Total completion time: 597004 ns
[4.517s][info][handshake ] Handshake "Shenandoah Flush SATB Handshake", Targeted threads: 1036, Executed by requesting thread: 1033, Total completion time: 207402 ns
**Testing**
1. tier1, tier2 and hotspot_gc_shenandoah tests.
2. **-Xlog:gc+stats=info**
[4.058s][info][gc,stats ] Concurrent Marking = 0.080 s (a = 5351 us) (n = 15) (lvls, us = 4746, 5000, 5156, 5684, 5988)
[4.058s][info][gc,stats ] SATB Flush Rendezvous = 0.013 s (a = 860 us) (n = 15) (lvls, us = 764, 814, 836, 885, 961)
[4.058s][info][gc,stats ] Pause Final Mark (G) = 0.058 s (a = 3839 us) (n = 15) (lvls, us = 3047, 3320, 3867, 4121, 4930)
[4.058s][info][gc,stats ] Pause Final Mark (N) = 0.054 s (a = 3592 us) (n = 15) (lvls, us = 2812, 3047, 3574, 3887, 4597)
[4.058s][info][gc,stats ] Finish Mark = 0.028 s (a = 1843 us) (n = 15) (lvls, us = 1602, 1641, 1816, 1934, 2045)
[4.058s][info][gc,stats ] Update Region States = 0.006 s (a = 386 us) (n = 15) (lvls, us = 375, 375, 381, 389, 413)
[4.058s][info][gc,stats ] Choose Collection Set = 0.018 s (a = 1186 us) (n = 15) (lvls, us = 609, 619, 1309, 1387, 2109)
[4.058s][info][gc,stats ] Rebuild Free Set = 0.001 s (a = 43 us) (n = 15) (lvls, us = 40, 41, 42, 43, 53)
[4.058s][info][gc,stats ] Concurrent Weak References = 0.007 s (a = 452 us) (n = 15) (lvls, us = 420, 438, 443, 455, 487)
on app termination
[5.299s][info][gc,stats] GC STATISTICS:
[5.299s][info][gc,stats] "(G)" (gross) pauses include VM time: time to notify and block threads, do the pre-
[5.299s][info][gc,stats] and post-safepoint housekeeping. Use -Xlog:safepoint+stats to dissect.
[5.299s][info][gc,stats] "(N)" (net) pauses are the times spent in the actual GC code.
[5.299s][info][gc,stats] "a" is average time for each phase, look at levels to see if average makes sense.
[5.299s][info][gc,stats] "lvls" are quantiles: 0% (minimum), 25%, 50% (median), 75%, 100% (maximum).
[5.299s][info][gc,stats]
[5.299s][info][gc,stats] All times are wall-clock times, except per-root-class counters, that are sum over
[5.299s][info][gc,stats] all workers. Dividing the <total> over the root stage time estimates parallelism.
[5.299s][info][gc,stats]
[5.299s][info][gc,stats] Pacing delays are measured from entering the pacing code till exiting it. Therefore,
[5.299s][info][gc,stats] observed pacing delays may be higher than the threshold when paced thread spent more
[5.299s][info][gc,stats] time in the pacing code. It usually happens when thread is de-scheduled while paced,
[5.299s][info][gc,stats] OS takes longer to unblock the thread, or JVM experiences an STW pause.
[5.299s][info][gc,stats]
[5.299s][info][gc,stats] Higher delay would prevent application outpacing the GC, but it will hide the GC latencies
[5.299s][info][gc,stats] from the STW pause times. Pacing affects the individual threads, and so it would also be
[5.299s][info][gc,stats] invisible to the usual profiling tools, but would add up to end-to-end application latency.
[5.299s][info][gc,stats] Raise max pacing delay with care.
[5.299s][info][gc,stats]
[5.299s][info][gc,stats] Concurrent Reset = 0.062 s (a = 2224 us) (n = 28) (lvls, us = 1406, 2109, 2207, 2285, 3562)
[5.299s][info][gc,stats] Pause Init Mark (G) = 0.093 s (a = 3311 us) (n = 28) (lvls, us = 3047, 3242, 3281, 3340, 3632)
[5.299s][info][gc,stats] Pause Init Mark (N) = 0.084 s (a = 3018 us) (n = 28) (lvls, us = 2910, 2949, 2988, 3027, 3215)
[5.299s][info][gc,stats] Update Region States = 0.011 s (a = 401 us) (n = 28) (lvls, us = 361, 391, 398, 404, 445)
[5.299s][info][gc,stats] Concurrent Mark Roots = 0.118 s (a = 4207 us) (n = 28) (lvls, us = 4004, 4082, 4160, 4297, 4431)
[5.299s][info][gc,stats] CMR: <total> = 5.520 s (a = 197144 us) (n = 28) (lvls, us = 175781, 183594, 195312, 203125, 253494)
[5.300s][info][gc,stats] CMR: Thread Roots = 5.485 s (a = 195881 us) (n = 28) (lvls, us = 173828, 183594, 193359, 203125, 252361)
[5.300s][info][gc,stats] CMR: VM Strong Roots = 0.014 s (a = 486 us) (n = 28) (lvls, us = 418, 445, 475, 523, 594)
[5.300s][info][gc,stats] CMR: CLDG Roots = 0.022 s (a = 777 us) (n = 28) (lvls, us = 549, 604, 633, 711, 4005)
[5.300s][info][gc,stats] Concurrent Marking = 0.147 s (a = 5241 us) (n = 28) (lvls, us = 4434, 5020, 5254, 5430, 5866)
[5.300s][info][gc,stats] SATB Flush Rendezvous = 0.025 s (a = 905 us) (n = 28) (lvls, us = 746, 816, 846, 883, 1458)
[5.300s][info][gc,stats] Pause Final Mark (G) = 0.110 s (a = 3931 us) (n = 28) (lvls, us = 3066, 3633, 3809, 4062, 6050)
[5.300s][info][gc,stats] Pause Final Mark (N) = 0.101 s (a = 3589 us) (n = 28) (lvls, us = 2754, 3340, 3574, 3730, 4880)
[5.300s][info][gc,stats] Finish Mark = 0.050 s (a = 1793 us) (n = 28) (lvls, us = 1387, 1582, 1777, 1914, 2456)
[5.300s][info][gc,stats] Update Region States = 0.011 s (a = 388 us) (n = 28) (lvls, us = 373, 377, 381, 395, 434)
3. **-Xlog:gc+phases=info**
`[2.735s][info][gc,phases] GC(0) Total SATB flush attempts : 2`
-------------
Commit messages:
- 8336742: Shenandoah: Add more verbose logging/stats for mark termination attempts
Changes: https://git.openjdk.org/jdk/pull/20318/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=20318&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8336742
Stats: 10 lines in 2 files changed: 9 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/20318.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/20318/head:pull/20318
PR: https://git.openjdk.org/jdk/pull/20318
More information about the hotspot-gc-dev
mailing list