RFR: 8350313: Include timings for leaving safepoint in safepoint logging
Aleksey Shipilev
shade at openjdk.org
Tue Feb 25 09:13:00 UTC 2025
On Mon, 24 Feb 2025 20:48:25 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
> Hi there,
> Please help with reviewing the PR. The change is to improve the observability of timings of safepoints. We have a production use case where leaving the safepoint introduced a significant latency, mostly due to VMThread getting de-scheduled, more details can be find in JBS bug [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324).
>
> Current safepoint log is like:
>
> [3.427s][info][gc ] GC(6) Pause Final Mark (unload classes) 0.201ms
> [3.428s][info][safepoint] Safepoint "ShenandoahFinalMarkStartEvac", Time since last: 2463125 ns, Reaching safepoint: 14958 ns, At safepoint: 410834 ns, Total: 425792 ns
>
>
>
> The "At safepoint" time include the time to evaluate the VM operation and the time to leave safepoint(cleanup, disarm WaitBarrier..etc.), from safepoint log we don't really know how much time it takes to leave safepoint.
>
> As proposed in [JDK-8350313](https://bugs.openjdk.org/browse/JDK-8350313), we think a "Leaving safepoint" counter just like "Reaching safepoint" could be nice to have. It comes with the symmetry advantage with "Reaching safepoint" counter. "Leaving safepoint" measures the time spent in safepoint machinery from the "finishing side". And, it more clearly captures the transitional state where some threads might be still at safepoint, and some have already unparked, like "Reaching safepoint".
>
> With the new "Leaving safepoint" counter, the log will be like:
>
> [17.764s][info][gc ] GC(84) Pause Final Mark (unload classes) 0.179ms
> [17.764s][info][safepoint] Safepoint "ShenandoahFinalMarkStartEvac", Time since last: 3340792 ns, Reaching safepoint: 31250 ns, At safepoint: 187958 ns, Leaving safepoint: 177833 ns, Total: 397041 ns
>
>
> ### Tests
> - [x] Verified the new safepoint log
> - [x] Tier 2
Looks good conceptually, I think we just need better names.
src/hotspot/share/runtime/safepoint.cpp line 870:
> 868: jlong SafepointTracing::_last_safepoint_begin_time_ns = 0;
> 869: jlong SafepointTracing::_last_safepoint_sync_time_ns = 0;
> 870: jlong SafepointTracing::_last_vmop_evaluation_end_time_ns = 0;
>From the adjacent names, I think it should be `_last_safepoint_leave_time_ns`.
src/hotspot/share/runtime/safepoint.hpp line 255:
> 253: static void begin(VM_Operation::VMOp_Type type);
> 254: static void synchronized(int nof_threads, int nof_running, int traps);
> 255: static void end_vmop_evaluation();
I think of a little better name: `leave()`?
-------------
PR Review: https://git.openjdk.org/jdk/pull/23756#pullrequestreview-2640156507
PR Review Comment: https://git.openjdk.org/jdk/pull/23756#discussion_r1969293400
PR Review Comment: https://git.openjdk.org/jdk/pull/23756#discussion_r1969292458
More information about the hotspot-runtime-dev
mailing list