RFR: 8369283: Improve trace logs in safepoint machinery [v2]
David Holmes
dholmes at openjdk.org
Wed Oct 8 05:48:11 UTC 2025
On Tue, 7 Oct 2025 15:40:31 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> During the performance investigations in safepoint machinery (notably [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324)), I found the trace logging lacking. It would be good to improve it in order to finely profile various microscopic things like thread list walks, the blocking/resuming of Java threads, etc. For [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324), for example, I want to be able to measure the Java thread delays if they assist in avalanche wakeups.
>>
>> This leans heavily on unified logging and invariant clocks to do the right thing, but I think it is a fair compromise for simplicity. The configuration I found most useful for testing is:
>>
>>
>> -Xlog:async -Xlog:safepoint=trace:file=safepoint.log:uptimenanos
>>
>>
>> My tentative plan is to visualize this more comprehensively with a little tool that digests that log.
>>
>> I am open for bikeshedding on logging wording. The log example is in the comment below.
>>
>> Additional testing:
>> - [x] Ad-hoc log peeking
>> - [x] Linux x86_64 server fastdebug, `tier1`
>> - [ ] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Cannot allocate for Thread::name + adjust debug levels
Generally seems okay but a few queries.
src/hotspot/share/runtime/safepoint.cpp line 169:
> 167: LogStream ls(lt);
> 168: cur_state->print_on(&ls);
> 169: }
Why is this not useful for debugging? I can understand you might not need it for your profiling but UL is primarily about debugging assistance.
src/hotspot/share/runtime/safepoint.cpp line 261:
> 259:
> 260: if (still_running > 0) {
> 261: log_trace(safepoint)("Waiting for threads to block");
Suggestion:
log_trace(safepoint)("Waiting for %d threads to block", still_running);
?
src/hotspot/share/runtime/safepoint.cpp line 348:
> 346: _nof_threads_hit_polling_page = 0;
> 347:
> 348: log_debug(safepoint)("Safepoint synchronization initiated using %s wait barrier. (%d threads)", _wait_barrier->description(), nof_threads);
I don't see the point in moving this partially into `SafepointTracing::begin` but losing the information as to how many threads are being handled.
src/hotspot/share/runtime/sharedRuntime.cpp line 643:
> 641: return raw_exception_handler_for_return_address(current, return_address);
> 642: JRT_END
> 643:
Again this seems useful for debugging.
-------------
PR Review: https://git.openjdk.org/jdk/pull/27673#pullrequestreview-3312869992
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412549283
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412529765
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412543617
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412572697
More information about the hotspot-dev
mailing list