RFR: 8369283: Improve trace logs in safepoint machinery [v2]

David Holmes dholmes at openjdk.org
Wed Oct 8 05:48:11 UTC 2025


On Tue, 7 Oct 2025 15:40:31 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

>> During the performance investigations in safepoint machinery (notably [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324)), I found the trace logging lacking. It would be good to improve it in order to finely profile various microscopic things like thread list walks, the blocking/resuming of Java threads, etc. For [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324), for example, I want to be able to measure the Java thread delays if they assist in avalanche wakeups.
>> 
>> This leans heavily on unified logging and invariant clocks to do the right thing, but I think it is a fair compromise for simplicity. The configuration I found most useful for testing is:
>> 
>> 
>> -Xlog:async -Xlog:safepoint=trace:file=safepoint.log:uptimenanos
>> 
>> 
>> My tentative plan is to visualize this more comprehensively with a little tool that digests that log.
>> 
>> I am open for bikeshedding on logging wording. The log example is in the comment below.
>> 
>> Additional testing:
>>  - [x] Ad-hoc log peeking
>>  - [x] Linux x86_64 server fastdebug, `tier1`
>>  - [ ] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Cannot allocate for Thread::name + adjust debug levels

Generally seems okay but a few queries.

src/hotspot/share/runtime/safepoint.cpp line 169:

> 167:     LogStream ls(lt);
> 168:     cur_state->print_on(&ls);
> 169:   }

Why is this not useful for debugging? I can understand you might not need it for your profiling but UL is primarily about debugging assistance.

src/hotspot/share/runtime/safepoint.cpp line 261:

> 259: 
> 260:     if (still_running > 0) {
> 261:       log_trace(safepoint)("Waiting for threads to block");

Suggestion:

      log_trace(safepoint)("Waiting for %d threads to block", still_running);

?

src/hotspot/share/runtime/safepoint.cpp line 348:

> 346:   _nof_threads_hit_polling_page = 0;
> 347: 
> 348:   log_debug(safepoint)("Safepoint synchronization initiated using %s wait barrier. (%d threads)", _wait_barrier->description(), nof_threads);

I don't see the point in moving this partially into `SafepointTracing::begin` but losing the information as to how many threads are being handled.

src/hotspot/share/runtime/sharedRuntime.cpp line 643:

> 641:   return raw_exception_handler_for_return_address(current, return_address);
> 642: JRT_END
> 643: 

Again this seems useful for debugging.

-------------

PR Review: https://git.openjdk.org/jdk/pull/27673#pullrequestreview-3312869992
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412549283
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412529765
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412543617
PR Review Comment: https://git.openjdk.org/jdk/pull/27673#discussion_r2412572697


More information about the hotspot-dev mailing list