RFR: 8369283: Improve trace logs in safepoint machinery

Aleksey Shipilev shade at openjdk.org
Tue Oct 7 12:49:14 UTC 2025


On Tue, 7 Oct 2025 12:42:20 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> During the performance investigations in safepoint machinery (notably [JDK-8350324](https://bugs.openjdk.org/browse/JDK-8350324)), I found the trace logging lacking. It would be good to improve it in order to finely profile various microscopic things like thread list walks, the blocking/resuming of Java threads, etc.
> 
> This leans heavily on unified logging to do the right thing. The configuration I found most useful for testing is:
> 
> 
> -Xlog:async -Xlog:safepoint=trace:file=safepoint.log:uptimenanos
> 
> 
> My tentative plan is to visualize this more comprehensively with a little tool that digests that log.
> 
> I am open for bikeshedding on logging wording. The log example is in the comment below.
> 
> Additional testing:
>  - [x] Ad-hoc log peeking
>  - [ ] Linux x86_64 server fastdebug, `tier1`

Example log, plus some comments:


$ ... -Xlog:async -Xlog:safepoint=trace:file=safepoint.log:uptimenanos ...

$ sort safepoint.log | less
[3564874277ns] Safepoint synchronization begins using futex wait barrier
[3564880769ns] Setting thread local yield flag for threads         
<------ polling flags in Java threads are set in 5 us ------->
<------ safepoint machinery makes 1st attempt at synchronizing------>
[3564886400ns] Checking thread status
[3564891580ns] Thread "Worker-0" [0x0000788bcc4af530] is still running
[3564892331ns] Thread "Worker-1" [0x0000788bcc4bb470] is still running
[3564892892ns] Thread "Worker-2" [0x0000788bcc4bbf20] is still running
[3564893734ns] Thread "Worker-3" [0x0000788bcc4bd1e0] is still running
[3564894215ns] Thread "Worker-4" [0x0000788bcc4be7f0] is still running
[3564894666ns] Thread "Worker-5" [0x0000788bcc4bfb90] is still running
[3564895066ns] Thread "Worker-6" [0x0000788bcc4c0f30] is still running
[3564895377ns] Thread "Worker-7" [0x0000788bcc4c22e0] is still running
[3564895918ns] Thread "Allocator-0" [0x0000788bcc4c3d50] is still running
[3564896289ns] Thread "Allocator-1" [0x0000788bcc4c5270] is still running
[3564896740ns] Thread "Allocator-5" [0x0000788bcc4ca490] is still running
[3564896860ns] Waiting for threads to block
<------- first thread responds in 63 us after poll set ----->
[3564943298ns] Blocking thread "Allocator-0" [0x0000788bcc4c3d50]   
[3564945072ns] Blocking thread "Worker-5" [0x0000788bcc4bfb90]
[3564945122ns] Blocking thread "Worker-3" [0x0000788bcc4bd1e0]
[3564945302ns] Blocking thread "Worker-4" [0x0000788bcc4be7f0]
[3564946144ns] Blocking thread "Worker-1" [0x0000788bcc4bb470]
[3564943709ns] Blocking thread "Worker-7" [0x0000788bcc4c22e0]
[3564945943ns] Blocking thread "Worker-6" [0x0000788bcc4c0f30]
<------ safepoint machinery makes 2nd attempt ------>
[3564965701ns] Checking thread status                               
[3564966212ns] Thread "Worker-0" [0x0000788bcc4af530] is still running
[3564966522ns] Thread "Worker-1" [0x0000788bcc4bb470] is now blocked
[3564966743ns] Thread "Worker-2" [0x0000788bcc4bbf20] is still running
[3564966973ns] Thread "Worker-3" [0x0000788bcc4bd1e0] is now blocked
[3564967184ns] Thread "Worker-4" [0x0000788bcc4be7f0] is now blocked
[3564967414ns] Thread "Worker-5" [0x0000788bcc4bfb90] is now blocked
[3564967645ns] Thread "Worker-6" [0x0000788bcc4c0f30] is now blocked
[3564967855ns] Thread "Worker-7" [0x0000788bcc4c22e0] is now blocked
[3564968105ns] Thread "Allocator-0" [0x0000788bcc4c3d50] is now blocked
[3564968316ns] Thread "Allocator-1" [0x0000788bcc4c5270] is still running
[3564968576ns] Thread "Allocator-5" [0x0000788bcc4ca490] is still running
[3564968677ns] Waiting for threads to block
[3564970410ns] Blocking thread "Allocator-5" [0x0000788bcc4ca490]
[3564975119ns] Blocking thread "Worker-2" [0x0000788bcc4bbf20] 
[3564977724ns] Blocking thread "Allocator-1" [0x0000788bcc4c5270]
<------ last thread responds in 103 us after poll set ----->
[3564984196ns] Blocking thread "Worker-0" [0x0000788bcc4af530]     
<------ safepoint machinery makes 3rd attempt ------>
[3565036576ns] Checking thread status
[3564037287ns] Thread "Worker-0" [0x0000788bcc4af530] is now blocked
[3564037658ns] Thread "Worker-2" [0x0000788bcc4bbf20] is now blocked
[3564037898ns] Thread "Allocator-1" [0x0000788bcc4c5270] is now blocked
[3564038189ns] Thread "Allocator-5" [0x0000788bcc4ca490] is now blocked
<------ safepoint machinery detects the last thread with 50 us delay --->
[3564038700ns] Safepoint synchronization finished                  
<------ VM OPERATION RUNS HERE ---->
[3564243099ns] Leaving safepoint
<------ safepoint machinery initiates wakeups --->
<------ first thread experiences 285us total delay --->
[3564253689ns] Resuming thread "Allocator-0" [0x0000788bcc4c3d50]
[3564254010ns] Resuming thread "Worker-1" [0x0000788bcc4bb470]
[3564253729ns] Resuming thread "Worker-4" [0x0000788bcc4be7f0]
[3564254250ns] Resuming thread "Worker-5" [0x0000788bcc4bfb90]
[3564255182ns] Resuming thread "Worker-7" [0x0000788bcc4c22e0]
[3564254871ns] Resuming thread "Worker-3" [0x0000788bcc4bd1e0]
[3564265862ns] Resuming thread "Worker-6" [0x0000788bcc4c0f30]
[3564283796ns] Resuming thread "Allocator-5" [0x0000788bcc4ca490]
[3564392820ns] Safepoint complete
<------ wakeups are complete at this point ----->
[3568389497ns] Resuming thread "Worker-0" [0x0000788bcc4af530]
[3568393484ns] Resuming thread "Worker-2" [0x0000788bcc4bbf20]
<------ last thread experiences 7.5ms total delay --->
[3572390561ns] Resuming thread "Allocator-1" [0x0000788bcc4c5270]

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27673#issuecomment-3376730861


More information about the hotspot-dev mailing list