[Question][ZGC] handshakeAllThreads vs. Per-Thread Handshakes During Mark Phase Termination

Thu Sep 11 09:58:05 UTC 2025

Hello everyone,
I’m trying to better understand how ZGC performs thread coordination at the end of the marking phase.
My questions are:

  *   Is handshakeAllThreads used at the end of the mark phase in ZGC?
  *   If so, what is the difference between thread-local handshakes (i.e., handshaking with one thread at a time) and handshakeAllThreads?
  *   Are thread-local handshaking and handshakeAllThreads fundamentally different mechanisms, or just variations of the same mechanism?
  *   What are the trade-offs or reasons for choosing one over the other in this context?

Background and observations:
While reading the latest ZGC source code, I came across the following call chain near the end of the mark phase:
https://github.com/openjdk/jdk/blob/f4d73d2a3dbeccfd04d49c0cfd690086edd0544f/src/hotspot/share/gc/z/zRemembered.cpp#L561C1-L561C49
ZRemembered::scan_and_follow(ZMark* mark)
→ ZMark::try_terminate_flush()
→ ZMark::flush()
→ Handshake::execute()
From this, it appears that handshakeAllThreads is used during this process.
I wondered whether a global handshake (with all threads) might contribute to observable latency in some cases, although I’m not sure how significant this would be in practice.
In contrast, the following paper describes a different approach:
Albert Mingkun Yang and Tobias Wrigstad,
“Deep Dive into ZGC: A Modern Garbage Collector in OpenJDK”,
ACM Transactions on Programming Languages and Systems (TOPLAS), 2022.
https://dl.acm.org/doi/10.1145/3538532
Section 3.4 (STW2: The End of the Marking Phase) states:
"... thread-local handshaking with each mutator (one mutator at a time) is performed to check for the presence of any to-be-marked objects before attempting an STW pause; this reduces the probability of entering STW2 prematurely."
This seems to suggest a more incremental approach (per-thread handshaking), which could be helpful in minimizing pauses.
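
To make the comparison concrete, the two approaches can be modeled roughly as follows. This is a toy C++ sketch, not HotSpot code: `Mutator`, `Result`, `flush`, `flush_one_at_a_time`, `flush_all_threads`, and `make_mutators` are hypothetical names, and it assumes that a handshake simply moves a mutator's thread-local to-be-marked entries to a shared worklist. It runs single-threaded, so it shows only the shape of the two strategies, not real synchronization or latency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these names are HotSpot APIs.
struct Mutator {
    std::vector<int> local_marks;  // assumed thread-local to-be-marked entries
};

struct Result {
    std::size_t requests;  // handshake requests issued by the initiator
    bool found_work;       // true if any mutator still had entries to mark
};

// Work done on behalf of one mutator during a handshake (assumed): move its
// local entries to the shared worklist and report whether there were any.
bool flush(Mutator& m, std::vector<int>& shared) {
    const bool had_work = !m.local_marks.empty();
    shared.insert(shared.end(), m.local_marks.begin(), m.local_marks.end());
    m.local_marks.clear();
    assert(m.local_marks.empty());
    return had_work;
}

// Per-thread style: one request per mutator, issued one after another.
Result flush_one_at_a_time(std::vector<Mutator>& mutators,
                           std::vector<int>& shared) {
    Result r{0, false};
    for (Mutator& m : mutators) {
        ++r.requests;
        r.found_work = flush(m, shared) || r.found_work;
    }
    return r;
}

// All-threads style: a single request that covers every mutator.
Result flush_all_threads(std::vector<Mutator>& mutators,
                         std::vector<int>& shared) {
    Result r{1, false};
    for (Mutator& m : mutators) {
        r.found_work = flush(m, shared) || r.found_work;
    }
    return r;
}

// Three hypothetical mutators; the second has nothing left to mark.
std::vector<Mutator> make_mutators() {
    return {Mutator{{1, 2}}, Mutator{{}}, Mutator{{3}}};
}
```

Calling either function on `make_mutators()` leaves the same three entries in the shared worklist; in this model the two differ only in the number of requests the initiator issues (3 versus 1). Whether the real mechanisms differ in more than that is part of what the questions above ask.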
Hence my questions above. I’d appreciate any clarification of the actual behavior and of the design decisions behind using handshakeAllThreads vs. thread-local handshakes.
Thank you!

==================
NTT R&D
Oh Sato
oh.sato at ntt.com