What gc.log lines reliably indicate STW pauses for ZGC?

Misha Dmitriev mdmitriev at linkedin.com
Wed Sep 6 19:15:49 UTC 2023


Hi ZGC developers,

I work at LinkedIn, where we have internal tooling for end-to-end GC performance monitoring and analysis. To record GC pauses and other interesting events, we parse GC logs as they are generated, in near real time. Since JDK 11, the unified logging format allowed us to extract STW pauses for all GCs in the same straightforward way: we look for  lines such as:

[2020-03-09T19:12:22.542+0000][info][gc             ] GC(62189) Pause Young (Allocation Failure) 14263M->9334M(14745M) 41.359ms

that is, we check every line for the following condition (we don't use pattern matching for speed):

if (line.endsWith("ms") && line.contains("[gc ") && line.contains("Pause ")) {
  // Extract pause time in ms from the end of the line

However, looks like ZGC, though it uses the unified logging format, has a different format for STW pause lines. From the logs that we have, I am not 100% sure I understand everything. So far looks like the candidates lines look like

[2023-09-05T18:47:27.323+0000][505881.615s][info   ][gc,phases        ] GC(76657) Pause Mark Start 0.024ms

i.e. the condition should be

if (line.endsWith("ms") && line.contains("[gc,phases ") && line.contains("Pause ")) {
  // Extract pause time in ms from the end of line

I wonder if this is all we need or I am missing something?

Also, what's the meaning of lines like these:

[2023-09-05T18:47:44.003+0000][505898.295s][debug  ][gc,phases        ] GC(76661) Pause Mark Try Complete (ZWorker#2) 0.002ms

Formally they look similar to the previous line, but "debug" and "ZWorker#2" suggests that it may not be a STW pause but pausing just one thread? Please advise.

Thank you,

Misha

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/zgc-dev/attachments/20230906/3e08636d/attachment-0001.htm>


More information about the zgc-dev mailing list