What gc.log lines reliably indicate STW pauses for ZGC?
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Sep 8 06:08:29 UTC 2023
Hi Misha,
On 2023-09-06 21:15, Misha Dmitriev wrote:
> Hi ZGC developers,
>
> I work at LinkedIn, where we have internal tooling for end-to-end GC
> performance monitoring and analysis. To record GC pauses and other
> interesting events, we parse GC logs as they are generated, in near
> real time. Since JDK 11, the unified logging format allowed us to
> extract STW pauses for all GCs in the same straightforward way: we
> look for lines such as:
>
> [2020-03-09T19:12:22.542+0000][info][gc ] GC(62189) Pause
> Young (Allocation Failure) 14263M->9334M(14745M) 41.359ms
>
> that is, we check every line for the following condition (we don't use
> pattern matching for speed):
>
> if (line.endsWith("ms") && line.contains("[gc ") &&
> line.contains("Pause ")) {
> // Extract pause time in ms from the end of the line
>
> However, looks like ZGC, though it uses the unified logging format,
> has a different format for STW pause lines. From the logs that we
> have, I am not 100% sure I understand everything. So far looks like
> the candidates lines look like
>
> [2023-09-05T18:47:27.323+0000][505881.615s][info ][gc,phases ]
> GC(76657) Pause Mark Start 0.024ms
>
> i.e. the condition should be
>
> if (line.endsWith("ms") && line.contains("[gc,phases ") &&
> line.contains("Pause ")) {
> // Extract pause time in ms from the end of line
>
> I wonder if this is all we need or I am missing something?
I think you also need to search for 'info' to make sure you don't get
the more detailed breakdown of the pauses.
>
> Also, what's the meaning of lines like these:
>
> [2023-09-05T18:47:44.003+0000][505898.295s][debug ][gc,phases ]
> GC(76661) Pause Mark Try Complete (ZWorker#2) 0.002ms
>
> Formally they look similar to the previous line, but "debug" and
> "ZWorker#2" suggests that it may not be a STW pause but pausing just
> one thread? Please advise.
This line prints what one of the GC worker threads was doing during the
Mark End pause. The "(ZWorker#2)" indicates that this a GC worker
thread. You should filter out these lines if you try to collect the GC
pauses.
FWIW, in the version you are running this line is printed on the 'debug'
level. However, in the latest source code this has changed and worker
logging has been moved to the 'trace' level. There we have this breakdown:
1) 'info' logs a full pause or concurrent phase
2) 'debug' logs a subphase of (1)
3) 'trace' logs what individual worker threads are doing
HTH,
StefanK
>
> Thank you,
>
> Misha
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/zgc-dev/attachments/20230908/56fbcdd4/attachment-0001.htm>
More information about the zgc-dev
mailing list