Measuring STW Pauses
Per Liden
per.liden at oracle.com
Wed Nov 21 09:09:06 UTC 2018
On 11/21/18 9:23 AM, Alex Elent wrote:
> Perfect! Watching the video now. Thanks for your help and have a great
> night. Looking forward to sub 1ms pauses :)
Good! There are a more information and presentations available on the
ZGC wiki, in case you haven't see that:
https://wiki.openjdk.java.net/display/zgc/Main
cheers,
Per
>
> On Wed, Nov 21, 2018, 12:13 AM Per Liden <per.liden at oracle.com
> <mailto:per.liden at oracle.com> wrote:
>
> Hi Alex,
>
> On 11/21/18 8:45 AM, Alex Elent wrote:
> > Hi Per,
> >
> > Thanks for reaching out so quickly. I've subscribed to zgc-dev as
> well.
> >
> > Complete transparency, I'm not that familiar with Java GC.
> Although that
> > was probably made obvious from my previous question.
> >
> > Let me give you some context. I'd like to migrate to ZGC from
> Parallel
> > GC. When using Parallel GC the GC log has clear entries for "Pause"
> > events which I understand to be stop the world pauses. For
> example Pause
> > Full Ergonomics or Pause Full System.gc ().
> >
> > What gc log timings would one need to parse in ZGC logs in order to
> > measure total stop the world pauses?
>
> The numbers you quoted in your initial mail are correct. I.e. they are
> the pauses you had in that GC cycle. So 4.975ms was your longest pause
> there. It's just that taking the sum of the three GC pauses isn't the
> right way to look at it.
>
> If you enable -Xlog:gc*:gc.log you'll also see the "Garbage Collection
> Statistics" table being printed every 10 seconds. There's you'll see
> pause time average and max over time.
>
> I did a presentation at Oracle Code One, where there's a section on how
> to interpret the GC logs. Watching that video might be helpful here.
> It's available here https://www.youtube.com/watch?v=kF_r3GE3zOo and the
> slides here:
> http://cr.openjdk.java.net/~pliden/slides/ZGC-OracleCodeOne-2018.pdf
>
> Also note that when switching from ParallelGC to ZGC, you're switching
> from a collector optimized for throughout to one optimized for low
> latency. Maybe that's exactly what you want (i.e. you want better
> application response times), but keep in mind that these collectors are
> very different and make different trade-offs.
>
> cheers,
> Per
>
> >
> > Thanks again,
> >
> > On Tue, Nov 20, 2018, 11:20 PM Per Liden <per.liden at oracle.com
> <mailto:per.liden at oracle.com>
> > <mailto:per.liden at oracle.com <mailto:per.liden at oracle.com>> wrote:
> >
> > Hi Alex,
> >
> > (note that you need to be subscribed to
> zgc-dev at openjdk.java.net <mailto:zgc-dev at openjdk.java.net>
> > <mailto:zgc-dev at openjdk.java.net
> <mailto:zgc-dev at openjdk.java.net>> to be
> > able to send messages to the list, else a moderator has to
> approve each
> > message).
> >
> > On 11/21/18 12:52 AM, Alex Elent wrote:
> > > Hello,
> > >
> > > I read through JEP 333 and it looks incredibly promising. My
> > question is,
> > > how does one measure the STW pauses in ZGC?
> > >
> > > I read through the mailing list archives and understand
> there are
> > 3 pauses:
> > > Pause Mark Start
> > > Pause Mark End
> > > Pause Relocate Start
> > >
> > > Using the below snippet of gc log:
> > > [2018-11-20T23:34:46.180+0000][5135ms] GC(1) *Pause Mark Start
> > 4.975ms*
> > > [2018-11-20T23:34:46.301+0000][5256ms] GC(1) Concurrent Mark
> > 120.118ms
> > > [2018-11-20T23:34:46.301+0000][5256ms] GC(1) *Pause Mark
> End 0.684ms*
> > > [2018-11-20T23:34:46.304+0000][5259ms] GC(1) Concurrent
> Process
> > Non-Strong
> > > References 2.308ms
> > > [2018-11-20T23:34:46.304+0000][5259ms] GC(1) Concurrent Reset
> > Relocation
> > > Set 0.008ms
> > > [2018-11-20T23:34:46.304+0000][5259ms] GC(1) Concurrent
> Destroy
> > Detached
> > > Pages 0.001ms
> > > [2018-11-20T23:34:46.305+0000][5260ms] GC(1) Concurrent Select
> > Relocation
> > > Set 1.072ms
> > > [2018-11-20T23:34:46.306+0000][5261ms] GC(1) Concurrent
> Prepare
> > Relocation
> > > Set 1.078ms
> > > [2018-11-20T23:34:46.310+0000][5265ms] GC(1) *Pause Relocate
> > Start 3.764ms*
> > >
> > > Does that mean the STW pause for that cycle was 4.975 +
> 0.684 + 3.764
> > > (9.423)?
> >
> > No, the goal of 10ms max pause time refers to each individual
> pause,
> > not
> > the sum of them. The three pauses in ZGC are typically spaced
> out in
> > time (by heavy operations such as concurrent marking and
> > relocation), so
> > treating them as one single pause would not make sense. Note
> that we
> > also track MMU (Minimum Mutator Utilization, also printed in
> the GC
> > log), which reflects pause times and how they are spaced out
> in time.
> >
> > cheers,
> > Per
> >
> > >
> > > Thanks in advance,
> > >
> >
>
More information about the zgc-dev
mailing list