Allocation stalls and Page Cache Flushed messages
Sundara Mohan M
m.sundar85 at gmail.com
Fri Oct 25 22:20:22 UTC 2019
Got it.
Thanks
Sundar
On Fri, Oct 25, 2019 at 5:11 AM Per Liden <per.liden at oracle.com> wrote:
> On 10/24/19 7:36 PM, Sundara Mohan M wrote:
> > HI Per,
> >
> > > This means that ZGC had cached heap regions (ZPages) that were ready
> > > to be used for new allocations, but they had the wrong size, so some
> > > memory (one or more ZPage) was flushed out from the cache so that the
> > > physical memory they occupied could be reused to build a new ZPage of
> > > the correct size.
> >
> > "they had wrong size" does that mean we didn't have a cached page with
> > our new required page size?
> > For ex. We cached few 2M page and now we need 4M page so it will flush
> > multiple pages in cache to get the new page with 4M?
>
> Right, in this case it would flush two 2M pages, and re-map them as a
> single contiguous 4M page.
>
> /Per
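A minimal sketch of the idea above, assuming a size-keyed page cache: this is an illustration only (ZGC's real page cache lives in the JVM's C++ code), showing a request that hits the cache when a page of the exact size exists, and otherwise flushes smaller cached pages until their combined memory could be re-mapped as one page of the requested size.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ZGC's actual implementation.
class PageCacheSketch {
    // Cached pages grouped by size in MB, e.g. 2 -> [2M page, 2M page, ...]
    private final Map<Integer, Deque<Integer>> cache = new HashMap<>();

    // Return a page to the cache.
    void release(int sizeMb) {
        cache.computeIfAbsent(sizeMb, k -> new ArrayDeque<>()).push(sizeMb);
    }

    // Returns the number of MB flushed to satisfy the request,
    // or 0 on an exact-size cache hit.
    int allocate(int sizeMb) {
        Deque<Integer> exact = cache.get(sizeMb);
        if (exact != null && !exact.isEmpty()) {
            exact.pop();
            return 0; // cache hit: a page of the right size was ready
        }
        // Cache miss: flush cached pages of other sizes until enough
        // physical memory is freed to build one page of the requested
        // size (the "Page Cache Flushed: XM requested, YM flushed" case).
        int flushed = 0;
        outer:
        for (Deque<Integer> pages : cache.values()) {
            while (!pages.isEmpty()) {
                flushed += pages.pop();
                if (flushed >= sizeMb) break outer;
            }
        }
        return flushed; // e.g. two 2M pages flushed for a 4M request
    }
}
```

With two cached 2M pages, a 4M request flushes both of them; once a 4M page has been released back, a second 4M request is a plain cache hit.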
>
> >
> >
> > Thanks
> > Sundar
> >
> > On Tue, Oct 22, 2019 at 2:44 AM Per Liden <per.liden at oracle.com
> > <mailto:per.liden at oracle.com>> wrote:
> >
> > Hi,
> >
> > On 10/17/19 7:05 AM, Sundara Mohan M wrote:
> > > Hi,
> > > We started using ZGC and are seeing this pattern: whenever there is
> > > an "Allocation Stall" message, we see messages like this right
> > > before it:
> > > ...
> > > [2019-10-15T20:02:42.403+0000][71060.202s][info][gc,heap ] Page Cache Flushed: 10M requested, 18M(18530M->18512M) flushed
> > > [2019-10-15T20:02:42.582+0000][71060.380s][info][gc,heap ] Page Cache Flushed: 12M requested, 18M(17432M->17414M) flushed
> > > [2019-10-15T20:02:42.717+0000][71060.515s][info][gc,heap ] Page Cache Flushed: 14M requested, 36M(16632M->16596M) flushed
> > > [2019-10-15T20:02:46.128+0000][71063.927s][info][gc,heap ] Page Cache Flushed: 4M requested, 18M(7742M->7724M) flushed
> > > [2019-10-15T20:02:49.716+0000][71067.514s][info][gc,heap ] Page Cache Flushed: 2M requested, 28M(2374M->2346M) flushed
> > > [2019-10-15T20:02:49.785+0000][71067.583s][info][gc,heap ] Page Cache Flushed: 2M requested, 76M(2346M->2270M) flushed
> > > [2019-10-15T20:02:49.966+0000][71067.765s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2270M->2260M) flushed
> > > [2019-10-15T20:02:50.006+0000][71067.805s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2260M->2250M) flushed
> > > [2019-10-15T20:02:50.018+0000][71067.816s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2250M->2240M) flushed
> > > [2019-10-15T20:02:50.098+0000][71067.896s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2240M->2230M) flushed
> > > [2019-10-15T20:02:50.149+0000][71067.947s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2230M->2220M) flushed
> > > [2019-10-15T20:02:50.198+0000][71067.996s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2220M->2210M) flushed
> > > [2019-10-15T20:02:50.313+0000][71068.111s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2210M->2200M) flushed
> > > [2019-10-15T20:02:50.327+0000][71068.125s][info][gc,heap ] Page Cache Flushed: 2M requested, 26M(2200M->2174M) flushed
> > > [2019-10-15T20:02:50.346+0000][71068.145s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2174M->2160M) flushed
> > > [2019-10-15T20:02:50.365+0000][71068.163s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2160M->2146M) flushed
> > > [2019-10-15T20:02:50.371+0000][71068.170s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2146M->2138M) flushed
> > > [2019-10-15T20:02:50.388+0000][71068.187s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2138M->2130M) flushed
> > > [2019-10-15T20:02:50.402+0000][71068.201s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2130M->2122M) flushed
> > > [2019-10-15T20:02:50.529+0000][71068.327s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2122M->2114M) flushed
> > > [2019-10-15T20:02:50.620+0000][71068.418s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2114M->2106M) flushed
> > > [2019-10-15T20:02:50.657+0000][71068.455s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2106M->2092M) flushed
> > > ...
> > >
> > > 1. Any idea what this info message conveys about the system?
> >
> > This means that ZGC had cached heap regions (ZPages) that were ready
> > to be used for new allocations, but they had the wrong size, so some
> > memory (one or more ZPage) was flushed out from the cache so that the
> > physical memory they occupied could be reused to build a new ZPage of
> > the correct size. This is not in itself catastrophic, but it is more
> > expensive than a regular allocation as it involves remapping memory.
> > The only real way to avoid this is to increase the max heap size.
> >
> > >
> > > 2. Our application sometimes allocates at 5x its usual rate, and an
> > > Allocation Stall happens immediately after that. Can ZGC's
> > > heuristics learn about this increase and adjust accordingly later?
> > > Other than increasing memory, is there any other tuning option to
> > > consider that would help in this scenario?
> >
> > In general, increasing the heap size and/or adjusting the number of
> > concurrent work threads (-XX:ConcGCThreads=X) are your main tuning
> > options. Using -XX:+UseLargePages is of course also good if you want
> > max performance. There are a few other ZGC-specific options, but those
> > matter a whole lot less and will likely not help in this situation.
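Putting those options together, a launch command might look like the following. This is an illustrative sketch only: the heap size and thread count are made-up example values, not recommendations, and note that on JDK 12 ZGC is still experimental, so it must be unlocked first.

```shell
# Hypothetical ZGC invocation (example values, not recommendations).
# -XX:+UnlockExperimentalVMOptions is required on JDK 12, where ZGC
# is still experimental.
java \
  -XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
  -Xmx96g \
  -XX:ConcGCThreads=8 \
  -XX:+UseLargePages \
  -Xlog:gc,gc+heap \
  -jar app.jar
```

The -Xlog:gc,gc+heap selector keeps the "Allocation Stall" and "Page Cache Flushed" messages visible so the effect of any tuning change can be observed in the log.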
> >
> > The ZGC wiki has some more information on tuning:
> > https://wiki.openjdk.java.net/display/zgc/Main
> >
> > cheers,
> > Per
> >
> > >
> > > I am using JDK12 with 80G heap.
> > >
> > > Thanks
> > > Sundar
> > >
> >
>
More information about the zgc-dev mailing list