Allocation stalls and Page Cache Flushed messages
Sundara Mohan M
m.sundar85 at gmail.com
Fri Oct 25 22:20:22 UTC 2019
Got it.
Thanks
Sundar
On Fri, Oct 25, 2019 at 5:11 AM Per Liden <per.liden at oracle.com> wrote:
> On 10/24/19 7:36 PM, Sundara Mohan M wrote:
> > HI Per,
> >
> > > This means that ZGC had cached heap regions (ZPages) that were ready
> > > to be used for new allocations, but they had the wrong size, so some
> > > memory (one or more ZPage) was flushed out from the cache so that the
> > > physical memory they occupied could be reused to build a new ZPage of
> > > the correct size.
> >
> > "they had wrong size" does that mean we didn't have a cached page with
> > our new required page size?
> > For ex. We cached few 2M page and now we need 4M page so it will flush
> > multiple pages in cache to get the new page with 4M?
>
> Right, in this case it would flush two 2M pages, and re-map them as a
> single contiguous 4M page.
>
> /Per
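A minimal sketch of the idea above, assuming a size-keyed page cache: this is an illustration only (ZGC's real page cache lives in the JVM's C++ code), showing a request that hits the cache when a page of the exact size exists, and otherwise flushes smaller cached pages until their combined memory could be re-mapped as one page of the requested size.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not ZGC's actual implementation.
class PageCacheSketch {
    // Cached pages grouped by size in MB, e.g. 2 -> [2M page, 2M page, ...]
    private final Map<Integer, Deque<Integer>> cache = new HashMap<>();

    // Return a page to the cache.
    void release(int sizeMb) {
        cache.computeIfAbsent(sizeMb, k -> new ArrayDeque<>()).push(sizeMb);
    }

    // Returns the number of MB flushed to satisfy the request,
    // or 0 on an exact-size cache hit.
    int allocate(int sizeMb) {
        Deque<Integer> exact = cache.get(sizeMb);
        if (exact != null && !exact.isEmpty()) {
            exact.pop();
            return 0; // cache hit: a page of the right size was ready
        }
        // Cache miss: flush cached pages of other sizes until enough
        // physical memory is freed to build one page of the requested
        // size (the "Page Cache Flushed: XM requested, YM flushed" case).
        int flushed = 0;
        outer:
        for (Deque<Integer> pages : cache.values()) {
            while (!pages.isEmpty()) {
                flushed += pages.pop();
                if (flushed >= sizeMb) break outer;
            }
        }
        return flushed; // e.g. two 2M pages flushed for a 4M request
    }
}
```

With two cached 2M pages, a 4M request flushes both of them; once a 4M page has been released back, a second 4M request is a plain cache hit.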
>
> >
> >
> > Thanks
> > Sundar
> >
> > On Tue, Oct 22, 2019 at 2:44 AM Per Liden <per.liden at oracle.com
> > <mailto:per.liden at oracle.com>> wrote:
> >
> > Hi,
> >
> > On 10/17/19 7:05 AM, Sundara Mohan M wrote:
> > > Hi,
> > > We started using ZGC and are seeing this pattern: whenever there is
> > > an "Allocation Stall" message, we see messages like this right
> > > before it:
> > > ...
> > > [2019-10-15T20:02:42.403+0000][71060.202s][info][gc,heap ] Page Cache Flushed: 10M requested, 18M(18530M->18512M) flushed
> > > [2019-10-15T20:02:42.582+0000][71060.380s][info][gc,heap ] Page Cache Flushed: 12M requested, 18M(17432M->17414M) flushed
> > > [2019-10-15T20:02:42.717+0000][71060.515s][info][gc,heap ] Page Cache Flushed: 14M requested, 36M(16632M->16596M) flushed
> > > [2019-10-15T20:02:46.128+0000][71063.927s][info][gc,heap ] Page Cache Flushed: 4M requested, 18M(7742M->7724M) flushed
> > > [2019-10-15T20:02:49.716+0000][71067.514s][info][gc,heap ] Page Cache Flushed: 2M requested, 28M(2374M->2346M) flushed
> > > [2019-10-15T20:02:49.785+0000][71067.583s][info][gc,heap ] Page Cache Flushed: 2M requested, 76M(2346M->2270M) flushed
> > > [2019-10-15T20:02:49.966+0000][71067.765s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2270M->2260M) flushed
> > > [2019-10-15T20:02:50.006+0000][71067.805s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2260M->2250M) flushed
> > > [2019-10-15T20:02:50.018+0000][71067.816s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2250M->2240M) flushed
> > > [2019-10-15T20:02:50.098+0000][71067.896s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2240M->2230M) flushed
> > > [2019-10-15T20:02:50.149+0000][71067.947s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2230M->2220M) flushed
> > > [2019-10-15T20:02:50.198+0000][71067.996s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2220M->2210M) flushed
> > > [2019-10-15T20:02:50.313+0000][71068.111s][info][gc,heap ] Page Cache Flushed: 2M requested, 10M(2210M->2200M) flushed
> > > [2019-10-15T20:02:50.327+0000][71068.125s][info][gc,heap ] Page Cache Flushed: 2M requested, 26M(2200M->2174M) flushed
> > > [2019-10-15T20:02:50.346+0000][71068.145s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2174M->2160M) flushed
> > > [2019-10-15T20:02:50.365+0000][71068.163s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2160M->2146M) flushed
> > > [2019-10-15T20:02:50.371+0000][71068.170s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2146M->2138M) flushed
> > > [2019-10-15T20:02:50.388+0000][71068.187s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2138M->2130M) flushed
> > > [2019-10-15T20:02:50.402+0000][71068.201s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2130M->2122M) flushed
> > > [2019-10-15T20:02:50.529+0000][71068.327s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2122M->2114M) flushed
> > > [2019-10-15T20:02:50.620+0000][71068.418s][info][gc,heap ] Page Cache Flushed: 2M requested, 8M(2114M->2106M) flushed
> > > [2019-10-15T20:02:50.657+0000][71068.455s][info][gc,heap ] Page Cache Flushed: 2M requested, 14M(2106M->2092M) flushed
> > > ...
> > >
> > > 1. Any idea what this info message conveys about the system?
> >
> > This means that ZGC had cached heap regions (ZPages) that were ready
> > to be used for new allocations, but they had the wrong size, so some
> > memory (one or more ZPage) was flushed out from the cache so that the
> > physical memory they occupied could be reused to build a new ZPage of
> > the correct size. This is not in itself catastrophic, but it is more
> > expensive than a regular allocation as it involves remapping memory.
> > The only real way to avoid this is to increase the max heap size.
> >
> > >
> > > 2. Our application sometimes allocates at 5x its usual rate, and an
> > > Allocation Stall happens immediately after that. Can ZGC's
> > > heuristics learn about this increase and adjust accordingly later?
> > > Other than increasing memory, is there any other tuning option to
> > > consider that would help in this scenario?
> >
> > In general, increasing the heap size and/or adjusting the number of
> > concurrent work threads (-XX:ConcGCThreads=X) are your main tuning
> > options. Using -XX:+UseLargePages is of course also good if you want
> > max performance. There are a few other ZGC-specific options, but those
> > matter a whole lot less and will likely not help in this situation.
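Putting those options together, a launch command might look like the following. This is an illustrative sketch only: the heap size and thread count are made-up example values, not recommendations, and note that on JDK 12 ZGC is still experimental, so it must be unlocked first.

```shell
# Hypothetical ZGC invocation (example values, not recommendations).
# -XX:+UnlockExperimentalVMOptions is required on JDK 12, where ZGC
# is still experimental.
java \
  -XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
  -Xmx96g \
  -XX:ConcGCThreads=8 \
  -XX:+UseLargePages \
  -Xlog:gc,gc+heap \
  -jar app.jar
```

The -Xlog:gc,gc+heap selector keeps the "Allocation Stall" and "Page Cache Flushed" messages visible so the effect of any tuning change can be observed in the log.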
> >
> > The ZGC wiki has some more information on tuning:
> > https://wiki.openjdk.java.net/display/zgc/Main
> >
> > cheers,
> > Per
> >
> > >
> > > I am using JDK12 with 80G heap.
> > >
> > > Thanks
> > > Sundar
> > >
> >
>
More information about the zgc-dev mailing list