Allocation stalls and Page Cache Flushed messages

Tue Oct 22 09:44:29 UTC 2019

Hi,

On 10/17/19 7:05 AM, Sundara Mohan M wrote:
> Hi,
>     We started using ZGC and seeing this pattern, Whenever there is
> "Allocation Stall" messages before that i see messages like this
> ...
> [2019-10-15T20:02:42.403+0000][71060.202s][info][gc,heap     ] Page Cache
> Flushed: 10M requested, 18M(18530M->18512M) flushed
> [2019-10-15T20:02:42.582+0000][71060.380s][info][gc,heap     ] Page Cache
> Flushed: 12M requested, 18M(17432M->17414M) flushed
> [2019-10-15T20:02:42.717+0000][71060.515s][info][gc,heap     ] Page Cache
> Flushed: 14M requested, 36M(16632M->16596M) flushed
> [2019-10-15T20:02:46.128+0000][71063.927s][info][gc,heap     ] Page Cache
> Flushed: 4M requested, 18M(7742M->7724M) flushed
> [2019-10-15T20:02:49.716+0000][71067.514s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 28M(2374M->2346M) flushed
> [2019-10-15T20:02:49.785+0000][71067.583s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 76M(2346M->2270M) flushed
> [2019-10-15T20:02:49.966+0000][71067.765s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2270M->2260M) flushed
> [2019-10-15T20:02:50.006+0000][71067.805s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2260M->2250M) flushed
> [2019-10-15T20:02:50.018+0000][71067.816s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2250M->2240M) flushed
> [2019-10-15T20:02:50.098+0000][71067.896s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2240M->2230M) flushed
> [2019-10-15T20:02:50.149+0000][71067.947s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2230M->2220M) flushed
> [2019-10-15T20:02:50.198+0000][71067.996s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2220M->2210M) flushed
> [2019-10-15T20:02:50.313+0000][71068.111s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 10M(2210M->2200M) flushed
> [2019-10-15T20:02:50.327+0000][71068.125s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 26M(2200M->2174M) flushed
> [2019-10-15T20:02:50.346+0000][71068.145s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 14M(2174M->2160M) flushed
> [2019-10-15T20:02:50.365+0000][71068.163s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 14M(2160M->2146M) flushed
> [2019-10-15T20:02:50.371+0000][71068.170s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 8M(2146M->2138M) flushed
> [2019-10-15T20:02:50.388+0000][71068.187s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 8M(2138M->2130M) flushed
> [2019-10-15T20:02:50.402+0000][71068.201s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 8M(2130M->2122M) flushed
> [2019-10-15T20:02:50.529+0000][71068.327s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 8M(2122M->2114M) flushed
> [2019-10-15T20:02:50.620+0000][71068.418s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 8M(2114M->2106M) flushed
> [2019-10-15T20:02:50.657+0000][71068.455s][info][gc,heap     ] Page Cache
> Flushed: 2M requested, 14M(2106M->2092M) flushed
> ...
> 
> 1. Any idea what this info message conveys about the system?

This means that ZGC had cached heap regions (ZPages) that were ready to 
be used for new allocations, but they had the wrong size, so some memory 
(one or more ZPage) was flushed out from the cache so that the physical 
memory they occupied could be reused to build a new ZPage of the correct 
size. This is not in itself catastrophic, but it is more expensive than 
a regular allocation as it involves remapping memory. The only real way 
to avoid this is to increase the max heap size.

> 
> 2. Our application is allocating object 5X sometimes and Allocation Stall
> happens immediately after that. Can ZGC heuristic learn about this increase
> and later will it adjust accordingly? Other than increasing memory is there
> any other tuning option to consider that will help in this scenario?

In general, increasing the heap size and/or adjusting the number of 
concurrent work threads (-XX:ConcGCThreads=X) are your main tuning 
options. Using -XX:+UseLargePages is of course also good if you want max 
performance. There are a few other ZGC-specific options, but those 
matter a whole lot less and will likely not help in this situation.

The ZGC wiki has some more information on tuning: 
https://wiki.openjdk.java.net/display/zgc/Main

cheers,
Per

> 
> I am using JDK12 with 80G heap.
> 
> Thanks
> Sundar
>