Discussion on ZGC's Page Cache Flush

Per Liden per.liden at oracle.com
Fri Jun 5 10:52:03 UTC 2020


Hi,

On 6/5/20 11:24 AM, Hao Tang wrote:
> 
> Hi ZGC Team,
> 
> We encountered "Page Cache Flushed" events after enabling ZGC. Much longer response times can be observed whenever a "Page Cache Flushed" event happens. Below is a test case that reproduces this scenario: medium-sized objects are periodically cleaned up, and right after a clean-up there are not enough small pages for allocating small-sized objects, which forces medium pages to be flushed into small pages. We found that simply enlarging the max heap size does not solve this problem. We believe the "page cache flush" issue could be a general problem, because the ratio of small/medium/large objects is not always constant.
> 
> Sample code:
> import java.util.Random;
> import java.util.concurrent.locks.LockSupport;
> public class TestPageCacheFlush {
>      /*
>       * Options: -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -XX:+UnlockDiagnosticVMOptions -Xms10g -Xmx10g -XX:ParallelGCThreads=2 -XX:ConcGCThreads=4 -Xlog:gc,gc+heap
>       * small object: fast allocation
>       * medium object: slow allocation, periodic deletion
>       */
>      public static void main(String[] args) throws Exception {
>          long heapSizeKB = Runtime.getRuntime().totalMemory() >> 10;
>          System.out.println(heapSizeKB);
>          SmallContainer smallContainer = new SmallContainer((long)(heapSizeKB * 0.4));     // 40% heap for live small objects
>          MediumContainer mediumContainer = new MediumContainer((long)(heapSizeKB * 0.4));  // 40% heap for live medium objects
>          int totalSmall = smallContainer.getTotalObjects();
>          int totalMedium = mediumContainer.getTotalObjects();
>          int addedSmall = 0;
>          int addedMedium = 1; // start at 1 to avoid division by zero in the ratio check below
>          while (addedMedium < totalMedium * 10) {
>              if (totalSmall / totalMedium > addedSmall / addedMedium) { // keep the ratio of allocated small/medium objects
>                  smallContainer.createAndSaveObject();
>                  addedSmall ++;
>              } else {
>                  mediumContainer.createAndAppendObject();
>                  addedMedium ++;
>              }
>              if ((addedSmall + addedMedium) % 50 == 0) {
>                  LockSupport.parkNanos(500); // make allocation slower
>              }
>          }
>      }
>      static class SmallContainer {
>          private final int KB_PER_OBJECT = 64; // 64KB per object
>          private final Random RANDOM = new Random();
>          private byte[][] smallObjectArray;
>          private long totalKB;
>          private int totalObjects;
>          SmallContainer(long totalKB) {
>              this.totalKB = totalKB;
>              totalObjects = (int)(totalKB / KB_PER_OBJECT);
>              smallObjectArray = new byte[totalObjects][];
>          }
>          int getTotalObjects() {
>              return totalObjects;
>          }
>          // insert at a random index (implicitly deleting any object already stored there)
>          void createAndSaveObject() {
>              smallObjectArray[RANDOM.nextInt(totalObjects)] = new byte[KB_PER_OBJECT << 10];
>          }
>      }
>      static class MediumContainer {
>          private final int KB_PER_OBJECT = 512; // 512KB per object
>          private byte[][] mediumObjectArray;
>          private int mediumObjectArrayCurrentIndex = 0;
>          private long totalKB;
>          private int totalObjects;
>          MediumContainer(long totalKB) {
>              this.totalKB = totalKB;
>              totalObjects = (int)(totalKB / KB_PER_OBJECT);
>              mediumObjectArray = new byte[totalObjects][];
>          }
>          int getTotalObjects() {
>              return totalObjects;
>          }
>          void createAndAppendObject() {
>              if (mediumObjectArrayCurrentIndex == totalObjects) { // periodic deletion
>                  mediumObjectArray = new byte[totalObjects][]; // also delete all medium objects in the old array
>                  mediumObjectArrayCurrentIndex = 0;
>              } else {
>                  mediumObjectArray[mediumObjectArrayCurrentIndex] = new byte[KB_PER_OBJECT << 10];
>                  mediumObjectArrayCurrentIndex ++;
>              }
>          }
>      }
> }
> 
> To avoid "page cache flush", we made a patch that converts small/medium pages to medium/small pages ahead of time. This patch works well on an application with a relatively stable allocation rate, and we have not seen any throughput problems with it. What do you think of this solution?
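> 
> The rough idea of the patch is sketched below (a simplified, self-contained model only; the names are made up, the numbers assume ZGC's default 2MB small pages and 32MB medium pages, and this is not the actual HotSpot code): a periodic task watches the cached small-page capacity and, when it runs low while medium pages sit idle in the cache, splits a cached medium page into small pages before an allocation has to flush synchronously.
> 
> import java.util.ArrayDeque;
> import java.util.Deque;
> 
> public class ProactiveConversionSketch {
>     static final int SMALL_PAGE_MB = 2;    // assumed ZGC small page size
>     static final int MEDIUM_PAGE_MB = 32;  // assumed ZGC medium page size
> 
>     // Cached (empty) pages of each size class, one entry per page.
>     final Deque<Integer> cachedSmallPages = new ArrayDeque<>();
>     final Deque<Integer> cachedMediumPages = new ArrayDeque<>();
> 
>     // Run periodically from a background task: if the small-page cache is
>     // running low while medium pages sit idle, split a cached medium page
>     // into small pages now, instead of flushing it on the allocation path.
>     void balance(int smallLowWatermark) {
>         while (cachedSmallPages.size() < smallLowWatermark
>                 && !cachedMediumPages.isEmpty()) {
>             cachedMediumPages.pop();
>             for (int i = 0; i < MEDIUM_PAGE_MB / SMALL_PAGE_MB; i++) {
>                 cachedSmallPages.push(SMALL_PAGE_MB);
>             }
>         }
>     }
> }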
> 
> We notice that you are improving the efficiency of map/unmap operations (https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2020-June/029936.html). It may be a step toward reducing the delay caused by "page cache flush". Do you have further plans for eliminating or mitigating "page cache flush"?

Yes, and as you might have seen, the latest incarnation of this patchset 
includes asynchronous unmapping, which helps reduce the time for page 
cache flushing. I ran your example program above with these patches and 
can see a ~30% reduction in average page allocation time and a ~60% 
reduction in worst-case page allocation time. So, it will be an improvement.
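
To give a rough idea of what "asynchronous unmapping" means here, below is a conceptual model in plain Java (the names are made up and this is not the actual HotSpot code, which lives in C++): the thread that flushes a cached page hands the expensive unmap work to a background thread and returns immediately, so the allocation path only pays for mapping the new page.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncUnmapSketch {
    // Single daemon worker that performs the slow unmap calls in the background.
    private static final ExecutorService UNMAP_WORKER =
            Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r, "async-unmapper");
                t.setDaemon(true);
                return t;
            });

    // Called on the (simulated) allocation path when a cached page must be
    // flushed: queue the unmap and return immediately, so the caller can go
    // on and map a new page without waiting for the system call.
    static void flushPage(long pageAddress) {
        UNMAP_WORKER.submit(() -> unmap(pageAddress));
    }

    // Stand-in for the real munmap work; here it just sleeps for a moment.
    private static void unmap(long pageAddress) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}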

However, I'd be more than happy to take a look at your patch and see 
what you've done. Making page cache flushing even less expensive is 
something we're interested in going forward.

cheers,
Per

> 
> Sincerely,
> Hao Tang
> 

