Gen ZGC EAB in K8S

Tue Mar 14 09:08:35 UTC 2023

 Thanks a lot, Stefan.
I think this flag makes much more sense than playing with the GC parallel and concurrent thread counts!
Some extra feedback that may be relevant.
I reported that ZGC was using 20% more CPU than G1 for the same workload, but that turned out not to be true in all cases. I decreased the traffic received by the Geode cluster, and for the same amount of traffic ZGC was using around 10% more CPU than G1 (21 cores versus 19 cores). G1 was configured to target 25 ms pause times, while ZGC, being concurrent, was doing an amazing job. This has great benefits in our system, because the Apache Geode client API is blocking and GC pauses are amplified by the rest of the system: when G1 pauses the app, we observe collateral effects such as clients expanding their connection pools.
Average response latencies provided by the Geode cluster were in line with G1 or even better, and certainly more predictable.
I can only say nice things. It is working fine, and we found no quality issues in our use cases. It is NOT as frugal as G1 (as expected), BUT it looks much better than non-generational ZGC in our case. I will share more feedback when we run more tests, if you are interested.
I saw that the JEP is now a Candidate (congrats to the ZGC team for the hard work).
Regards,
Evaristo

On Tuesday, 14 March 2023, 09:14:14 CET, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:

  Hi Evaristo,

 Thanks for providing feedback on Generational ZGC. There is a JVM flag that can be used to force the JVM to assume a given number of cores: -XX:ActiveProcessorCount=<N>

 From the code:
   product(int, ActiveProcessorCount, -1,                                    \
           "Specify the CPU count the VM should use and report as active")   \

 I've personally never used it before, but when I try it I see that ZGC scales the worker threads accordingly. Maybe this flag could be useful for your use case.

 Thanks,
 StefanK
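
As a minimal sketch of the suggestion above: the launcher could set -XX:ActiveProcessorCount to the pod's CPU request instead of letting the JVM pick up the CPU limit. The helper name jvm_args and the idea of generating the argument list from a script are assumptions made for illustration; the heap flags and the value 32 are taken from elsewhere in this thread.

```python
def jvm_args(cpu_request: int) -> list[str]:
    """Build a JVM argument list that pins the CPU count the VM assumes.

    cpu_request is the Kubernetes CPU request in whole cores (hypothetical
    input; the thread does not say how the launcher obtains it).
    """
    if cpu_request < 1:
        raise ValueError("cpu_request must be at least 1")
    return [
        "-Xmx152000m",
        "-Xms152000m",
        "-XX:+UseZGC",
        # Make the JVM size its ergonomics for the request, not the limit.
        f"-XX:ActiveProcessorCount={cpu_request}",
    ]


if __name__ == "__main__":
    print(" ".join(jvm_args(32)))
```

Because the flag changes the CPU count the VM reports as active, it affects every ergonomic decision derived from that count, not only the GC worker threads.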

 On 2023-03-14 08:26, Evaristo José Camarero wrote:

  Thanks Peter, 
  This environment is NOT based on AWS. It is deployed on a custom K8S flavor on top of an OpenStack virtualization layer. 
  The system has some level of CPU overprovisioning, so that is the root cause of the problem. App threads + Gen ZGC threads are using more CPU than available. We will repeat the tests with more resources to avoid the issue. 
  My previous question is more about ZGC ergonomics, and about fully understanding the best approach when deploying in Kubernetes. 
  I saw that Gen ZGC calculates the number of runtime workers as number of CPUs * 0.6, and the maximum number of concurrent workers per generation as number of CPUs * 0.25. In a K8s pod you can define a CPU limit (the maximum CPU potentially available to the pod) and a CPU request (the CPU reserved for your pod). The JVM uses the CPU limit for its ergonomics. In our case the two values diverge quite a lot (64 CPUs limit vs 32 CPUs request), which makes a big difference when ZGC decides the number of workers. Usually with G1 we tune ParallelGCThreads and ConcGCThreads to adapt the GC resources. My assumption is that with Gen ZGC these two parameters are again the key ones for controlling the resources used by the collector. 
  In our next test we will use:
  ParallelGCThreads = CPU request * 0.6
  ConcGCThreads = CPU request * 0.25
  under the assumption that the system is dimensioned so that CPU usage does not exceed the CPU request. 
  Does it make sense? Any other suggestion? 
  Regards, 
  Evaristo 
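
The sizing proposed in the message above can be worked through as a short sketch. The factors 0.6 and 0.25 are the ones quoted in the message; rounding down and the floor of one thread are assumptions, since the thread does not say how the JVM rounds fractional results. The function name worker_counts is hypothetical.

```python
def worker_counts(cpus: int) -> tuple[int, int]:
    """Return (ParallelGCThreads, ConcGCThreads) for a given CPU count.

    Uses the factors quoted in the thread (0.6 and 0.25). Integer
    arithmetic avoids floating-point surprises; rounding down with a
    minimum of 1 is an assumption, not confirmed JVM behaviour.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    parallel = max(1, cpus * 60 // 100)
    concurrent = max(1, cpus * 25 // 100)
    return parallel, concurrent


if __name__ == "__main__":
    for label, cpus in (("request", 32), ("limit", 64)):
        parallel, concurrent = worker_counts(cpus)
        print(f"{label:>7} ({cpus} CPUs): "
              f"-XX:ParallelGCThreads={parallel} -XX:ConcGCThreads={concurrent}")
```

With these assumptions, sizing from the 32-CPU request roughly halves both thread counts compared with sizing from the 64-CPU limit.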

      On Monday, 13 March 2023, 10:18:01 CET, Peter Booth <peter_booth at me.com> wrote:  

  The default Geode heartbeat timeout interval is 5 seconds, which is an eternity. 
  Some points/questions: 
  - I’d recommend using either Solarflare’s sysjitter tool or Gil Tene’s jHiccup to quantify your OS jitter.
  - Can you capture the output of vmstat 1 60?
  - What kind of EC2 instances are you using? Are they RHEL?
  - How much physical RAM does each instance have?
  - Do you have THP enabled?
  - What is the value of vm.min_free_kbytes?
  - How often do you see missed heartbeats?
  - For how long do you see the Adjusting Workers messages?

 On Mar 13, 2023, at 4:42 AM, Evaristo José Camarero <evaristojosec at yahoo.es> wrote: 
    Hi there, 
  We are trying the latest ZGC EAB to test an Apache Geode cluster (a distributed key-value store with use cases similar to those of Apache Cassandra). 
  We are using K8s, and we have pods with a request of 32 cores (and a limit of 60 cores) per data node, with a 150 GB heap per node. 
              - -Xmx152000m
              - -Xms152000m
              - -XX:+UseZGC
              - -XX:SoftMaxHeapSize=136000m
              - -XX:ZAllocationSpikeTolerance=4.0   // We have some spiky workloads periodically
              - -XX:+UseNUMA

  ZGC is working great with regard to GC pauses, with no allocation stalls at all almost all of the time. We observe higher CPU utilization than with G1 (around 20% for a heavy workload), using the flag -XX:ZAllocationSpikeTolerance=4.0, which may be making ZGC hungrier than needed. We will experiment further with this. 

  BUT from time to time we see that the Geode clusters are missing heartbeats between nodes, and the Geode logic shuts down the JVM of the node with the missing heartbeats. We believe the main problem could be CPU starvation, because a few seconds before this happens we observe ZGC using more workers to get the job done:

 [2023-03-12T18:29:38.980+0000] Adjusting Workers for Young Generation: 1 -> 2
 [2023-03-12T18:29:39.781+0000] Adjusting Workers for Young Generation: 1 -> 3
 [2023-03-12T18:29:40.181+0000] Adjusting Workers for Young Generation: 1 -> 4
 [2023-03-12T18:29:40.382+0000] Adjusting Workers for Young Generation: 1 -> 5
 [2023-03-12T18:29:40.582+0000] Adjusting Workers for Young Generation: 1 -> 6
 [2023-03-12T18:29:40.782+0000] Adjusting Workers for Young Generation: 1 -> 7
 [2023-03-12T18:29:40.882+0000] Adjusting Workers for Young Generation: 1 -> 8
 [2023-03-12T18:29:40.982+0000] Adjusting Workers for Young Generation: 1 -> 10
 [2023-03-12T18:29:41.083+0000] Adjusting Workers for Young Generation: 1 -> 13
 [2023-03-12T18:29:41.183+0000] Adjusting Workers for Young Generation: 1 -> 16 

  As mentioned, we are using K8s with pods that have 32 cores for the request and 60 cores for the limit (and it is also true that our K8s workers are close to the limit in CPU utilization). At boot, ZGC assumes that the machine has 60 cores (as logged). What is the best way to give ZGC a hint so that it is tuned for a host with 32 cores (the 60-core limit is basically just there to avoid K8s throttling)? Is it the ParallelGCThreads flag? Any other thoughts? 
  Regards, 
  Evaristo 
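
To quantify how quickly the worker count ramps up in a log excerpt like the one in the message above, a small parser can help. This is only a sketch: the regular expression is derived from the lines quoted in this thread rather than from any documented log format, and the names LINE and ramp_up are hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines quoted in the thread (an assumption,
# not a documented format).
LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\] Adjusting Workers for (?P<gen>\w+) Generation: "
    r"(?P<old>\d+) -> (?P<new>\d+)"
)


def ramp_up(log_text):
    """Return (seconds from first to last adjustment, highest worker count).

    Returns None when the text contains no matching lines.
    """
    events = []
    for match in LINE.finditer(log_text):
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%f%z")
        events.append((ts, int(match["new"])))
    if not events:
        return None
    elapsed = (events[-1][0] - events[0][0]).total_seconds()
    return elapsed, max(count for _, count in events)


SAMPLE = """\
[2023-03-12T18:29:38.980+0000] Adjusting Workers for Young Generation: 1 -> 2
[2023-03-12T18:29:40.181+0000] Adjusting Workers for Young Generation: 1 -> 4
[2023-03-12T18:29:41.183+0000] Adjusting Workers for Young Generation: 1 -> 16
"""

if __name__ == "__main__":
    print(ramp_up(SAMPLE))
```

For the excerpt quoted above, the first and last adjustments are about 2.2 seconds apart, which is short compared with the 5-second heartbeat timeout mentioned in the reply.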
