<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Hi Evaristo,<br>
    <br>
    Thanks for providing feedback on Generational ZGC. There is a JVM
    flag that can be used to force the JVM to assume a given number of
    cores: -XX:ActiveProcessorCount=&lt;N&gt;<br>
    <br>
    From the code:<br>
      product(int, ActiveProcessorCount,
    -1,                                    \<br>
              "Specify the CPU count the VM should use and report as
    active")   \<br>
    <br>
    I've never used it myself, but when I try it I see that ZGC
    scales the number of worker threads accordingly. This flag
    might be useful for your use case.<br>
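    <br>
    As a rough sketch of how the value for this flag could be derived
    from a pod's CPU request: the helper names below are hypothetical,
    and rounding up to whole cores is an assumption, not something the
    JVM prescribes. It accepts the usual Kubernetes CPU quantity forms
    (plain cores such as "32" or millicores such as "32000m").<br>
    <pre>
```python
import math


def processor_count_from_cpu_request(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("32", "1.5", "32000m") to whole cores.

    Hypothetical helper: rounds up (an assumption) and never returns less than 1.
    """
    text = quantity.strip()
    if text.endswith("m"):
        cores = int(text[:-1]) / 1000  # millicores to cores
    else:
        cores = float(text)
    return max(1, math.ceil(cores))


def active_processor_flag(quantity: str) -> str:
    """Build the JVM flag for the given CPU request."""
    count = processor_count_from_cpu_request(quantity)
    return f"-XX:ActiveProcessorCount={count}"


print(active_processor_flag("32"))
print(active_processor_flag("32000m"))
```
    </pre>
    With the 32-core request from this thread, both "32" and "32000m"
    give -XX:ActiveProcessorCount=32.<br>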
    <br>
    Thanks,<br>
    StefanK<br>
    <br>
    <div class="moz-cite-prefix">On 2023-03-14 08:26, Evaristo José
      Camarero wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:1847774090.2684290.1678778805921@mail.yahoo.com">
      
      <div class="ydp7e7fa21eyahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial,
        sans-serif;font-size:13px;">
        <div dir="ltr" data-setdir="false">Thanks Peter,</div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false">This environment is NOT based
          on AWS. It is deployed on a custom K8S flavor on top of an
          OpenStack virtualization layer.</div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false">The system has some level of
          CPU overprovisioning, so that is the root cause of the
          problem. App threads + Gen ZGC threads are using more CPU than
          available. We will repeat the tests with more resources to
          avoid the issue.</div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false">My previous question is more
          about ZGC ergonomics, and about understanding the best
          approach when deploying in Kubernetes.</div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false">I saw that Gen ZGC
          calculates the number of runtime workers as number of CPUs * 0.6,
          and the maximum number of concurrent workers per generation as
          number of CPUs * 0.25. In a K8s pod you can define a CPU limit
          (the maximum CPU potentially available to the pod) and a CPU
          request (the CPU reserved for the pod). The JVM uses the CPU
          limit for its ergonomics. In our case the two values diverge
          quite a lot (64 CPUs limit vs 32 CPUs request), which makes a big
          difference when ZGC decides the number of workers. With G1 we
          usually tune ParallelGCThreads and ConcGCThreads to adapt the
          GC resources. My assumption is that with Gen ZGC these two
          parameters are again the key ones for controlling the resources
          used by the collector.</div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false">In our next test we will use:</div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">ParallelGCThreads = CPU request * 0.6</span></span><br>
        </div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">ConcGCThreads = CPU request * 0.25</span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;"><br>
            </span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">This is under the assumption that the
              system is dimensioned so that it does not exceed the requested CPU</span></span></div>
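        <div dir="ltr" data-setdir="false">To make the arithmetic
          concrete, here is a minimal sketch that applies the 60% and 25%
          ratios quoted above to a given CPU count. The function name is
          hypothetical, and rounding down with a floor of 1 is an
          assumption; the JVM's own rounding may differ.
          <pre>
```python
def gc_thread_flags(cpu_count: int) -> list[str]:
    """Derive GC thread flags from a CPU count using the ratios in this thread.

    Integer percent arithmetic is used so the result does not depend on
    floating-point rounding. Rounding down is an assumption.
    """
    parallel = max(1, cpu_count * 60 // 100)    # CPU count * 0.6
    concurrent = max(1, cpu_count * 25 // 100)  # CPU count * 0.25
    return [
        f"-XX:ParallelGCThreads={parallel}",
        f"-XX:ConcGCThreads={concurrent}",
    ]


# Compare the CPU request (32) with the CPU limit (64).
for cpus in (32, 64):
    print(cpus, gc_thread_flags(cpus))
```
          </pre>
          For the request of 32 this gives 19 and 8; for the limit of 64
          it would give 38 and 16.</div>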
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;"><br>
            </span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">Does it make sense? Any other
              suggestion?</span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;"><br>
            </span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">Regards,</span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;"><br>
            </span></span></div>
        <div dir="ltr" data-setdir="false"><span><span style="color:
              rgb(0, 0, 0); font-family: Helvetica Neue, Helvetica,
              Arial, sans-serif;">Evaristo</span></span></div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div dir="ltr" data-setdir="false"><br>
        </div>
        <div><br>
        </div>
      </div>
      <div id="yahoo_quoted_8891239309" class="yahoo_quoted">
        <div style="font-family:'Helvetica Neue', Helvetica, Arial,
          sans-serif;font-size:13px;color:#26282a;">
          <div> On Monday, 13 March 2023, 10:18:01 CET, Peter Booth
            <a class="moz-txt-link-rfc2396E" href="mailto:peter_booth@me.com">&lt;peter_booth@me.com&gt;</a> wrote: </div>
          <div><br>
          </div>
          <div><br>
          </div>
          <div>
            <div id="yiv2000093508">
              <div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508">The default Geode heartbeat
                  timeout interval is 5 seconds, which is an eternity.</div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508">Some points/questions:</div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508">I’d recommend using either
                  Solarflare’s sysjitter tool or Gil Tene’s jhiccup to
                  quantify your OS jitter</div>
                <div class="yiv2000093508">Can you capture the output of
                  vmstat 1 60 ?</div>
                <div class="yiv2000093508">What kind of EC2 instances
                  are you using? Are they RHEL?</div>
                <div class="yiv2000093508">How much physical RAM does
                  each instance have?</div>
                <div class="yiv2000093508">
                  <div class="yiv2000093508">Do you have THP enabled?</div>
                  <div class="yiv2000093508">What is the value of
                    vm.min_free_kbytes ?</div>
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                How often do you see missed heartbeats?
                <div class="yiv2000093508">For how long do you
                  see the Adjusting Workers messages?</div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                </div>
                <div id="yiv2000093508yqt69951" class="yiv2000093508yqt5015505977">
                  <div class="yiv2000093508"><br class="yiv2000093508" clear="none">
                    <div><br class="yiv2000093508" clear="none">
                      <blockquote type="cite" class="yiv2000093508">
                        <div class="yiv2000093508">On Mar 13, 2023, at
                          4:42 AM, Evaristo José Camarero &lt;<a rel="nofollow noopener noreferrer" shape="rect" ymailto="mailto:evaristojosec@yahoo.es" target="_blank" href="mailto:evaristojosec@yahoo.es" class="yiv2000093508 moz-txt-link-freetext" moz-do-not-send="true">evaristojosec@yahoo.es</a>&gt;
                          wrote:</div>
                        <br class="yiv2000093508Apple-interchange-newline" clear="none">
                        <div class="yiv2000093508">
                          <div class="yiv2000093508">
                            <div style="font-family:Helvetica Neue,
                              Helvetica, Arial,
                              sans-serif;font-size:13px;" class="yiv2000093508yahoo-style-wrap">
                              <div dir="ltr" class="yiv2000093508">Hi
                                there,</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">We
                                are trying the latest ZGC EAB to test an
                                Apache Geode cluster (a distributed
                                key-value store with use cases similar to
                                Apache Cassandra).</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">We
                                are using K8s, and we have pods with a
                                32-core request (60-core limit) per
                                data node, with a 150 GB heap per node.</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">
                                <div class="yiv2000093508">
                                  <div class="yiv2000093508">          -
                                    -Xmx152000m</div>
                                  <div class="yiv2000093508">          -
                                    -Xms152000m</div>
                                  <div class="yiv2000093508">          -
                                    -XX:+UseZGC</div>
                                  <div class="yiv2000093508">          -
                                    -XX:SoftMaxHeapSize=136000m</div>
                                  <div class="yiv2000093508"><span style="white-space:pre-wrap;" class="yiv2000093508">         </span> -
                                    -XX:ZAllocationSpikeTolerance=4.0 
                                     // We have some spiky workloads
                                    periodically </div>
                                  <div class="yiv2000093508">          -
                                    -XX:+UseNUMA<br class="yiv2000093508" clear="none">
                                  </div>
                                </div>
                                <br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">ZGC
                                is working great with regard to GC pauses,
                                with no allocation stalls
                                almost all of the time. We observe higher
                                CPU utilization than G1 (around 20% for
                                a heavy workload, using the flag <span class="yiv2000093508">-XX:ZAllocationSpikeTolerance=4.0,
                                  which may be making ZGC hungrier
                                  than needed. We will experiment further with
                                  this</span>).</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">BUT
                                from time to time we see that Geode
                                clusters are missing heartbeats between
                                nodes, and the Geode logic shuts down the
                                JVM of the node with the missing
                                heartbeats. We believe that the main
                                problem could be CPU starvation, because a
                                few seconds before this happens we observe
                                ZGC using more workers to get the job done<br class="yiv2000093508" clear="none">
                                <br class="yiv2000093508" clear="none">
                                <div class="yiv2000093508"><span dir="ltr" class="yiv2000093508ydp64d0df88ui-provider
                                    yiv2000093508ydp64d0df88h
                                    yiv2000093508ydp64d0df88r
                                    yiv2000093508ydp64d0df88k
                                    yiv2000093508ydp64d0df88u
                                    yiv2000093508ydp64d0df88ag
                                    yiv2000093508ydp64d0df88d
                                    yiv2000093508ydp64d0df88ab
                                    yiv2000093508ydp64d0df88n
                                    yiv2000093508ydp64d0df88x
                                    yiv2000093508ydp64d0df88g
                                    yiv2000093508ydp64d0df88q
                                    yiv2000093508ydp64d0df88aj
                                    yiv2000093508ydp64d0df88ae
                                    yiv2000093508ydp64d0df88j
                                    yiv2000093508ydp64d0df88t
                                    yiv2000093508ydp64d0df88c
                                    yiv2000093508ydp64d0df88m
                                    yiv2000093508ydp64d0df88ah
                                    yiv2000093508ydp64d0df88w
                                    yiv2000093508ydp64d0df88ac
                                    yiv2000093508ydp64d0df88f
                                    yiv2000093508ydp64d0df88p
                                    yiv2000093508ydp64d0df88z
                                    yiv2000093508ydp64d0df88i
                                    yiv2000093508ydp64d0df88ak
                                    yiv2000093508ydp64d0df88s
                                    yiv2000093508ydp64d0df88ve
                                    yiv2000093508ydp64d0df88b
                                    yiv2000093508ydp64d0df88af
                                    yiv2000093508ydp64d0df88l
                                    yiv2000093508ydp64d0df88v
                                    yiv2000093508ydp64d0df88e
                                    yiv2000093508ydp64d0df88o
                                    yiv2000093508ydp64d0df88ai
                                    yiv2000093508ydp64d0df88y">[2023-03-12T18:29:38.980+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 2<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:39.781+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 3<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.181+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 4<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.382+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 5<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.582+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 6<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.782+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 7<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.882+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 8<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:40.982+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 10<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:41.083+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 13<br class="yiv2000093508" clear="none">
                                    [2023-03-12T18:29:41.183+0000]
                                    Adjusting Workers for Young
                                    Generation: 1 -> 16</span></div>
                                <br class="yiv2000093508" clear="none">
                              </div>
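                              <div dir="ltr" class="yiv2000093508">To
                                quantify how fast the workers ramp up, a
                                small script along these lines could be run
                                over the GC log. This is only a sketch: it
                                assumes each log entry sits on a single line
                                in exactly the form shown above, the function
                                name is hypothetical, and the sample contains
                                just three of the lines from this message.
                                <pre>
```python
import re
from datetime import datetime

# Three of the log lines quoted above, one entry per line (assumed layout).
LOG = """\
[2023-03-12T18:29:38.980+0000] Adjusting Workers for Young Generation: 1 -> 2
[2023-03-12T18:29:40.181+0000] Adjusting Workers for Young Generation: 1 -> 4
[2023-03-12T18:29:41.183+0000] Adjusting Workers for Young Generation: 1 -> 16
"""

PATTERN = re.compile(
    r"\[([^\]]+)\] Adjusting Workers for Young Generation: (\d+) -> (\d+)"
)


def worker_ramp(log_text: str):
    """Return (seconds from first to last adjustment, final worker count).

    Returns None when the text contains no matching log entries.
    """
    events = []
    for match in PATTERN.finditer(log_text):
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
        events.append((stamp, int(match.group(3))))
    if not events:
        return None
    elapsed = (events[-1][0] - events[0][0]).total_seconds()
    return elapsed, events[-1][1]


print(worker_ramp(LOG))
```
                                </pre>
                                For the lines above, the ramp from 2 to 16
                                workers takes about 2.2 seconds.</div>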
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">As
                                mentioned, we are using K8s with pods
                                that have a 32-core request and a 60-core
                                limit (and it is also true that
                                our K8s workers are close to the limit
                                in CPU utilization). At startup, ZGC
                                assumes that the machine has 60 cores (as
                                logged). What is the best way to
                                hint to ZGC that it should
                                be tuned for a host with 32 cores
                                (the 60-core limit is basically there just to
                                avoid K8s throttling)? Is it
                                the ParallelGCThreads flag? Any other
                                thoughts?</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">Regards,</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                              <div dir="ltr" class="yiv2000093508">Evaristo</div>
                              <div dir="ltr" class="yiv2000093508"><br class="yiv2000093508" clear="none">
                              </div>
                            </div>
                          </div>
                        </div>
                      </blockquote>
                    </div>
                    <br class="yiv2000093508" clear="none">
                  </div>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>