Container-aware heap sizing for OpenJDK

Volker Simonis volker.simonis at gmail.com
Mon Oct 17 16:42:56 UTC 2022


On Thu, Sep 22, 2022 at 11:16 AM Man Cao <manc at google.com> wrote:
>
> Hi all,
>
> Great to see so many responses in this thread! I work with Jonathan and reviewed all design and implementation aspects of AHS.
>
> Adding a few more comments to Jonathan's response:
> We were aware of the CurrentMaxHeapSize [1] and SoftMaxHeapSize [2] proposals when we started the design of AHS. We designed AHS so that its main logic is outside of the JVM, and the main logic only interacts with the JVM through manageable JVM flags. In this way we could easily converge on using CurrentMaxHeapSize / SoftMaxHeapSize for AHS once they are implemented. Our AHS implementation has to target JDK 11 for our internal deployment, so we have to implement our own version of "Current maximum heap expansion size" and "Current target heap size" in JDK 11 first.
>
> Now it seems a great time to get [1] and [2] implemented, so AHS can actually converge. Are there recent prototypes for them? What are the main problems in those prototypes? Jonathan and I could take a look and help get them implemented.
>
> The AHS main logic that sets the manageable JVM flags currently resides in a native thread started by our own Java launcher. [3] describes what our launcher does.
> However, the main logic could be implemented in other places, such as a Java agent, a JNI library, or another process. We could do some rework to make the main logic a JNI library, then open-source it.
> We also agree with Thomas that the AHS main logic is not suitable to be integrated into the JVM, as it varies greatly depending on the deployment environment (e.g., the container technology, OS, additional external signals that should be taken into account).
>
> Besides [1] and [2], another required JVM feature is monitoring metrics for GC CPU time. We added hsperfdata counters in our JDK 11 for this, and I will create RFEs to upstream them. After all, it seems unlikely that AHS needs another JEP.
>

Hi Man, Jonathan,

You're certainly aware of the existing "sun.gc.collector.{0,1,2}.time"
metric. Do I understand you correctly that this (I think it is "wall
clock time") was not enough for your needs and you introduced a new GC
CPU time metric? Is it the sum of CPU time for all the GC threads? I
think for concurrent GCs like G1 that makes much more sense.
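
For reference, a minimal sketch of what is already observable in-process
through the standard java.lang.management API. This is not the hsperfdata
counter mentioned above and not the new metric from Man's JDK 11 build:
GarbageCollectorMXBean.getCollectionTime() reports approximate accumulated
elapsed time in milliseconds, not CPU time summed over GC threads. The
class name GcElapsedTime and the helper names are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcElapsedTime {

    /**
     * Sums the accumulated collection elapsed time (milliseconds) over all
     * collectors. getCollectionTime() returns -1 when the value is
     * undefined for a collector, so non-positive values are skipped.
     */
    static long totalGcElapsedMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Elapsed GC time as a fraction of JVM uptime; 0.0 if uptime is not positive. */
    static double fraction(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : (double) gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        // Print the per-collector values, then the aggregate.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + " elapsedMs=" + gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcMillis = totalGcElapsedMillis();
        System.out.println("total GC elapsed ms = " + gcMillis
                + ", fraction of uptime = " + fraction(gcMillis, uptime));
    }
}
```

Because this is elapsed time, it says nothing about how many GC threads
were running during that time, which is why a per-thread CPU time sum
would be the more useful number for a concurrent collector.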

I'd be very interested in an RFE/PR with this change.

Thank you and best regards,
Volker

> @Thomas
> > SPECjbb2015 is a standard industry Java VM performance benchmark
> > (https://www.spec.org/jbb2015/). Unfortunately it's commercial.
> > ...
> > b) and during measurement phase, SPECjbb2015 has a peculiar way of
> > trying to find maximum sustainable load.
> > ...
> Thanks for the details on SPECjbb2015!
> We did try SPECjbb2015 for benchmarking earlier (Jonathan was not involved at that time), but did not pursue running it regularly as we did not find it representative enough of our typical server workload.
> The peculiar way of finding maximum load (a sudden allocation spike after a quiescent period) does sound problematic for most GC heuristics. It does NOT sound like an important case to optimize in the real world. Real-world server workloads typically ramp up and down gradually, corresponding to daily traffic patterns. Batch and desktop applications are more likely to have phased behavior, but such phase changes should last only for a short period.
>
>
> @Kirk for tuning young-gen vs old-gen sizing with AHS:
> I agree that manually tuning young-gen and old-gen sizing would give the applications the optimal GC performance, i.e. low GC CPU overhead.
> However, the goal of AHS is not to optimize GC CPU or pause performance. AHS's goal is to make the JVM's memory behavior play nicely in a containerized environment, at a small cost in GC CPU or pause overhead. So I'm not sure whether it is a good idea, or even feasible, to make AHS configure young/old sizing as well. It would require making JVM flags such as MaxGCPauseMillis, NewSize, MaxNewSize, and InitiatingHeapOccupancyPercent manageable as well, which may be challenging in itself.
>
> [1] https://bugs.openjdk.org/browse/JDK-8204088
> [2] https://bugs.openjdk.org/browse/JDK-8236073
> [3] https://github.com/openjdk/jdk/pull/9955#issuecomment-1223053711
>
> -Man
