Container-aware heap sizing for OpenJDK
Man Cao
manc at google.com
Thu Sep 22 09:16:12 UTC 2022
Hi all,
Great to see so many responses in this thread! I work with Jonathan and
reviewed all design and implementation aspects of AHS.
Adding a few more comments to Jonathan's response:
We were aware of the CurrentMaxHeapSize [1] and SoftMaxHeapSize [2]
proposals when we started the design of AHS. We designed AHS so that its
main logic is outside of the JVM, and the main logic only interacts with
the JVM through manageable JVM flags. In this way we could easily converge
on using CurrentMaxHeapSize / SoftMaxHeapSize for AHS once they are
implemented. Our AHS implementation has to target JDK 11 for our internal
deployment, so we have to implement our own version of "Current maximum
heap expansion size" and "Current target heap size" in JDK 11 first.
Now seems like a great time to get [1] and [2] implemented, so that AHS can
actually converge on them. Are there recent prototypes for them? What are
the main problems in those prototypes? Jonathan and I could take a look and
help get them implemented.
The AHS main logic that sets the manageable JVM flags currently resides in
a native thread started by our own Java launcher. [3] describes what our
launcher does.
However, the main logic could be implemented in other places, such as a
Java agent, a JNI library, or another process. We could do some rework to
make the main logic a JNI library, then open-source it.
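
A minimal sketch, not the AHS implementation, of how logic living in a Java
agent or in application code could adjust a manageable flag through the
standard HotSpotDiagnosticMXBean. The class name ManageableFlags and the
helper trySet are hypothetical. SoftMaxHeapSize is used only as an example
flag; it is assumed to be absent or not manageable on some JDKs (including
stock JDK 11), in which case the helper returns false. CurrentMaxHeapSize is
assumed not to exist in upstream builds.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sets a JVM flag at runtime only if the running JVM
// reports it as writeable (that is, "manageable").
final class ManageableFlags {
    private static final HotSpotDiagnosticMXBean HOTSPOT =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

    static boolean trySet(String name, String value) {
        if (HOTSPOT == null) {
            return false; // no HotSpot diagnostic bean in this JVM
        }
        try {
            VMOption option = HOTSPOT.getVMOption(name);
            if (!option.isWriteable()) {
                return false; // flag exists but is not manageable
            }
            HOTSPOT.setVMOption(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            // Unknown flag, or the value was rejected by the JVM.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example only: ask for a soft limit of half the current max heap.
        long target = Runtime.getRuntime().maxMemory() / 2;
        boolean applied = trySet("SoftMaxHeapSize", Long.toString(target));
        System.out.println("SoftMaxHeapSize updated: " + applied);
    }
}
```

The same call is reachable over remote JMX, which is one way logic in
another process could drive the flag without any code inside the JVM.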
We also agree with Thomas that the AHS main logic is not suitable to be
integrated into the JVM, as it varies greatly depending on the deployment
environment (e.g., the container technology, OS, additional external
signals that should be taken into account).
Besides [1] and [2], another required JVM feature is monitoring metrics for
GC CPU time. We added hsperfdata counters in our JDK 11 for this, and I
will create RFEs to upstream them. Overall, it seems unlikely that AHS
needs another JEP.
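
For readers without such counters, the following is a rough sketch of a
proxy that stock JDKs already expose. It is not the hsperfdata counters
mentioned above, and it measures accumulated elapsed collection time as
reported by GarbageCollectorMXBean, not GC CPU time. The class name
GcTimeSampler and its methods are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical sampler: approximates GC overhead from elapsed collection
// time. This is only a proxy for GC CPU time, which it does not measure.
final class GcTimeSampler {
    // Sum of accumulated collection time over all collectors, in ms.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Fraction of a sampling window spent in collections.
    static double gcTimeFraction(long gcMillisDelta, long wallMillisDelta) {
        if (wallMillisDelta <= 0) {
            return 0.0;
        }
        return (double) gcMillisDelta / wallMillisDelta;
    }

    public static void main(String[] args) throws InterruptedException {
        long gcBefore = totalGcMillis();
        long wallBefore = System.currentTimeMillis();
        Thread.sleep(1000); // sampling window
        long gcDelta = totalGcMillis() - gcBefore;
        long wallDelta = System.currentTimeMillis() - wallBefore;
        System.out.println("GC time fraction: "
                + gcTimeFraction(gcDelta, wallDelta));
    }
}
```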
@Thomas
> SPECjbb2015 is a standard industry Java VM performance benchmark
> (https://www.spec.org/jbb2015/). Unfortunately it's commercial.
> ...
> b) and during measurement phase, SPECjbb2015 has a peculiar way of
> trying to find maximum sustainable load.
> ...
Thanks for the details on SPECjbb2015!
We did try SPECjbb2015 for benchmarking earlier (Jonathan was not involved
at that time), but did not pursue running it regularly as we did not find
it representative enough of our typical server workload.
The peculiar way of finding maximum load (sudden allocation spike after a
quiescence period) does sound problematic for most GC heuristics. It does
NOT sound like an important case to optimize in the real world. Real-world
server workloads typically ramp up and down gradually, following daily
traffic patterns. Batch and desktop applications are more likely to have
phased behavior, but such phase changes should only last for a short,
temporary period.
@Kirk for tuning young-gen vs old-gen sizing with AHS:
I agree that manually tuning young-gen and old-gen sizing would give
applications the optimal GC performance, i.e. low GC CPU overhead.
However, the goal of AHS is not to optimize GC CPU or pause performance.
AHS's goal is to make the JVM's memory behavior play nicely in a
containerized environment, at a small cost in GC CPU or pause overhead. So
I'm not sure if it is a good idea, or even feasible, to make AHS configure
young/old sizing as well. It would require making JVM flags such as
MaxGCPauseMillis, NewSize, MaxNewSize, InitiatingHeapOccupancyPercent
manageable as well, which may be challenging in itself.
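
As a small sketch of how to check this on a given JDK build, the snippet
below asks the running JVM whether each of those flags is currently
writeable at runtime. The class name FlagManageability and the method
report are hypothetical; the result depends on the JDK in use, so no
particular output is assumed.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports whether flags are manageable in this JVM.
final class FlagManageability {
    static Map<String, String> report(String... names) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : names) {
            if (bean == null) {
                result.put(name, "no HotSpot diagnostic bean");
                continue;
            }
            try {
                boolean writeable = bean.getVMOption(name).isWriteable();
                result.put(name, writeable ? "manageable" : "not manageable");
            } catch (IllegalArgumentException e) {
                result.put(name, "unknown flag"); // flag absent in this JVM
            }
        }
        return result;
    }

    public static void main(String[] args) {
        report("MaxGCPauseMillis", "NewSize", "MaxNewSize",
               "InitiatingHeapOccupancyPercent")
            .forEach((flag, status) -> System.out.println(flag + ": " + status));
    }
}
```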
[1] https://bugs.openjdk.org/browse/JDK-8204088
[2] https://bugs.openjdk.org/browse/JDK-8236073
[3] https://github.com/openjdk/jdk/pull/9955#issuecomment-1223053711
-Man