Configurable G1 heap expansion aggressiveness
Jaroslaw Odzga
jarek.odzga at gmail.com
Sun Feb 9 19:54:20 UTC 2025
Context and Motivation
In multi-tenant environments, e.g. Kubernetes clusters in the cloud,
there is a strong incentive to use as little memory as possible. Lower
memory usage means more processes can be packed onto a single VM,
which directly translates to lower cloud cost.
Configuring the G1 heap size in this setup is currently challenging.
On the one hand, we would like to set the max heap size to a high
value so that the application does not fail with a heap OOME when
faced with unexpectedly high load or organic growth. On the other
hand, we need to set the max heap size to as small a value as possible
because G1 is very eager to expand the heap even when tuned to collect
garbage aggressively.
Ideally, we would like to:
- Set the initial heap size to a small value.
- Set the max heap size to a value larger than expected usage so that
the application can handle unexpected load and organic growth.
- Configure G1 GC to not expand the heap aggressively. This is
currently not possible.
We propose two new JVM G1 flags that would give us more control over
G1 heap expansion aggressiveness and realize significant cost savings
in multi-tenant environments.
At the same time we don’t want to change existing G1 behavior - with
default values of the new flags current G1 behavior would be
maintained.
Analysis
Currently, even with a very aggressive G1 configuration such as:
-XX:-G1UseAdaptiveIHOP -XX:InitiatingHeapOccupancyPercent=20
-XX:GCTimeRatio=4 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=60
the heap is expanded fairly eagerly.
We found two culprits responsible for this in the
G1HeapSizingPolicy::young_collection_expansion_amount() function.
First, the scale_with_heap() function scales pause_time_threshold down
when the current heap size is less than half of the max heap size.
While this is likely the desired behavior in many situations, it also
causes memory usage spikes when the max heap size is much larger than
the current heap size, because a lower threshold is easier to exceed.
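The scaling described above can be sketched as a standalone function.
This is a simplified model under stated assumptions, not the HotSpot
source: capacity and max_capacity are passed in as plain parameters,
and the 1% floor on the scaled threshold is an assumption of this
sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified model of the scaling step; not the HotSpot source.
// The 1% floor (0.01) is an assumption of this sketch.
double scale_with_heap(double pause_time_threshold,
                       std::uint64_t capacity,
                       std::uint64_t max_capacity) {
  assert(max_capacity >= 2 && capacity <= max_capacity);
  const std::uint64_t half_max = max_capacity / 2;
  double threshold = pause_time_threshold;
  if (capacity <= half_max) {
    // The smaller the heap relative to half of the max heap size,
    // the lower the threshold becomes.
    threshold *= static_cast<double>(capacity) / static_cast<double>(half_max);
    threshold = std::max(threshold, 0.01);
  }
  return threshold;
}
```

Under this model, a base threshold of 0.2 with a 1 GiB heap and a
16 GiB max heap is scaled to roughly 0.025, so pauses that take only a
few percent of elapsed time already count as over the threshold.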
Second, the MinOverThresholdForGrowth constant, equal to 4, is an
arbitrary value which hardcodes the heap expansion aggressiveness. We
observed that short_term_pause_time_ratio can exceed
pause_time_threshold often enough to trigger heap expansion too
eagerly in many situations, especially when the allocation rate is
spiky.
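The counting behavior can be illustrated with a small sketch. The
ExpansionTrigger class and its member names are hypothetical and exist
only for illustration. The sketch assumes the count is reset once an
expansion is requested, and it deliberately omits the long-term pause
time ratio check and any window-based reset of the counter.

```cpp
#include <cassert>

// Hypothetical stand-in for the counting logic; not the HotSpot class.
class ExpansionTrigger {
 public:
  explicit ExpansionTrigger(unsigned min_over_threshold_for_growth)
      : min_over_(min_over_threshold_for_growth), over_count_(0) {
    assert(min_over_ > 0);
  }

  // Record one young-collection pause. Returns true when enough pauses
  // have exceeded the threshold to request a heap expansion.
  bool record_pause(double short_term_pause_time_ratio, double threshold) {
    if (short_term_pause_time_ratio > threshold) {
      ++over_count_;
    }
    if (over_count_ >= min_over_) {
      over_count_ = 0;  // start counting afresh after an expansion request
      return true;
    }
    return false;
  }

 private:
  unsigned min_over_;
  unsigned over_count_;
};
```

With a threshold of 0.025 and a spiky sequence of ratios such as 0.03,
0.01, 0.05, 0.04, 0.03, a trigger constructed with 4 requests an
expansion on the fifth pause, while one constructed with 10 does not.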
Proposal
We would like to introduce two new experimental flags:
- G1ScaleWithHeapPauseTimeThreshold: a boolean flag that would allow
disabling scale_with_heap().
- G1MinPausesOverThresholdForGrowth: a value between 1 and 10, a
configurable replacement for the MinOverThresholdForGrowth constant.
We don’t want to change the default behavior of G1. Default values for
these flags (G1ScaleWithHeapPauseTimeThreshold=true,
G1MinPausesOverThresholdForGrowth=4) would maintain the existing
behavior.
Alternatives
There is currently no good alternative. Potentially, we could
configure G1 aggressively to trigger GC very frequently, e.g.:
-XX:-G1UseAdaptiveIHOP -XX:InitiatingHeapOccupancyPercent=20
-XX:GCTimeRatio=4 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=60
Even with this configuration we see occasional large memory spikes
where the heap is quickly expanded. Even though the expanded heap
contracts eventually, this poses a significant problem because in
practice we don't know whether such a spike could have been avoided,
so it is not obvious how much memory the application really needs. Of
course, such a configuration also consumes more CPU.
Experimental results
We tested this change on a patched JDK 17.
With the new flags we can use a far less aggressive -XX:GCTimeRatio=9
together with -XX:-G1ScaleWithHeapPauseTimeThreshold and
-XX:G1MinPausesOverThresholdForGrowth=10 (this effectively disables
heap expansion based on the short-term pause time ratio, so expansion
depends only on the long-term pause time ratio).
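To put the two GCTimeRatio values in perspective, here is a minimal
sketch of how the unscaled threshold relates to the flag. It assumes
the threshold is computed as 1 / (1 + GCTimeRatio), which is our
understanding of the G1 sizing policy rather than something stated in
this proposal; the function name is hypothetical.

```cpp
#include <cassert>

// Base pause time threshold as a fraction of elapsed time, assuming
// threshold = 1 / (1 + GCTimeRatio).
double pause_time_threshold(unsigned gc_time_ratio) {
  assert(gc_time_ratio > 0);
  return 1.0 / (1.0 + static_cast<double>(gc_time_ratio));
}
```

Under that assumption, GCTimeRatio=4 corresponds to a threshold of
0.20 and GCTimeRatio=9 to 0.10, before any scaling by heap size.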
Compared to the more aggressive G1 configuration mentioned above, we
see lower CPU usage and 30%-60% lower max memory usage.
Implementation
https://github.com/openjdk/jdk/pull/23534