Configurable G1 heap expansion aggressiveness

Thomas Schatzl thomas.schatzl at oracle.com
Thu Feb 13 10:48:51 UTC 2025


Hi Jaroslaw,

   thank you for contributing and speaking up with an itch of yours!

The motivation and analysis are spot on: we agree that the 
aggressiveness of G1 heap expansion, paired with its reluctance to give 
back memory, can make it hard to configure G1 as you would want in this 
situation.

However, we do not think that the proposed solution (adding even more 
customizability) is where we want to go.

More background below, inline:

On 09.02.25 20:54, Jaroslaw Odzga wrote:
> Context and Motivation
> In multi-tenant environments e.g. Kubernetes clusters in cloud
> environments there is a strong incentive to use as little memory as
> possible. Lower memory usage means more processes can be packed on a
> single VM which directly translates to lower cloud cost.
> Configuring G1 heap size in this setup is currently challenging. On
> the one hand we would like to set the max heap size to a high value so
> that application doesn’t fail with heap OOME when faced with
> unexpectedly high load or organic growth. On the other hand we need to
> set max heap size to as small a value as possible because G1 is very
> eager to expand heap even when tuned to collect garbage aggressively.
> 
> Ideally, we would like to:
> - Set the initial heap size to a small value.
> - Set the max heap size to a value larger than expected usage so that
> application can handle unexpected load and organic growth.
> - Configure G1 GC to not expand heap aggressively. This is currently
> not possible.
> 
> We propose two new JVM G1 flags that would give us more control over
> G1 heap expansion aggressiveness and realize significant cost savings
> in multi-tenant environments.

Understood.

We are generally very reluctant to expose more flags in basically any 
collector due to the maintenance overhead. We understand that these are 
experimental flags that can be removed at a whim, but removing them 
if/when they are in use is still awkward.


> At the same time we don’t want to change existing G1 behavior - with
> default values of the new flags current G1 behavior would be
> maintained.
> 
> Analysis
> Currently even with very aggressive G1 configuration such as:
> -XX:-G1UseAdaptiveIHOP -XX:InitiatingHeapOccupancyPercent=20
> -XX:GCTimeRatio=4 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=60
> the heap is fairly eagerly expanded.
> 
> We found two culprits responsible for this in
> G1HeapSizingPolicy::young_collection_expansion_amount() function.
> First, the scale_with_heap() function makes pause_time_threshold small
> in cases where current heap size is smaller than 1/2 of max heap size.
> While it is likely a desired behavior in many situations, it also
> causes memory usage spikes in situations where max heap size is much
> larger than current heap size.
> Second, the MinOverThresholdForGrowth constant equal to 4 is an
> arbitrary value which hardcodes the heap expansion aggressiveness. We
> observed that short_term_pause_time_ratio can exceed
> pause_time_threshold and trigger heap expansion too eagerly in many
> situations, especially when allocation rate is spiky.
> 
> Proposal
> We would like to introduce two new experimental flags:
> - G1ScaleWithHeapPauseTimeThreshold: a binary flag that would allow
> disabling scale_with_heap()
> - G1MinPausesOverThresholdForGrowth: a value between 1 and 10, a
> configurable replacement for the MinOverThresholdForGrowth constant.
> 
> We don’t want to change the default behavior of G1. Default values for
> these flags (G1ScaleWithHeapPauseTimeThreshold=true,
> G1MinPausesOverThresholdForGrowth=4) would maintain the existing
> behavior.
> 
> Alternatives
> There is currently no good alternative. Potentially we could configure
> G1 aggressively to trigger GC very frequently e.g.:
> -XX:-G1UseAdaptiveIHOP -XX:InitiatingHeapOccupancyPercent=20
> -XX:GCTimeRatio=4 -XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=60
> Even with this configuration we see occasional large memory spikes
> where heap is quickly expanded. Even though the expanded heap
> contracts eventually, this poses a significant problem because in
> practice we don’t know if such a spike could have been avoided so it
> is not obvious how much memory the application really needs. Of course
> such configuration would also consume more CPU.

The suggestion changes

a) the aggressiveness of expansion once it has been decided that G1 
should expand (G1ScaleWithHeapPauseTimeThreshold); looking at this 
particular piece of code, the current behavior actually seems strange 
and unexpected: the user sets a GCTimeRatio, yet for some reason G1 is 
allowed to basically override it to a large extent.

The reason is mostly historical: I collected thoughts in 
https://bugs.openjdk.org/browse/JDK-8349978.

Note that just removing this behavior has quite a few unintended 
consequences as heap sizing is very much interconnected with general 
performance behavior.
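
To make the effect in (a) concrete, here is a minimal sketch of the 
kind of scaling being discussed. It is not the HotSpot source: the class 
and method names, the linear scaling and the 1% floor are assumptions 
made for illustration, and only the 1 / (1 + GCTimeRatio) relation is 
the documented meaning of the flag.

```java
// Simplified model of the threshold scaling discussed above; NOT HotSpot source.
final class PauseTimeThresholdSketch {
    // Assumed floor for the scaled threshold (illustrative value).
    static final double MIN_THRESHOLD = 0.01;

    // Documented meaning of GCTimeRatio: the GC time goal is 1 / (1 + ratio) of total time.
    static double baseThreshold(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    // Assumption: at or below half of the max heap, the threshold shrinks
    // linearly with the current heap size, but never below MIN_THRESHOLD.
    static double scaleWithHeap(double threshold, long capacity, long maxCapacity) {
        long half = maxCapacity / 2;
        if (half > 0 && capacity <= half) {
            threshold *= (double) capacity / (double) half;
            threshold = Math.max(threshold, MIN_THRESHOLD);
        }
        return threshold;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        double base = baseThreshold(4); // as with -XX:GCTimeRatio=4
        System.out.printf("base threshold: %.3f%n", base);
        for (long capacity : new long[] {1 * gb, 4 * gb, 8 * gb, 12 * gb}) {
            System.out.printf("%2d GB of 16 GB: %.3f%n",
                    capacity / gb, scaleWithHeap(base, capacity, 16 * gb));
        }
    }
}
```

Under these assumptions, a 1 GB heap with a 16 GB maximum ends up with 
a threshold of 2.5% instead of the 20% the user asked for with 
GCTimeRatio=4, which is the kind of override described above.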

b) makes G1 lazier about deciding whether it needs to expand 
(G1MinPausesOverThresholdForGrowth) by increasing the number of 
consecutive GCs for which the measured GC time ratio needs to be over 
the threshold to cause expansion.
(That's just exposing an internal constant :))
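
As an illustration of (b), the following sketch counts consecutive 
pauses over the threshold and requests expansion once the count reaches 
a configurable minimum. It follows the description in this mail rather 
than the actual G1HeapSizingPolicy code; the class, its methods and the 
reset-after-expansion behavior are assumptions.

```java
// Simplified model of "N consecutive pauses over threshold"; NOT HotSpot source.
final class ExpansionTriggerSketch {
    private final int minPausesOverThreshold;
    private int pausesOverThreshold = 0;

    ExpansionTriggerSketch(int minPausesOverThreshold) {
        this.minPausesOverThreshold = minPausesOverThreshold;
    }

    // Records one GC; returns true if this sketch would request heap expansion.
    boolean recordPause(double pauseTimeRatio, double threshold) {
        if (pauseTimeRatio > threshold) {
            pausesOverThreshold++;
        } else {
            pausesOverThreshold = 0; // a GC under the threshold breaks the streak
        }
        if (pausesOverThreshold >= minPausesOverThreshold) {
            pausesOverThreshold = 0; // assumption: start counting afresh after expanding
            return true;
        }
        return false;
    }

    // Replays a series of measured ratios and counts the expansion requests.
    static int countExpansions(int minPauses, double threshold, double[] ratios) {
        ExpansionTriggerSketch trigger = new ExpansionTriggerSketch(minPauses);
        int expansions = 0;
        for (double ratio : ratios) {
            if (trigger.recordPause(ratio, threshold)) {
                expansions++;
            }
        }
        return expansions;
    }

    public static void main(String[] args) {
        // A spiky series: a short burst over the threshold, a dip, then a longer burst.
        double[] ratios = {0.3, 0.3, 0.3, 0.3, 0.1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3};
        System.out.println("min=4: " + countExpansions(4, 0.2, ratios) + " expansion(s)");
        System.out.println("min=8: " + countExpansions(8, 0.2, ratios) + " expansion(s)");
    }
}
```

A larger minimum makes the sketch ignore short bursts, which is the 
"lazier" behavior the proposed flag is after.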


These changes cover expansion behavior, but not shrinking again. I 
believe that the other slew of options mentioned above

(-XX:-G1UseAdaptiveIHOP -XX:InitiatingHeapOccupancyPercent=20
-XX:MinHeapFreeRatio=20 -XX:MaxHeapFreeRatio=60)

is still needed to keep the heap stable and shrinking again over time 
(it may work with just changing GCTimeRatio in your particular case).
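
To check whether a given set of options actually keeps the heap stable 
and lets it shrink again, one can sample the committed heap size from 
inside the application. The sketch below uses the standard 
java.lang.management API; the class name, the sampling interval and the 
output format are arbitrary choices for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Prints used/committed/max heap sizes a few times, one second apart.
final class HeapCommitWatcher {
    // Takes one sample of the current heap usage from the platform MXBean.
    static MemoryUsage sample() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // Formats a sample in MB; getMax() may be -1 if undefined, which prints as 0 here.
    static String describe(MemoryUsage heap) {
        long mb = 1024L * 1024L;
        return String.format("used=%dMB committed=%dMB max=%dMB",
                heap.getUsed() / mb, heap.getCommitted() / mb, heap.getMax() / mb);
    }

    public static void main(String[] args) throws InterruptedException {
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 5;
        for (int i = 0; i < samples; i++) {
            System.out.println(describe(sample()));
            Thread.sleep(1000);
        }
    }
}
```

The "committed" value is the one that reflects heap expansion and 
shrinking; "used" only shows occupancy within the committed heap.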

That seems awfully complicated for an end user, and indicative of 
papering over the problem. We would like to avoid this.


As Kirk indicates in his other email in this thread, there is work 
underway to make the VM (and G1) aware of other memory consumers in the 
VM. I am not sure if that would also fix your problem in a more user 
friendly (and hopefully generic) way.



Wouldn't an option to make G1 keep GCTimeRatio better (e.g. 
https://bugs.openjdk.org/browse/JDK-8238687), and/or some configurable 
soft heap size goal (https://bugs.openjdk.org/browse/JDK-8236073) that 
the collector tries to keep, also solve your issue while being easier 
to configure?

(There're a lot of connected problems in the bug tracker, so make sure 
to follow related issues).

Maybe you are interested and can find something to work on in that area; 
there has actually already been a lot of investigation (and some 
resulting, unfinished patches), so feel free to ask.

Thanks,
   Thomas

Fwiw, we tried to label issues related to this area, see 
https://bugs.openjdk.org/issues/?jql=labels%20%3D%20gc-g1-heap-resizing .

