RFR: 8238687: Investigate memory uncommit during young collections in G1 [v4]

Thomas Schatzl tschatzl at openjdk.org
Mon Jun 23 10:13:35 UTC 2025


On Thu, 19 Jun 2025 12:26:15 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:

>> Hi all,
>> 
>> Please review this change to the G1 heap resizing policy, aimed at improving alignment with the configured GCTimeRatio. The GCTimeRatio is intended to manage the balance between GC time and application execution time. G1's current implementation of GCTimeRatio appears to have drifted from its intended purpose over time. Therefore, we need to change G1’s use of the GCTimeRatio to better manage heap sizes without relying on additional magic constants.
>> 
>> The primary goal is to enable both heap expansion and shrinking at the end of any GC, rather than limiting shrinking to only the Remark or Full GC pauses as is currently done. We achieve this using heuristics that monitor both short-term and long-term GC time ratios relative to the configured GCTimeRatio.
>> 
>> - The short-term policy adjusts a counter based on whether recent GC time is above or below a target range around GCTimeRatio (as defined by G1MinimumPercentOfGCTimeRatio). When the counter crosses predefined thresholds, the heap may be expanded or shrunk accordingly.
>> 
>> - The long-term policy evaluates the GC time ratio over a long-term interval and triggers resizing if the number of recorded ratios exceeds a threshold and the GC time ratio over the long-term interval is outside the target range.
>> 
>> - These heuristics allow for responsive heap resizing (both expansion and shrinking) at the end of any GC, guided by actual GC performance rather than fixed thresholds or constants.
>> 
>> We are increasing the default GCTimeRatio from 12 to 24, since under the new policy, the current default leads to overly aggressive heap shrinking as the GCTimeRatio allows for a lot more GC overhead.
>> 
>> Additionally, we are removing the heap resizing step at the end of the Remark pause which was based on MinHeapFreeRatio and MaxHeapFreeRatio. We keep this MinHeapFreeRatio-MaxHeapFreeRatio based resizing logic at the end of Full GC and Remark pauses that may have been triggered by PeriodicGCs.
>> 
>> As a result of these changes, some applications may settle at more appropriate and in some cases smaller heap sizes for the configured GCTimeRatio. While this may appear as a regression in some benchmarks that are sensitive to heap size, it represents more accurate G1 behavior with respect to the GCTimeRatio. Although smaller heap sizes may lead to more frequent GCs, this is the expected outcome, provided the cumulative GC overhead remains within the limits defined by the GCTimeRatio.
>> 
>> Testing: Mach5 ...
>
> Ivan Walulya has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Albert suggestions

Some minor nits I think.

src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 61:

> 59: 
> 60:   assert(G1ShortTermShrinkThreshold <= long_term_count_limit(),
> 61:          "Shrink threshold count must be less than %u", long_term_count_limit());

I would prefer these checks to be part of argument processing.
I see that below we just ignore what the user specified for `G1ShortTermShrinkThreshold` when determining `ThresholdForShrink` if it is too high, but I would prefer to fail early if the diagnostic option is out of bounds.
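
A minimal sketch of what failing early could look like, written as standalone C++ rather than against the actual HotSpot flag machinery. The names `FlagCheck` and `check_short_term_shrink_threshold` are hypothetical, and passing the limit in as a plain parameter is an assumption made to keep the example self-contained:

```cpp
#include <cstdio>

// Hypothetical result type; not HotSpot's own flag-constraint API.
enum class FlagCheck { Ok, OutOfBounds };

// Rejects an out-of-bounds value up front instead of silently clamping it
// later. `long_term_count_limit` stands in for the policy's limit.
FlagCheck check_short_term_shrink_threshold(unsigned value,
                                            unsigned long_term_count_limit) {
  if (value > long_term_count_limit) {
    std::fprintf(stderr,
                 "G1ShortTermShrinkThreshold (%u) must be less than or equal to %u\n",
                 value, long_term_count_limit);
    return FlagCheck::OutOfBounds;
  }
  return FlagCheck::Ok;
}
```

With a check like this run during startup, the assert in the constructor would no longer be needed to catch a user-supplied value.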

src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 225:

> 223:   // - lower threshold, we do not want to go under.
> 224:   // - mid threshold, halfway between upper and lower threshold, represents the
> 225:   // actual target when resizing the heap.

Suggestion:

  // - actual pause time threshold, halfway between upper and lower threshold, represents the
  // actual target when resizing the heap.


("mid-threshold" is a remnant of the original change)

src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 335:

> 333:     // A resize has not been triggered, but the long term counter overflowed.
> 334:     decay_ratio_tracking_data();
> 335:     expand = true; // Does not matter.

Maybe we should always return `expand = false` when `resize_bytes == 0`? The log printing below could then also suppress the `expand: <bool>` part if `resize_bytes == 0`.
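
Roughly what I have in mind, as a standalone sketch. `ResizeDecision`, `normalize` and `describe` are hypothetical names, and the log text format is made up for illustration; the PR passes `resize_bytes` and `expand` around differently:

```cpp
#include <cstddef>
#include <string>

// Hypothetical pair of values describing a resize decision.
struct ResizeDecision {
  std::size_t resize_bytes;
  bool expand;
};

// Normalizes the decision so that "no resize" always reports expand == false.
ResizeDecision normalize(std::size_t resize_bytes, bool expand) {
  return ResizeDecision{resize_bytes, resize_bytes != 0 && expand};
}

// Builds the log text, leaving out the "expand" part when nothing is resized.
std::string describe(const ResizeDecision& d) {
  std::string msg = "resize_bytes: " + std::to_string(d.resize_bytes);
  if (d.resize_bytes != 0) {
    msg += d.expand ? " expand: true" : " expand: false";
  }
  return msg;
}
```

That way callers never see a meaningless `expand == true` together with a zero-sized resize.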

src/hotspot/share/gc/g1/g1_globals.hpp line 162:

> 160:   product(uint, G1ExpandByPercentOfAvailable, 20, EXPERIMENTAL,             \
> 161:           "When expanding, % of uncommitted space to claim.")               \
> 162:           range(0, 100)                                                     \

Suggestion:

          "When expanding, % of uncommitted space to expand the heap by in a single expand attempt.")               \
          range(0, 100)                                                     \

src/hotspot/share/gc/g1/g1_globals.hpp line 166:

> 164:   product(uint, G1ShrinkByPercentOfAvailable, 50, EXPERIMENTAL,           \
> 165:           "When shrinking, maximum % of free space to claim.")              \
> 166:           range(0, 100)                                                     \

Suggestion:

          "When shrinking, maximum % of free space to free for a single shrink attempt.")              \
          range(0, 100)                                                     \

src/hotspot/share/gc/g1/g1_globals.hpp line 169:

> 167:                                                                             \
> 168:   product(uint, G1MinimumPercentOfGCTimeRatio, 25, EXPERIMENTAL,          \
> 169:           "Percentage of GCTimeRatio G1 will try to avoid going below.")    \

Suggestion:

          "Determines lower and upper thresholds as percentage of GCTimeRatio. G1 compares these thresholds against the current gc cpu usage (gc time ratio?) to register too low or too high cpu usage events for heap resizing.")    \

src/hotspot/share/gc/g1/g1_globals.hpp line 174:

> 172:   product(uint, G1ShortTermShrinkThreshold, 8, EXPERIMENTAL,                \
> 173:           "Number of consecutive GCs with the short term gc time ratio"     \
> 174:           "below the threshold before we attempt to shrink.")               \

I think the description is somewhat confusing.

Suggestion:

          "Number of consecutive GCs where the current gc time ratio is "  \
          "below the lower threshold before G1 attempts to shrink.")               \

src/hotspot/share/gc/g1/g1_globals.hpp line 176:

> 174:           "below the threshold before we attempt to shrink.")               \
> 175:           range(0, 10)                                                      \
> 176:                                                                             \

I would make these flags diagnostic, not experimental. They might need to be tweaked, and a diagnostic flag sounds "safer" to hand to end users than an experimental one. It's not like they enable potentially unstable features.

Also, for symmetry I think we should provide a `G1ShortTermExpandThreshold` as well instead of the constant `MinOverThresholdForExpansion` embedded in the code.
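
A sketch of the symmetric shape, based only on the short-term policy as described in the PR text (a counter moved up or down depending on whether recent GC time is above or below the target range). `ShortTermCounter`, `Resize` and the reset-to-zero behavior after a trigger are assumptions for illustration, and both thresholds are assumed to be positive:

```cpp
// Hypothetical outcome of recording one GC.
enum class Resize { None, Expand, Shrink };

// Hypothetical short-term counter with two configurable thresholds.
class ShortTermCounter {
  int _count = 0;
  const int _expand_threshold;  // stand-in for a new G1ShortTermExpandThreshold
  const int _shrink_threshold;  // stand-in for G1ShortTermShrinkThreshold

public:
  ShortTermCounter(int expand_threshold, int shrink_threshold)
      : _expand_threshold(expand_threshold),
        _shrink_threshold(shrink_threshold) {}

  // gc_time_ratio is the recent fraction of time spent in GC; lower and upper
  // bound the target range. Values inside the range leave the counter alone.
  Resize record(double gc_time_ratio, double lower, double upper) {
    if (gc_time_ratio > upper) {
      _count++;
    } else if (gc_time_ratio < lower) {
      _count--;
    }
    if (_count >= _expand_threshold) {
      _count = 0;  // assumed: start over after triggering
      return Resize::Expand;
    }
    if (_count <= -_shrink_threshold) {
      _count = 0;
      return Resize::Shrink;
    }
    return Resize::None;
  }
};
```

With both limits passed in like this, the expansion side would be tunable in the same way as the shrink side instead of depending on an embedded constant.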

-------------

Changes requested by tschatzl (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25832#pullrequestreview-2949374369
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161197065
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161162327
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161208940
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161191299
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161188405
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161184827
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161187759
PR Review Comment: https://git.openjdk.org/jdk/pull/25832#discussion_r2161179316

