RFR: 8338977: Parallel: Improve heap resizing heuristics [v2]

Fri May 16 08:36:25 UTC 2025

On Sun, 11 May 2025 14:24:43 GMT, Guoxiong Li <gli at openjdk.org> wrote:

>> Albert Mingkun Yang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - review
>>  - Merge branch 'master' into pgc-size-policy
>>  - pgc-size-policy
>
> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 343:
> 
>> 341:       if (is_gc_overhead_limit_reached()) {
>> 342:         return nullptr;
>> 343:       }
> 
> It seems the parameter `gc_overhead_limit_was_exceeded` and the field `MemAllocator::Allocation::_overhead_limit_exceeded` are not used by all GCs now. Should we keep the parameter and set it as `true` under the condition `is_gc_overhead_limit_reached()`? For example:
> 
> 
>       if (op.prologue_succeeded()) {
>         assert(is_in_or_null(op.result()), "result not in heap");
>         if (is_gc_overhead_limit_reached()) {
>           *gc_overhead_limit_was_exceeded = true;
>           return nullptr;
>         }
>         return op.result();
>       }
> 
> 
> Or we should remove the parameter and the field in another PR.

Since this is not implemented by any other GCs, I think it's best to remove it in a follow-up PR.

> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 825:
> 
>> 823:   // If MinHeapFreeRatio is at its default value; shrink cautiously. Otherwise, users expect prompt shrinking.
>> 824:   if (FLAG_IS_DEFAULT(MinHeapFreeRatio) && MinHeapFreeRatio == 0) {
>> 825:     if (desired_capacity < current_capacity) {
> 
> I think curiously a lot about the condition `MinHeapFreeRatio == 0` and then I find the following code in `parallelArguments.cpp`. May it be better to use `UseAdaptiveSizePolicy && FLAG_IS_DEFAULT(MinHeapFreeRatio)` here instead of `FLAG_IS_DEFAULT(MinHeapFreeRatio) && MinHeapFreeRatio == 0`?
> 
> 
>   if (UseAdaptiveSizePolicy) {
>     // We don't want to limit adaptive heap sizing's freedom to adjust the heap
>     // unless the user actually sets these flags.
>     if (FLAG_IS_DEFAULT(MinHeapFreeRatio)) {
>       FLAG_SET_DEFAULT(MinHeapFreeRatio, 0);
>     }
>     if (FLAG_IS_DEFAULT(MaxHeapFreeRatio)) {
>       FLAG_SET_DEFAULT(MaxHeapFreeRatio, 100);
>     }
>   }

This method is invoked only when `UseAdaptiveSizePolicy == true`. Removed `MinHeapFreeRatio == 0`.

> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 862:
> 
>> 860:   resize_old_gen_after_full_gc();
>> 861:   young_gen()->resize_after_full_gc();
>> 862: }
> 
> The `PSYoungGen` has its methods `resize_after_full_gc` and `resize_after_young_gc`. I think such design is good. What about moving the method `resize_old_gen_after_full_gc` (and the related method `calculate_desired_old_gen_capacity`) to `PSOldGen` and renaming it as `resize_after_full_gc`?

`resize_old_gen_after_full_gc` contains some logic around `MinHeapFreeRatio` that makes it unsuitable to be placed inside old-gen, IMO. Given there is on-going work/discussion on removing/limiting MinHeapFreeRatio in https://bugs.openjdk.org/browse/JDK-8353716 in G1, I think we don't need to optimize for the structure too much, as it will probably be changed soon.

> src/hotspot/share/gc/parallel/psAdaptiveSizePolicy.cpp line 45:
> 
>> 43:      _avg_promoted(new AdaptivePaddedNoZeroDevAverage(AdaptiveSizePolicyWeight, PromotedPadding)),
>> 44:      _space_alignment(space_alignment),
>> 45:      _young_gen_size_increment_supplement(YoungGenerationSizeSupplement) {}
> 
> Typos in `gc_globals.hpp`(shown below): `YoungedGenerationSizeIncrement` and `YoungedGenerationSizeSupplement`. It should be fixed in another PR.
> 
>   product(uint, YoungGenerationSizeIncrement, 20,                           \
>           "Adaptive size percentage change in young generation")            \
>           range(0, 100)                                                     \
>                                                                             \
>   product(uint, YoungGenerationSizeSupplement, 80,                          \
>           "Supplement to YoungedGenerationSizeIncrement used at startup")   \ // <--- here
>           range(0, 100)                                                     \
>                                                                             \
>   product(uintx, YoungGenerationSizeSupplementDecay, 8,                     \
>           "Decay factor to YoungedGenerationSizeSupplement")                \ // <--- here
>           range(1, max_uintx)                                               \

Filed: https://bugs.openjdk.org/browse/JDK-8357109

> src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1104:
> 
>> 1102:     heap->post_full_gc_dump(&_gc_timer);
>> 1103: 
>> 1104:     size_policy->record_gc_pause_end_instant();
> 
> What about moving this invocation into `major_collection_end`? Just like the `record_gc_pause_start_instant` and `major_collection_begin`.

This method should be called at the end of gc-pause to better reflect the actual mutator-running/paused time. OTOH, we also adaptive-resizing using gc-pause-time, so there is a circular dependency. Therefore, I invoke `major_collection_end` before adaptive-resizing as a compromise. This issue is more evident for young-gc, as young-gc is usually much shorter; see the comment next to `size_policy->minor_collection_end`.

> src/hotspot/share/gc/shared/adaptiveSizePolicy.hpp line 179:
> 
>> 177:     _gc_distance_timer.reset();
>> 178:     _gc_distance_timer.start();
>> 179:   }
> 
> The method name `record_gc_pause_end_instant` is about `gc pause`, but the code is about `gc distance`. May we need a clearer name?

In stw-gc, when a gc-pause ends, mutators start running, so the distance btw two consecutive gc-pauses start to tick. This looks quite clear to me, but I am ofc biased. What names do you suggest?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092493546
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092594791
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092589932
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092502996
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092520272
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2092526857