RFR: 8338977: Parallel: Improve heap resizing heuristics
Guoxiong Li
gli at openjdk.org
Sun May 11 19:33:58 UTC 2025
On Fri, 2 May 2025 10:23:25 GMT, Albert Mingkun Yang <ayang at openjdk.org> wrote:
> This patch refines Parallel's sizing strategy to improve overall memory management and performance.
>
> The young generation layout has been reconfigured from the previous `eden-from/to` arrangement to a new `from/to-eden` order. This new layout facilitates young generation resizing, since we perform resizing after a successful young GC when all live objects are located at the beginning of the young generation. Previously, resizing was often inhibited by live objects residing in the middle of the young generation (from-space). The new layout is illustrated in `parallelScavengeHeap.hpp`.
>
> `NumberSeq` is now used to track various runtime metrics, such as minor/major GC pause durations, promoted/survived bytes after a young GC, highest old generation usage, etc. This tracking primarily lives in `AdaptiveSizePolicy` and its subclass `PSAdaptiveSizePolicy`.
>
> GC overhead checking, which was previously entangled with adaptive resizing logic, has been extracted and is now largely encapsulated in `ParallelScavengeHeap::is_gc_overhead_limit_reached`.
>
> ## Performance evaluation
>
> - SPECjvm2008-Compress shows ~8% improvement on Linux/AArch64 and Linux/x64 (restoring the regression reported in [JDK-8332485](https://bugs.openjdk.org/browse/JDK-8332485) and [JDK-8338689](https://bugs.openjdk.org/browse/JDK-8338689)).
> - Fixes the surprising behavior when using a non-default (smaller) value of `GCTimeRatio` with Heapothesys/Hyperalloc, as discussed in [this thread](https://mail.openjdk.org/pipermail/hotspot-gc-dev/2024-November/050146.html).
> - Performance is mostly neutral across other tested benchmarks: **DaCapo**, **SPECjbb2005**, **SPECjbb2015**, **SPECjvm2008**, and **CacheStress**. The number of young-gc sometimes goes up a bit and the total heap-size decreases a bit, because promotion-size-to-old-gen goes down with the more effective eden/survivor-space resizing.
>
> PS: I have opportunistically set the obsolete/expired version to 25/26 for now. I will update them accordingly before merging.
>
> Test: tier1-8
I review about 1/3 code now. But I want to save the thoughts, so I submit it. Sorry for the noise if it bothers you.
src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 343:
> 341: if (is_gc_overhead_limit_reached()) {
> 342: return nullptr;
> 343: }
It seems the parameter `gc_overhead_limit_was_exceeded` and the field `MemAllocator::Allocation::_overhead_limit_exceeded` are not used by all GCs now. Should we keep the parameter and set it as `true` under the condition `is_gc_overhead_limit_reached()`? For example:
if (op.prologue_succeeded()) {
assert(is_in_or_null(op.result()), "result not in heap");
if (is_gc_overhead_limit_reached()) {
*gc_overhead_limit_was_exceeded = true;
return nullptr;
}
return op.result();
}
Or we should remove the parameter and the field in another PR.
src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 825:
> 823: // If MinHeapFreeRatio is at its default value; shrink cautiously. Otherwise, users expect prompt shrinking.
> 824: if (FLAG_IS_DEFAULT(MinHeapFreeRatio) && MinHeapFreeRatio == 0) {
> 825: if (desired_capacity < current_capacity) {
I think curiously a lot about the condition `MinHeapFreeRatio == 0` and then I find the following code in `parallelArguments.cpp`. May it be better to use `UseAdaptiveSizePolicy && FLAG_IS_DEFAULT(MinHeapFreeRatio)` here instead of `FLAG_IS_DEFAULT(MinHeapFreeRatio) && MinHeapFreeRatio == 0`?
if (UseAdaptiveSizePolicy) {
// We don't want to limit adaptive heap sizing's freedom to adjust the heap
// unless the user actually sets these flags.
if (FLAG_IS_DEFAULT(MinHeapFreeRatio)) {
FLAG_SET_DEFAULT(MinHeapFreeRatio, 0);
}
if (FLAG_IS_DEFAULT(MaxHeapFreeRatio)) {
FLAG_SET_DEFAULT(MaxHeapFreeRatio, 100);
}
}
src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 862:
> 860: resize_old_gen_after_full_gc();
> 861: young_gen()->resize_after_full_gc();
> 862: }
The `PSYoungGen` has its methods `resize_after_full_gc` and `resize_after_young_gc`. I think such design is good. What about moving the method `resize_old_gen_after_full_gc` (and the related method `calculate_desired_old_gen_capacity`) to `PSOldGen` and renaming it as `resize_after_full_gc`?
src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 141:
> 139: // Invoked at gc-pause-end
> 140: void gc_epilogue(bool full);
> 141:
It is strange that Parallel GC didn't have its prologue and epilogue before. And currently, the concrete work categories (such as increasing the GC count) of the prologue and epilogue in all the GCs are not unified. It seems an issue left over by history, so it need more investigation in the future.
src/hotspot/share/gc/parallel/psAdaptiveSizePolicy.cpp line 45:
> 43: _avg_promoted(new AdaptivePaddedNoZeroDevAverage(AdaptiveSizePolicyWeight, PromotedPadding)),
> 44: _space_alignment(space_alignment),
> 45: _young_gen_size_increment_supplement(YoungGenerationSizeSupplement) {}
Typos in `gc_globals.hpp`(shown below): `YoungedGenerationSizeIncrement` and `YoungedGenerationSizeSupplement`. It should be fixed in another PR.
product(uint, YoungGenerationSizeIncrement, 20, \
"Adaptive size percentage change in young generation") \
range(0, 100) \
\
product(uint, YoungGenerationSizeSupplement, 80, \
"Supplement to YoungedGenerationSizeIncrement used at startup") \ // <--- here
range(0, 100) \
\
product(uintx, YoungGenerationSizeSupplementDecay, 8, \
"Decay factor to YoungedGenerationSizeSupplement") \ // <--- here
range(1, max_uintx) \
src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1104:
> 1102: heap->post_full_gc_dump(&_gc_timer);
> 1103:
> 1104: size_policy->record_gc_pause_end_instant();
What about moving this invocation into `major_collection_end`? Just like the `record_gc_pause_start_instant` and `major_collection_begin`.
src/hotspot/share/gc/shared/adaptiveSizePolicy.hpp line 179:
> 177: _gc_distance_timer.reset();
> 178: _gc_distance_timer.start();
> 179: }
The method name `record_gc_pause_end_instant` is about `gc pause`, but the code is about `gc distance`. May we need a clearer name?
src/hotspot/share/gc/shared/adaptiveSizePolicy.hpp line 184:
> 182: _gc_distance_timer.stop();
> 183: _gc_distance_seconds_seq.add(_gc_distance_timer.seconds());
> 184: }
The method name `record_gc_pause_start_instant` is about `gc pause`, but the code is about `gc distance`. May we need a clearer name?
-------------
Changes requested by gli (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/25000#pullrequestreview-2831414868
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083540517
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083573645
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083574866
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083578247
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083595212
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083596481
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083582694
PR Review Comment: https://git.openjdk.org/jdk/pull/25000#discussion_r2083581870
More information about the hotspot-dev
mailing list