RFR: 8357445: G1: Time-Based Heap Uncommit During Idle Periods [v10]
Thomas Schatzl
tschatzl at openjdk.org
Thu Nov 20 15:08:56 UTC 2025
On Fri, 17 Oct 2025 05:19:03 GMT, Monica Beckwith <mbeckwit at openjdk.org> wrote:
>> **Implements:** https://bugs.openjdk.org/browse/JDK-8357445
>>
>> Implement time-based heap uncommit for G1 during idle periods.
>>
>> Key changes:
>> - Added G1HeapEvaluationTask for periodic heap evaluation
>> - Switch from G1ServiceTask to PeriodicTask for improved scheduling
>> - Implemented time-based heap sizing policy with configurable uncommit delay
>> - Added region activity tracking with last access timestamps
>> - Integrated VM_G1ShrinkHeap operation for safe heap shrinking
>> - Added new G1 flags: G1UseTimeBasedHeapSizing, G1TimeBasedEvaluationIntervalMillis, G1UncommitDelayMillis, G1MinRegionsToUncommit
>> - Added 'sizing' log tag for heap sizing operations
>>
>> Comprehensive Test Coverage:
>> - Enhanced TestG1RegionUncommit: minimum heap boundaries, concurrent allocation/uncommit scenarios
>> - Enhanced TestTimeBasedHeapSizing: humongous object handling, rapid allocation cycles, edge cases
>> - Enhanced TestTimeBasedRegionTracking: concurrent region access, lifecycle transition validation
>> - Enhanced TestTimeBasedHeapConfig: parameter boundary values, small heap configurations
>>
>> This ensures time-based heap uncommit works correctly while maintaining all safety guarantees and test expectations.
>
> Monica Beckwith has updated the pull request incrementally with two additional commits since the last revision:
>
> - Fix compilation errors after master merge
>
> - Fix syntax error in g1CollectedHeap.cpp (extra closing brace)
> - Update API call from short_term_pause_time_ratio() to short_term_gc_time_ratio()
>
> These fixes resolve compilation issues that occurred after merging with
> upstream master due to API changes in the OpenJDK codebase.
> - 8357445: Address feedback for G1 time-based heap sizing
>
> - Fix indentation in log_debug statement in shrink_helper (suggested by @tschatzl)
> - Change terminology from 'inactive' to 'idle' throughout time-based heap sizing
> - Update flag descriptions in g1_globals.hpp to use 'idle' terminology
> - Fix remaining trailing whitespace in test files
>
> Addresses all outstanding review comments from @tschatzl
I stopped reviewing after >40 issues (still finding some while writing) because
- some of them were unresolved from last time. Please resolve comments that have been addressed or comment why the change did not address it. Having old unresolved comments are annoying for re-reviews because they make the reviewer unsure if there is some need for further clarification or not.
- new code introduced the same issues again (e.g. use of `is_empty()`) without comment on why it can be used, after giving the reason why it can't be used.
- the main issue that the time-based uncommit still interferes with gc based sizing without restriction still applies.
- another is that the VM operation still does not re-evaluate the decisions, just uncommits a set number of regions. Possibly not even those that were found as candidates earlier afaict.
- the only improvement I remember has been that `ServiceTask` is used now
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 830:
> 828:
> 829: // Note: Region timestamps are updated automatically when regions transition to free state
> 830: // via set_free() calls, so no blanket reset is needed here
Comment sentences must have a full stop at the end. (I am only mentioning this here, from a brief look all of them miss them). See e.g. the comment right below.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1122:
> 1120: // We should only reach here from the service thread during idle time
> 1121: // but ensure any GC alloc regions are abandoned
> 1122: _allocator->abandon_gc_alloc_regions();
Maybe check that they are abandoned (i.e. there are no gc alloc regions) in an assert instead of doing work.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1175:
> 1173: // This preserves the original GC-triggered shrinking behavior
> 1174: log_debug(gc, ergo, heap)("Heap shrink requested: removing %u regions (%zuB)",
> 1175: num_regions_to_remove, shrink_bytes);
This log message duplicates the one just a few lines below. Also I would prefer to keep `Heap Resize` as "topic" of the message, and in the message explain that we shrank the heap.
I need to look at the log messages we generate as a whole, there seems to be some more repetition (one in `G1CollectedHeap::shrink()` already reads :
> "Heap resize. Requested shrink amount: %zuB aligned shrink amount: %zuB",
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1187:
> 1185: log_debug(gc, heap)("Heap shrink details: requested=%zuB actual=%zuB "
> 1186: "regions_removed=%u heap_capacity=%zuB",
> 1187: shrink_bytes, shrunk_bytes, num_regions_removed, capacity());
It would be nice to not repeat the same information at different levels; one can check the current level with `LogTarget::is_enabled()`, and make the one at the lower level a superset of the other.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1190:
> 1188: policy()->record_new_heap_size(num_committed_regions());
> 1189: } else {
> 1190: log_debug(gc, ergo, heap)("Did not shrink the heap (heap shrinking operation failed)");
Should probably have a "Heap Resize." tag in front. And later maybe fix all this by using the existing `resizing` tag.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1239:
> 1237: // Always schedule a VM operation for proper synchronization with GC
> 1238: // The VM operation will re-evaluate which regions to uncommit at the time of execution
> 1239: VM_G1ShrinkHeap op(this, shrink_bytes);
The VM operation *must* re-evaluate the decisions. There is no need to pass on this value. A GC might have been scheduled before this, and just blindly shrinking will destroy existing resizing policies.
Also, the VM operation must be aware of shutdown etc; it is probably best if `VM_ShrinkHeap` inherits from `VM_GC_Operation` which does most of this stuff already.
It should be a matter of amending `doit_prologue` that redoes the calculation to determine the actual regions to uncommit (and if they happen to be zero at that point after all, abort the VM operation by returning the appropriate return value there.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1241:
> 1239: VM_G1ShrinkHeap op(this, shrink_bytes);
> 1240: VMThread::execute(&op);
> 1241: return true; // Pages were requested to be released.
The return value is never used. Remove.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1352:
> 1350: _region_attr() {
> 1351:
> 1352: _heap_evaluation_task = nullptr;
Move into constructor initializer list.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1618:
> 1616: _heap_evaluation_task = new G1HeapEvaluationTask(this, _heap_sizing_policy);
> 1617: _service_thread->register_task(_heap_evaluation_task);
> 1618: log_debug(gc, init)("G1 Time-Based Heap Evaluation task registered and scheduled");
Unnecessary log message. The `gc, task` log message when registering contains the same information.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 1622:
> 1620: assert(_heap_evaluation_task == nullptr, "pre-condition");
> 1621: }
> 1622:
I think this one is unnecessary. The variable has just been initialized to `nullptr` in the same method.
src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 2723:
> 2721: // Note: Region timestamps are updated automatically when regions transition to free state
> 2722: // via set_free() calls, so no blanket reset is needed here
> 2723:
The change _must_ reset the timestamp otherwise it will significantly interfere with the gctimeratio/AHS based resizing. This time-based uncommit is a helper when the other does not kick in because of too infrequent gcs.
E.g. consider the case when/if a GC is scheduled (put into the VM operation queue) just before the GC. GC happens, shrinks the heap, then the shrink VM operation occurs, shrinking again by an amount determined before that GC.
src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 1012:
> 1010: // Deactivate a specific region by index.
> 1011: void deactivate_region_at(uint region_index) { _hrm.shrink_at(region_index, 1); }
> 1012:
Unused. Remove.
src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 46:
> 44: log_debug(gc, sizing)("Starting uncommit evaluation");
> 45:
> 46: ResourceMark rm; // Ensure temporary resources are released
Which ones are those?
src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 55:
> 53: SuspendibleThreadSetJoiner sts;
> 54: resize_amount = _heap_sizing_policy->evaluate_heap_resize_for_uncommit();
> 55: }
if a GC happened just before, the evaluation should be aborted. As commented in `request_heap_shrink()` method, it must re-evaluate the decision anyway, and/or maybe even abort the VM operation in the VM operation prologue if a GC has occurred somewhere inbetween.
Basically the code needs to take the `Heap_lock` to get current GC counts, passing it to the VM operation that evaluates whether it should actually run in the prologue then. I.e. something like
{
MutexLocker ml(Heap_lock);
resize_amount = ...
gc_count_before = ...
}
[...]
if (resize_amount > 0) {
VM_Shrink_Heap op(..., gc_count_before, ...);
}
src/hotspot/share/gc/g1/g1HeapEvaluationTask.cpp line 60:
> 58:
> 59: if (resize_amount > 0) {
> 60: log_info(gc, sizing)("Uncommit evaluation: shrinking heap by %zuMB using time-based selection", resize_amount / M);
At least at info level, more context is needed.
src/hotspot/share/gc/g1/g1HeapRegion.cpp line 123:
> 121: void G1HeapRegion::hr_clear(bool clear_space) {
> 122: set_top(bottom());
> 123: record_activity(); // Update timestamp when region becomes available
`hr_clear` unconditionally calls `set_free`, so this call is superfluous.
src/hotspot/share/gc/g1/g1HeapRegion.cpp line 160:
> 158: if (!is_free()) {
> 159: report_region_type_change(G1HeapRegionTraceType::Free);
> 160: record_activity(); // Record timestamp when region becomes free
Make this unconditional. At startup, the default value of the region's type is `Free`, so this will not show up if removing the call to `hr_clear()`.
src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 318:
> 316: }
> 317:
> 318:
Unnecessary newline.
src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 647:
> 645: }
> 646: }
> 647: }
Please do not re-implement existing functionality like heap region iteration. Use the `G1CollectedHeap::heap_region_iterate()` API for that which does all but the `is_free()` check for you anyway.
Also the `current_time` can/should be taken once at the beginning of the iteration.
src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 656:
> 654: // Sort regions by access time (oldest first) using simple bubble sort
> 655: // This is fine since the number of empty regions is typically small
> 656: int n = empty_regions.length();
While I do not mind the use of bubble sort (if appropriate), if performance does not matter please consider making maintenance easiest, so use the `qsort` library function.
That makes the code much more compact and reviewers do not need to review the implementation for errors. (I did not even bother reviewing it).
src/hotspot/share/gc/g1/g1HeapRegionManager.cpp line 686:
> 684: shrink_at(region_index, 1);
> 685: removed++;
> 686: }
I think I need to see this policy in action, blindly removing all too-old regions seems to be too severe an action.
These free regions are also needed for the young gen + old gen, which has been sized appropriately. It should probably at least take them into account.
src/hotspot/share/gc/g1/g1HeapRegionManager.hpp line 293:
> 291: // Used after GC operations or shrink operations that may affect region state
> 292:
> 293:
Leftover comment?
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 463:
> 461:
> 462: virtual bool do_heap_region(G1HeapRegion* r) {
> 463: if (r->is_empty() && _policy->should_uncommit_region(r)) {
Another wrong `is_empty()` here.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 499:
> 497: (*_inactive_regions)++;
> 498: // Stop early if we have enough candidates
> 499: if ((uint)_candidates->length() >= _max_candidates) {
(Fwiw, _max_candidates is always _candidates->capacity() or so)
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 535:
> 533:
> 534: // Only count if region is ready for shrinking
> 535: if (hr->is_empty() && hr->is_free()) {
`is_empty()` is wrong, already mentioned this in the last review.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 584:
> 582:
> 583: // Back off during allocation pressure - only evaluate when truly idle
> 584: if (_analytics != nullptr) {
There must always be an `_analytics` instance. Potentially there could be an assert in the constructor.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 586:
> 584: if (_analytics != nullptr) {
> 585: double gc_time_ratio = _analytics->short_term_gc_time_ratio();
> 586: if (gc_time_ratio > 0.05) { // 5% GC time still indicates pressure
gc time ratio is no indicator for idleness. Also the default is 4% right now (ie. which G1 aims for), this value would be higher than the goal. Also the time ratio is user-settable - some users may accept a higher gc usage than others.
I understand the need for this change to kind of see if there is "pressure" - however this would be implicit by gcs occurring inbetween (i.e. if the `ServiceTask` did not occur/did not find need for uncommitting regions between GCs due to time stamps too high, then there is nothing to do anyway).
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 596:
> 594: MutexLocker ml(Heap_lock);
> 595:
> 596: ResourceMark rm; // Ensure GrowableArray resources are properly released
I can't find a use of `GrowableArray` in this particular method. It may be used in a callee, but then put it there.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 598:
> 596: ResourceMark rm; // Ensure GrowableArray resources are properly released
> 597:
> 598: // Count regions eligible for uncommit (don't store them - VM operation will re-evaluate)
No, the VM operation still does not re-evaluate the decision in this code. It just directly uncommits a number of random regions.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 600:
> 598: // Count regions eligible for uncommit (don't store them - VM operation will re-evaluate)
> 599: uint idle_count = count_uncommit_candidates();
> 600: uint total_regions = _g1h->max_num_regions();
(This) `total_regions` is unused in this code (my IDE says so).
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 608:
> 606: if (idle_count >= G1MinRegionsToUncommit) {
> 607: size_t region_size = G1HeapRegion::GrainBytes;
> 608: size_t current_heap = _g1h->capacity();
Please make this name more specific, i.e. sth like `current_capacity`.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 618:
> 616: current_heap, min_heap, region_size, max_shrink_bytes, InitialHeapSize);
> 617:
> 618: if (max_shrink_bytes > 0 && region_size > 0) {
`region_size` can never be 0.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 631:
> 629:
> 630: // Total regions we must keep available = young gen + G1's standard reserve
> 631: size_t reserved_regions = young_gen_regions + g1_reserve_regions;
`G1ReservePercent` should only ever by accounted from the heap capacity. This just adds them together, need to only take into account when getting close to the maximum capacity.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 641:
> 639: // This prevents expensive re-commits during the next GC or allocation burst
> 640: // We add G1MinRegionsToUncommit as a small safety buffer beyond G1's standard reserves
> 641: size_t min_regions_after_uncommit = reserved_regions + G1MinRegionsToUncommit;
The comment would rather indicate a `MAX2` use.... the name of the variable also indicates that this is a minimum regions one would want to uncommit, not an additional buffer.
src/hotspot/share/gc/g1/g1HeapSizingPolicy.cpp line 654:
> 652: // Limited by G1MinRegionsToUncommit to avoid thrashing
> 653: size_t available_for_uncommit = idle_count;
> 654: if (available_for_uncommit < G1MinRegionsToUncommit) {
Line 606 has this condition to enter here:
if (idle_count >= G1MinRegionsToUncommit) {
and now the code checks if `idle_count is < G1MinRegionsToUncommit)`...
-------------
Changes requested by tschatzl (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/26240#pullrequestreview-3487515863
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545821604
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545800333
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545884474
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545883836
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545908305
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546084792
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546005896
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546032894
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546029717
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546035489
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546040077
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546044801
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545797272
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546112119
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545940328
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546114664
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546120282
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546147069
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546152175
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2545812518
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546164391
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546165826
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546349010
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546418145
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546363576
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546205486
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546255263
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546263408
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546438358
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546264762
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546288288
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546268711
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546276900
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546279093
PR Review Comment: https://git.openjdk.org/jdk/pull/26240#discussion_r2546329255
More information about the hotspot-dev
mailing list