RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v11]

Thomas Schatzl tschatzl at openjdk.java.net
Fri Mar 26 08:35:31 UTC 2021


On Thu, 25 Mar 2021 12:28:58 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Summary
>> -----------
>> 
>> Improve G1 Full GC by skip compaction for regions with high survival ratio.
>> 
>> Backgroud
>> -----------
>> 
>> There are 4 steps in full gc of G1 GC.
>> - mark live objects
>> - prepare forwardee
>> - adjust pointers
>> - compact
>> 
>> When full gc occurs, there may be very high percentage of live bytes in some regions. For these regions, it's not efficient to compact them and better to skip them, as there are little space to save but many objects to copy.
>> 
>> Description
>> -----------
>> 
>> We enhance the full gc implementation for the above situation through following steps:
>> - accumulate live bytes of every hr in mark phase; (already done by JDK-8263495)
>> - skip adding regions with high survial ratio, and set the region with high survival ratio as pinned in _region_attr_table during prepare phase;
>> - nothing special is done in adjust phase, regions with high survial ratio are skipped because of pin setting in the above step;
>> - nothing special is done in compact phase, regions with high survival ratio are skipped because these regions are skipped when adding regions to compaction set in the prepare phase;
>> 
>> VM options related
>> -----------
>> 
>> - MarkSweepDeadRatio: we reuse this exising vm option to indicate the high survial ratio threhold (100-MarkSweepDeadRatio) in G1.
>>      - default value of MarkSweepDeadRatio: 5 
>> 
>> Test
>> -----------
>> 
>> - specjbb2015: no regression
>> - dacapo: (Attachment is the dacapo h2 full gc pause.)
>>      - 95% of full gc pauses: 10%-19% improvement. 
>>      - 5% of full gc pauses: 1.2% improvement.
>>      - 0.1% of full gc pauses: -6.16% improvement.
>> 
>> $ java -Xmx1g -Xms1g -XX:ParallelGCThreads=4 -Xlog:gc*=info:file=gc.log -jar dacapo-9.12-bach.jar --iterations 5 --size huge --no-pre-iteration-gc h2
>
> Hamlin Li has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
> 
>  - Merge branch 'master' into g1-full-gc-optimization-00
>  - reuse the "pin" mechanism (G1FullCollector._region_attr_table) to skip region compaction;
>    deal with last-ditch full gc;
>  - Merge branch 'master' into g1-full-gc-optimization-00
>  - fix bot crash.
>  - fix crash in G1CalculatePointersClosure::prepare_for_skipping_compaction when klass of dead objects is unloaded;
>    other misc improvements.
>  - reuse vm option MarkSweepDeadRatio; reuse G1RegionMarkStatsCache class; fix regression in Mark phase by inlining live words collection into mark_object()
>  - JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio

Looks pretty good actually.

Looks pretty good actually.

src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 502:

> 500:   // - if clear_all_soft_refs is true, all soft references should be
> 501:   //   cleared during the GC.
> 502:   // - if do_maximal_compaction is true, full gc will do maximal compaction.

..., full gc will do a maximally compacting collection, leaving no dead wood.

src/hotspot/share/gc/g1/g1FullCollector.cpp line 233:

> 231: void G1FullCollector::update_attribute_table(HeapRegion* hr, bool force_pinned) {
> 232:   if (force_pinned) {
> 233:     // Pin high live ratio region

I think the comment is too specific as the argument name is `force_pinned` and not `is_high_live_ratio_region`. Please remove.

src/hotspot/share/gc/g1/g1FullCollector.cpp line 237:

> 235:     return;
> 236:   }
> 237: 

Since the `if ... else if ...` cascade has already been started below, please continue with it.

src/hotspot/share/gc/g1/g1FullCollector.cpp line 244:

> 242:   } else {
> 243:     // Update _region_attr_table after free pinned regions,
> 244:     // as the region can not be accessed in G1ResetPinnedClosure.

Not sure what the comment wants to indicate. What does "after free pinned regions" mean here?
If we need a comment I think something like "Everything else is processed normally." should be fine.
But maybe this comment wants to make aware of something special, it's just I do not understand what it would mean.

src/hotspot/share/gc/g1/g1FullCollector.hpp line 72:

> 70:   ReferenceProcessorIsAliveMutator _is_alive_mutator;
> 71:   G1RegionMarkStats*        _live_stats;
> 72:   static uint calc_active_workers();

Please keep the newline between members and methods.

src/hotspot/share/gc/g1/g1FullCollector.hpp line 100:

> 98:   G1CMBitMap*              mark_bitmap();
> 99:   ReferenceProcessor*      reference_processor();
> 100:   size_t                   hr_live_words(uint hr_index) { return _live_stats[hr_index]._live_words; }

I think just `live_words` for the method name is fine, and `region_index` is the commonly used name for region indexes if you look at other code.
(Please also fix elsewhere; and in the last patch some `hr_index` also crept in into `g1RegionMarkStatsCache.cpp`).

src/hotspot/share/gc/g1/g1FullGCCompactTask.cpp line 48:

> 46:     // In the prepare phase, we "pin" the regions with high survival ratio
> 47:     // by _region_attr_table, so here we use _region_attr_table rather than
> 48:     // HeapRegion itself to tell whether a region is pinned.

This comment is going to be completely incomprehensible after a week. While it points out a particular case of pinned regions, this comment is misplaced and immediately begs the question about other cases.

Maybe in the `G1FullGCHeapRegionAttr` class documentation, point out that in particular, that data structure collects whether a region should be considered pinned during full gc (only), and that there are two reasons a region is pinned (and excluded from compaction):
- the HeapRegion itself has been pinned at the start of Full GC
- the occupancy of the region is too high to be considered eligible for compaction

src/hotspot/share/gc/g1/g1FullGCCompactTask.cpp line 49:

> 47:     // by _region_attr_table, so here we use _region_attr_table rather than
> 48:     // HeapRegion itself to tell whether a region is pinned.
> 49:     if (!_collector->is_in_pinned_or_closed(hr_index)) {

The original code uses only `r->is_pinned()`, not `r->is_pinned_or_closed()`.
This predicate is wrong, `closed` regions may not be pinned (but can be, and in most cases are).

src/hotspot/share/gc/g1/g1FullGCScope.hpp line 56:

> 54:   G1MonitoringScope       _monitoring_scope;
> 55:   G1HeapTransition        _heap_transition;
> 56:   size_t                  _hr_live_words_threshold;

Please adapt according to the new name of the getter.

src/hotspot/share/gc/g1/g1_globals.hpp line 306:

> 304:           "disables this check.")                                           \
> 305:           range(0.0, (double)max_uintx)
> 306: // end of GC_G1_FLAGS

That newline removal seems unnecessary.

src/hotspot/share/gc/g1/heapRegion.inline.hpp line 201:

> 199: }
> 200: 
> 201: inline void HeapRegion::reset_pinned_after_full_gc() {

Maybe rename the method since it now does not only apply to "pinned" regions, but any region that is not compacted.
E.g. `reset_not_compacted_after_full_gc()`.

-------------

Marked as reviewed by tschatzl (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2760Changes requested by tschatzl (Reviewer).



More information about the hotspot-gc-dev mailing list