RFR: JDK-8262068: Improve G1 Full GC by skipping compaction for regions with high survival ratio [v12]

Hamlin Li mli at openjdk.java.net
Tue Mar 30 12:46:22 UTC 2021


On Tue, 30 Mar 2021 09:58:29 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

>>> Thanks Thomas, sure, I will hold off until your tests have finished.
>>> At the same time, we are also running perf tests to confirm that this final version performs well.
>> 
>> gc/g1/TestEagerReclaimHumongousRegionsClearMarkBits.java fails with an unfamiliar error every few runs with this change:
>> 
>>     [26.103s][info][gc,start] GC(112) Pause Full (G1 Evacuation Pause)
>>     # To suppress the following error report, specify this argument
>>     # after -XX: or in .hotspotrc:  SuppressErrorAt=/g1BlockOffsetTable.cpp:358
>>     #
>>     # A fatal error has been detected by the Java Runtime Environment:
>>     #
>>     #  Internal Error (.../src/hotspot/share/gc/g1/g1BlockOffsetTable.cpp:358), pid=74345, tid=74370
>>     #  guarantee(backskip <= max_backskip) failed: Going backwards beyond the start_card. start_card: 225280 current_card: 225281 backskip: 256
>> 
>> There is a strong likelihood that this is a pre-existing issue and not directly caused by this change. I will need to investigate this.
>
> The reason for this crash is that if there is a young region that is not compacted (because it is mostly full), its BOT (block offset table) is not updated to the extent the verification expects.
> 
> That is, the verification expects the BOT of old gen regions to be completely valid, from start to end of the region. For young regions this is not the case: their BOT is not updated at all (which is normal) and just contains a marker indicating that there is no BOT; compaction would take care of this.
> 
> The verification does not respect this "after this point the BOT is invalid" marker. There is actually some code in `heapRegion.cpp:724` that intentionally skips young region verification for this purpose. Now that such regions can end up as old regions without being compacted, this no longer works. I think the correct fix is to not try to verify beyond this marker, i.e. change the assignment to `end_card` in `G1BlockOffsetTablePart::verify` accordingly.
> 
> Another solution would be to make the region appear as if it were filled with a single large allocation (TLAB).
> 
> This is purely to keep the verification happy. All other code correctly handles a partially valid BOT (i.e. valid up to and including that mentioned marker).
> 
> I will try out these options.

Hi Thomas, thank you so much for helping investigate the issue. I will look into it tomorrow too.
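
For anyone following along, here is a toy sketch of the first option mentioned above (do not verify beyond the point where the BOT stops being valid). It is not HotSpot code and does not use the real BOT encoding: `ToyBot`, `verify_range`, `verify_whole_region`, `verify_valid_part` and `make_uncompacted_region` are all hypothetical names, and the table is simplified to one plain "cards to step back" value per card. It only illustrates why a start-to-end check trips the `backskip <= max_backskip` guarantee on a region that was not compacted, while a check limited to the valid part does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only, NOT HotSpot code. Every name here is hypothetical.
// Each entry says how many cards to step back to find the start of the
// block covering that card; 0 means a block starts within the card itself.
struct ToyBot {
    std::vector<std::size_t> backskip; // one entry per card in the region
    std::size_t valid_cards;           // entries [0, valid_cards) are maintained
};

// Checks entries [0, end_card): stepping back must never leave the region,
// mirroring the "backskip <= max_backskip" guarantee from the crash log.
inline bool verify_range(const ToyBot& bot, std::size_t end_card) {
    for (std::size_t card = 0; card < end_card; ++card) {
        const std::size_t max_backskip = card; // distance to the first card
        if (bot.backskip[card] > max_backskip) {
            return false;
        }
    }
    return true;
}

// Behaviour as described in the quoted mail: verify start to end of region.
inline bool verify_whole_region(const ToyBot& bot) {
    return verify_range(bot, bot.backskip.size());
}

// Proposed direction: stop where the table stops being maintained.
inline bool verify_valid_part(const ToyBot& bot) {
    return verify_range(bot, bot.valid_cards);
}

// A formerly young region that was not compacted: only the first entry is
// meaningful, the rest hold a stale value (256, as in the log above).
inline ToyBot make_uncompacted_region() {
    return ToyBot{{0, 256, 256, 256}, 1};
}
```

In this model, `verify_whole_region(make_uncompacted_region())` fails at the second card, whereas `verify_valid_part` on the same table passes. The second option (making the region look like a single large allocation) would instead rewrite the stale entries so that the whole-region check passes.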

-------------

PR: https://git.openjdk.java.net/jdk/pull/2760



More information about the hotspot-gc-dev mailing list