RFR: 8342848: Shenandoah: Marking bitmap may not be completely cleared in generational mode [v4]
duke
duke at openjdk.org
Wed Oct 23 21:34:32 UTC 2024
On Wed, 23 Oct 2024 20:28:34 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
>> While investigating the crash I saw in PR https://github.com/openjdk/shenandoah/pull/516, I happened to reproduce the crash on the GenShen tip as well.
>>
>> The crash was reproduced multiple times on both AWS r7g-4xlarge and r7i-4xlarge instances by running the test below repeatedly:
>>
>> ```
>> CONF=linux-aarch64-server-fastdebug make clean test TEST=gc/stress/gcold/TestGCOldWithShenandoah.java#generational JTREG="REPEAT_COUNT=1000"
>> ```
>>
>> Crash:
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (/home/xlpeng/repos/jdk-xlpeng/src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp:642), pid=24134, tid=24158
>> # assert(_generation->is_bitmap_clear()) failed: need clear marking bitmap
>> #
>> # JRE version: OpenJDK Runtime Environment (24.0) (fastdebug build 24-internal-adhoc.xlpeng.jdk-xlpeng)
>> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 24-internal-adhoc.xlpeng.jdk-xlpeng, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, shenandoah gc, linux-aarch64)
>> # Problematic frame:
>> # V [libjvm.so+0x15eadc4] ShenandoahConcurrentGC::op_init_mark()+0x358
>> #
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /local/home/xlpeng/repos/jdk-xlpeng/build/linux-aarch64-server-fastdebug/test-support/jtreg_test_hotspot_jtreg_gc_stress_gcold_TestGCOldWithShenandoah_java_generational/scratch/0/hs_err_pid24134.log
>> #
>> # If you would like to submit a bug report, please visit:
>> # https://bugreport.java.com/bugreport/crash.jsp
>> #
>>
>>
>> With logging/instrumentation, it seems to be caused by this one line of code: `bool needs_reset = _generation->contains(region) || !region->is_affiliated();`. Bitmap reset is a concurrent operation, so a mutator thread may change a region's affiliation from FREE to YOUNG while the reset is running. If the affiliation is FREE when `_generation->contains(region)` is evaluated and the mutator updates it before `!region->is_affiliated()` is evaluated, both tests can be false, and the region's bitmap is not reset.
>>
>> Logs from instrumentation:
>>
>> [32.793s][info][gc ] GC(19) Not reseting bitmap for YOUNG region (0x0000ffff8c1a6100)(affiliation before test: FREE)
>>
>> ...
>>
>> [32.807s][info][gc,task ] GC(20) Using 8 of 8 workers for init marking
>> [32.808s][info][gc ] GC(20) Region (0x0000ffff8c1a6100) doesn't have clear bitmap, [1, 1, 1]
> ...
>
> Xiaolong Peng has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix build error
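
The race described in the quoted analysis can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not HotSpot code and not necessarily the change made in the PR: `Region`, `Generation`, `needs_reset_two_reads`, `needs_reset_one_read`, and the "contains means same affiliation" rule are all hypothetical stand-ins, and the mutator is simulated deterministically by a callback that runs between the two reads.

```cpp
#include <atomic>
#include <cassert>

enum class Affiliation { FREE, YOUNG, OLD };

// Hypothetical stand-in for a heap region: only the affiliation matters here.
struct Region {
  std::atomic<Affiliation> affiliation{Affiliation::FREE};
};

// Hypothetical stand-in for a generation. Assumption: a generation
// "contains" a region only when the affiliations match.
struct Generation {
  Affiliation type;
  bool contains(Affiliation a) const { return a == type; }
};

// Same shape as the predicate in the quoted line: the affiliation is loaded
// twice. `between_reads` simulates a mutator running between the two loads.
template <typename Hook>
bool needs_reset_two_reads(const Generation& g, Region& r, Hook between_reads) {
  bool contained = g.contains(r.affiliation.load());
  between_reads();
  bool unaffiliated = (r.affiliation.load() == Affiliation::FREE);
  return contained || unaffiliated;
}

// One possible way to close the window: load the affiliation once and
// evaluate both conditions against the same snapshot.
bool needs_reset_one_read(const Generation& g, const Region& r) {
  Affiliation snapshot = r.affiliation.load();
  return g.contains(snapshot) || snapshot == Affiliation::FREE;
}

// Replays the interleaving from the analysis: the region is FREE at the
// first read and YOUNG at the second. Returns true if the reset is skipped.
bool demo_lost_reset() {
  Generation gen{Affiliation::OLD};  // assumption for the demo
  Region region;                     // starts FREE
  bool reset = needs_reset_two_reads(gen, region, [&region] {
    region.affiliation.store(Affiliation::YOUNG);
  });
  return !reset;
}
```

In this model `demo_lost_reset()` reports that the two-read predicate skips the region, while `needs_reset_one_read` on a FREE region still requests a reset, because both conditions see the same value.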
@pengxiaolong
Your change (at version 302bf4f021ef5d5b0aa39009168aecf3a2be9f2d) is now ready to be sponsored by a Committer.
-------------
PR Comment: https://git.openjdk.org/shenandoah/pull/523#issuecomment-2433500910