RFR: 8351464: Shenandoah: Hang on ShenandoahController::handle_alloc_failure when run test TestAllocHumongousFragment#generational [v4]
William Kemper
wkemper at openjdk.org
Wed Mar 12 16:42:59 UTC 2025
On Wed, 12 Mar 2025 00:24:46 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:
>> William Kemper has updated the pull request incrementally with one additional commit since the last revision:
>>
>> The non-generational modes may also fail to notify waiters
>
> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 171:
>
>> 169:
>> 170: // If this cycle completed without being cancelled, notify waiters about it
>> 171: if (!heap->cancelled_gc()) {
>
> I feel we should remove the test `!heap->cancelled_gc()` here, if is fine if there is single mutator thread, but in most cases there are mutator threads, then the following case could happen:
> 1. **Mutator A** try to cancel GC and notify control thread, it will wait with `_alloc_failure_waiters_lock`, `_cancelled_cause` is set to `_allocation_failure`
> 2. Concurrent GC clear `_cancelled_cause` and set it to `_no_gc` in op_final_update_refs
> 3. **Mutator B** try to cancel GC and successfully set `_cancelled_cause` to `_allocation_failure` again.
> 4. Concurrent GC finishes.
> 5. Control thread check `!heap->cancelled_gc()` which is false, and won't wake up mutators.
>
> In this case, it will delay the wake up for mutator A & B to next cycle.
I believe that is the correct behavior. The mutators are waiting until there is memory available. If mutator B cannot allocate, there is no reason to believe mutator A would be able to allocate. In this case, it is fine for both mutators to wait (even if it means A has to wait an extra cycle).
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23997#discussion_r1991893985
More information about the shenandoah-dev
mailing list