RFR: 8373100: Genshen: Control thread can miss allocation failure notification [v2]

William Kemper wkemper at openjdk.org
Thu Dec 11 16:05:51 UTC 2025


On Thu, 11 Dec 2025 00:12:35 GMT, Xiaolong Peng <xpeng at openjdk.org> wrote:

>> I was looking at the places where `ShenandoahHeap::clear_cancelled_gc` is called, and I feel the problem more likely comes from op_final_update_refs:
>> 
>> 
>> void ShenandoahConcurrentGC::op_final_update_refs() {
>>   ShenandoahHeap* const heap = ShenandoahHeap::heap();
>>    ... 
>>   ...
>>   // Clear cancelled GC, if set. On cancellation path, the block before would handle
>>   // everything.
>>   if (heap->cancelled_gc()) {
>>     heap->clear_cancelled_gc();
>>   }
>>   ...
>>   ...
>> }
>> 
>> 
>> Let's say a concurrent GC is running and, right before the final update refs safepoint, a mutator allocation fails:
>> 1. The mutator tries to cancel the concurrent GC and notify the controller thread.
>> 2. The mutator blocks itself on `_alloc_failure_waiters_lock` and is treated as safepoint safe as well.
>> 3. The concurrent GC enters final update refs (a VM operation).
>> 4. In final update refs, the VMThread sees cancelled_gc and clears it.
>> 5. The concurrent GC finishes, but cancelled_gc has been cleared, so it won't notify the mutator.
>> 
>> The fix seems to work in generational mode, but may not work in non-generational mode.
>
> While I was staring at the code in ShenandoahController::handle_alloc_failure today, I found a discrepancy between ShenandoahGenerationalControlThread and ShenandoahControlThread. I created a [bug](https://bugs.openjdk.org/browse/JDK-8373468) to unify the behavior; we could fix the issue in ShenandoahControlThread there.

The scenario I described wasn't supposition; it is what actually happened in the debugger. The scenario you describe with `op_final_update_refs` would also be fixed by this PR. The `_requested_gc_cause` field should always be accessed under a lock. The code change here fixes an issue where an allocation failure might not set `_requested_gc_cause` at all.
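
To illustrate the general idea of keeping the requested cause under a lock, here is a minimal standalone sketch. It is not the HotSpot code and not the change in this PR: `ControlModel`, `GCCause`, `handle_alloc_failure`, `wait_for_request`, and `demo_late_waiter` are hypothetical names, and the standard library mutex and condition variable stand in for the VM's own locking primitives. The point it shows is that when the failure is recorded under the same lock the waiter uses, a notification issued while nobody is waiting is not lost.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the GC cause values; not the HotSpot enum.
enum class GCCause { no_gc, allocation_failure };

// Hypothetical model of a control thread's request state.
class ControlModel {
  std::mutex lock_;
  std::condition_variable cv_;
  GCCause requested_ = GCCause::no_gc;  // only read or written with lock_ held

public:
  // Mutator side: always record the failure under the lock, then notify.
  void handle_alloc_failure() {
    {
      std::lock_guard<std::mutex> guard(lock_);
      requested_ = GCCause::allocation_failure;
    }
    cv_.notify_all();
  }

  // True if a request is pending; checked under the same lock.
  bool has_request() {
    std::lock_guard<std::mutex> guard(lock_);
    return requested_ != GCCause::no_gc;
  }

  // Control side: wait on the recorded state, not on the notification itself,
  // then consume the request under the same lock.
  GCCause wait_for_request() {
    std::unique_lock<std::mutex> guard(lock_);
    cv_.wait(guard, [this] { return requested_ != GCCause::no_gc; });
    GCCause cause = requested_;
    requested_ = GCCause::no_gc;
    return cause;
  }
};

// The failure is reported before anyone waits. Because the cause was stored
// under the lock, the later wait returns immediately with that cause.
GCCause demo_late_waiter() {
  ControlModel model;
  model.handle_alloc_failure();  // notify fires with no waiter present
  assert(model.has_request());   // the request is still recorded
  return model.wait_for_request();
}
```

In this sketch, calling `demo_late_waiter()` returns `GCCause::allocation_failure` even though the notify happened before the wait, because the waiter checks the recorded state rather than relying on the wakeup alone.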

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28665#discussion_r2611167200

