RFR: 8367646: [GenShen] Control thread may overwrite gc cancellation cause set by mutator

Y. Srinivas Ramakrishna ysr at openjdk.org
Thu Oct 9 02:13:09 UTC 2025


On Mon, 6 Oct 2025 22:00:01 GMT, William Kemper <wkemper at openjdk.org> wrote:

> I believe the following events could lead to this assertion failure:
> 1. Control thread reads the heap's gc cancellation cause as `shenandoah_concurrent_gc` 
> 2. Mutator thread has an allocation failure and sets the heap's gc cancellation cause to `shenandoah_alloc_failure`
> 3. Control thread uses stale value from `1` and decides to unconditionally clear the cancellation cause
> 4. Mutator thread assert that gc is still cancelled
> 
> The proposed fix here has the control thread use a CAS operation to only clear the gc if the existing value is `shenandoah_concurrent_gc`. This will stop the control thread from erroneously changing the value if a mutator has already set it to `shenandoah_alloc_failure`. A mutator thread may still have an allocation failure after the control thread has cleared the cancellation, but this is normal and expected.

Left some comments that may simplify `clear_cancelled_gc()` slightly (improving comprehension complexity a bit).

I see that there is a longish comment about the evac oom handler. It might be worthwhile where the counters are cleared to explain when we are safe to clear the counnters. (May be it's explained somewhere in the block comment in `shenandoahEvacOOMHandler.hpp` class definition but it wasn't obvious when the clearing is safe.)

src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 116:

> 114:     if (request.cause == GCCause::_shenandoah_concurrent_gc) {
> 115:       request.generation = _heap->young_generation();
> 116:       _heap->clear_cancelled_gc(false);

It appears as if this was the _only_ case where `clear_cancelled_gc()` was called while not clearing the oom_handler.

Should `clear_cancelled_gc()` be modified by removing the option/argument for clearing or not the oom_handler?

src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp line 459:

> 457:   inline GCCause::Cause cancelled_cause() const;
> 458: 
> 459:   // Clears the cancellation cause and optionally resets the oom handler (cancelling an

See comment above re "optional reset of oom handler" etc. related simplication.

-------------

Marked as reviewed by ysr (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/27662#pullrequestreview-3316976435
PR Review Comment: https://git.openjdk.org/jdk/pull/27662#discussion_r2415375564
PR Review Comment: https://git.openjdk.org/jdk/pull/27662#discussion_r2415376309


More information about the shenandoah-dev mailing list