RFR: 8373819: GenShen: Requested generation may be null [v4]

Y. Srinivas Ramakrishna ysr at openjdk.org
Sat Jan 10 00:17:43 UTC 2026


On Fri, 9 Jan 2026 23:57:10 GMT, Y. Srinivas Ramakrishna <ysr at openjdk.org> wrote:

>> William Kemper has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 14 commits:
>> 
>>  - Merge remote-tracking branch 'jdk/master' into fix-null-generation-crash
>>  - Merge remote-tracking branch 'jdk/master' into fix-null-generation-crash
>>  - Fix typo in assertion message
>>  - Take regulator thread out of STS before requesting GC
>>    
>>    The request may block while it waits for control thread to stop old marking. If workers are already in the STS, and the regulator thread is still in the STS, but cannot yield, the safepoint will not run. Control, worker and regulator threads deadlock each other.
>>  - Add comments
>>  - Revert back to what should be on this branch
>>  - Merge remote-tracking branch 'jdk/master' into fix-null-generation-crash
>>  - Don't know how this file got deleted
>>  - Carry over gc cancellation to gc request
>>  - Do not let allocation failure requests be overwritten by other requests
>>  - ... and 4 more: https://git.openjdk.org/jdk/compare/f5fa9e40...2e57f0ac
>
> src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 95:
> 
>> 93:   notify_gc_waiters();
>> 94:   notify_alloc_failure_waiters();
>> 95:   set_gc_mode(stopped);
> 
> Will they "observe the shutdown" if mode isn't "stopped"? Should line 95 move before line 93? Also is `gc_mode`'s state transitions protected in any manner, so they are consistent with any other observable state from the standpoint of threads that may interact with the controller? I see for example that the RegulatorThread looks at the controller thread's gc_mode to make certain GC triggering decisions, but it doesn't look like the reading of the mode and the requesting of a GC are protected by a lock that prevents races/glitches. In general such interactions lead to an increase in the state-space of interactions that the parties need to deal with correctly.

The reason I am leaving this comment here is that further above (lines 85-88), and in many of the other mode changes we seem to take care to use the control lock to protect these transitions so that other parties may observe the right state. Perhaps that is not needed in some cases, but the plurality of different forms of update of state that is modifed and may be read by other threads in conjunction with other state leaves me feeling queasy about the possible cracks in the coordination surface that may open us up to trouble.

> src/hotspot/share/gc/shenandoah/shenandoahGenerationalControlThread.cpp line 708:
> 
>> 706: 
>> 707: void ShenandoahGenerationalControlThread::notify_control_thread(MonitorLocker& ml, GCCause::Cause cause, ShenandoahGeneration* generation) {
>> 708:   assert(_control_lock.is_locked(), "Request lock must be held here");
> 
> What is MonitorLocker, and what is `_control_lock`? Why are we checking `_control_lock` here? If ml is being passed here for the purposes of notification on it, it must be the case that it's locked and the assert at line 708 is a leakage of abstraction? I see that your problem here is that along one path you do nothing and and don't post a notification and along another you do some work and post a notification. It sounds like what you want instead is an API of the following shape:
> 
> 
>    bool should_notify_control_thread(cause, generation) { ... };
> 
> 
> and callers might do:
> 
> 
>    MonitorLocker ml(...);
>    if (should_notify_control_thread(cause, generation)) {
>       ml.notify();
>    }
>    ml.wait();

In addition, all of the various flavours of `notify_control_thread()` with optional parameters should  ideally call down into the version that has all of the parameters specified. In the cases where these parameters aren't specified, the version of the method fills default parameters.

The fully-parameterized version of the method then contains and consolidates the entire logic and any invariant/assertion checking of the various parameters that make sense, and executes the relevant logic to return a boolean result to allow the caller to notify and/or wait on the monitor.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28932#discussion_r2677974288
PR Review Comment: https://git.openjdk.org/jdk/pull/28932#discussion_r2677875997


More information about the hotspot-gc-dev mailing list