RFR: 8323634: Shenandoah: Document behavior of EvacOOM protocol [v3]

Kelvin Nilsen kdnilsen at openjdk.org
Wed Jan 17 00:28:52 UTC 2024


On Tue, 16 Jan 2024 21:34:03 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> Even when thread A is the first thread to attempt evacuation, it will:
>> 1. Iterate through all counters and set the OOM bit on each one.  Any new thread that attempts to enter_evacuation() will check the OOM bit on its respective counter.  If the OOM bit is already set, that new thread will not be authorized to allocate.  If the thread happens to enter before this thread A has set its OOM bit, then we resolve this in the next step.
>> 2. After setting the OOM bit on each counter, we wait_for_no_evac_threads before we consider it safe to make use of a from-space pointer.
>
> So what about this scenario?
> 1. Thread A is setting the OOM bits on all counters, has decremented its own count, but has not yet finished setting all OOM bits.
> 2. Thread B newly tries to enter_evacuation() so it invokes register_thread, and it finds the OOM bit is set.  It waits for no_evac_threads() and this immediately returns without authorization to evacuate, because there are no evac threads running at this moment.
> 3. Now thread C newly tries to enter_evacuation() so it invokes register_thread, but thread A has not yet set this thread's OOM bit, so thread C proceeds to enter_evac with authorization to evacuate.
> 4. Thread A will wait for thread C to finish evacuating, but thread B is not waiting for thread C to finish evacuating.
> 
> This does appear to be a bug.  I think the fix is that register_thread also needs to repeat the loop that sets all OOM bits before it waits for no evac threads.
> 
> I expect it is very rare for this to occur, and it only occurs when we're already in dire circumstances, so I don't think it's worth the effort to optimize an implementation that avoids having multiple threads redundantly set the OOM bits on every counter.

Upon further reflection, I think this can't happen either.  In particular:
1. Thread B, when it waits for no evac threads, is also waiting for all counters' OOM bits to be set.
2. So Thread B will not return until thread A has set all of the OOM bits.
3. Suppose thread C's counter has value 0 because thread A has not yet set its OOM bit.  There's a race:
    a. Thread A wants to set the OOM bit on the counter
    b. Thread C wants to increment the counter to 1
4. If thread A wins the race, then thread C will find the OOM bit set and proceed without authorization to allocate.
5. If thread C wins the race, then both threads A and B will spin until thread C finishes its evacuation and decrements the count back to zero.
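
To make the argument above concrete, here is a minimal sketch of the race on a single counter.  This is not the HotSpot code: EvacCounter, OOM_BIT, try_enter, leave, set_oom, oom_and_idle and all_oom_and_idle are hypothetical names, and the encoding (top bit as the OOM flag, low bits as the thread count) is an assumption made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical encoding: the top bit is the OOM flag, the low bits count
// threads currently inside an evacuation scope.
constexpr uint32_t OOM_BIT = 0x80000000u;

struct EvacCounter {
    std::atomic<uint32_t> bits{0};

    // Thread C's side of the race: bump the count unless the OOM bit is set.
    // Returns true if the caller is authorized to evacuate.
    bool try_enter() {
        uint32_t cur = bits.load();
        while ((cur & OOM_BIT) == 0) {
            // On failure, cur is refreshed with the latest value and rechecked.
            if (bits.compare_exchange_weak(cur, cur + 1)) {
                return true;
            }
        }
        return false;
    }

    // Call only after a successful try_enter().
    void leave() { bits.fetch_sub(1); }

    // Thread A's side of the race: set the OOM bit, preserving the count.
    void set_oom() { bits.fetch_or(OOM_BIT); }

    // True only when the OOM bit is set AND no thread is counted as evacuating.
    bool oom_and_idle() const { return bits.load() == OOM_BIT; }
};

// What threads A and B would spin on: every counter must equal OOM_BIT exactly.
bool all_oom_and_idle(const EvacCounter* counters, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        if (!counters[i].oom_and_idle()) {
            return false;
        }
    }
    return true;
}
```

Under these assumptions, a waiter that tests each counter for equality with OOM_BIT cannot pass a counter whose bit is still unset, so thread B cannot return before thread A has finished its loop.  If set_oom() lands first, try_enter() returns false; if try_enter() lands first, the counter holds OOM_BIT plus a count of 1 until leave() runs, so both waiters keep spinning.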

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17385#discussion_r1454251517


More information about the hotspot-gc-dev mailing list