RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues

Kim Barrett kim.barrett at oracle.com
Wed Mar 27 19:13:00 UTC 2019


> On Mar 27, 2019, at 3:43 AM, Roman Kennke <rkennke at redhat.com> wrote:
> 
>>>> I've tested it against my stress test and Shenandoah test suite, and all seems fine.
>>>> Is the extra locking around assignment of global state needed? I think the locking in threads_do() is sufficient?
>>> 
>>> The locking over "_all_active = active;" looks sane to me. This patch has no locking in threads_do(), or am I missing something?
>>> 
>>> cheers,
>>> Per
>>> 
>>>> Roman
>> Right, there's no locking in (non_java_)threads_do, by design.
> 
> I don't see how this is supposed to work. There's code that modifies the list, and there's code that iterates the list, and in the absense of other synchronization (e.g. participation in safepoints), they need to be protected by the same locks, no? What am I missing?

You are missing the use of SingleWriterSynchronizer, which is another
RCU-inspired synchronization mechanism, similar to GlobalCounter. It is used
in the implementation of NonJavaThread iteration because NJT iteration is
used in the implementation of GlobalCounter, in such a way that the
iteration can't itself use GlobalCounter or it would self-deadlock.

>> But looking at this again, the expanded exposure of the
>> NonJavaThreadsList_lock is making me regret that it's also being used
>> to protect the synchronize() in remove_from_the_list. If an iterator
>> were to try to lock that mutex while some NonJavaThread was on its way
>> out, we'd deadlock.
> 
> Why?

NJT iteration is a synchronizer critical section. remove_from_the_list
synchronizes on iteration critical sections while holding the NJTList_lock.
So if an iteration is in progress when remove_from_the_list is called, the
synchronization therein will block until the iteration completes. But if the
iteration then attempts to lock the NJTList_lock, we have deadlock.

>> The one new use of the mutex being added by this
>> change doesn't introduce such a problem, but still... I think it's
>> easy to avoid that problem by adding a new mutex just to protect the
>> synchronize call.
> 
> Uh, no? Wouldn't that be missing the point above? We can't just 'avoid' the problem by protecting something by two mutexes that ought to be under one mutex.
> 
> Or what am I missing?

Moving that synchronization outside the NJTList_lock solves that deadlock
problem. But the synchronization itself still needs protection; this
synchronizer only supports one synchronization at a time (one of its
disadvantages compared to GlobalCounter, and the reason for "SingleWriter"
in the name). So we need a new lock, specifically for protecting those
synchronization calls.




More information about the hotspot-gc-dev mailing list