RFR: JDK-8220671: Initialization race for non-JavaThread PtrQueues
Kim Barrett
kim.barrett at oracle.com
Tue Mar 19 19:48:57 UTC 2019
> On Mar 19, 2019, at 2:59 PM, Kim Barrett <kim.barrett at oracle.com> wrote:
>
>> On Mar 19, 2019, at 8:01 AM, Roman Kennke <rkennke at redhat.com> wrote:
>>
>>
>>>>> The problem we're trying to solve is that (some implementations of)
>>>>> on_thread_attach needs to be atomic with the associated list addition
>>>>> wrto safepoints, or more precisely, with an operation performed within
>>>>> some safepoints. on_thread_attach is caching global state in the
>>>>> thread local data, while the operation is updating the global state
>>>>> and needs to also update the thread local caches. For G1 and
>>>>> Shenandoah, it's the SATB activation state; for ZGC it's the bad bit
>>>>> mask.
>>
>> Yes, it is not strictly the safepoint that is the issue. But as you say, there's state that we're reading concurrently that is otherwise written at safepoints (the SATB active flag), and not protected otherwise (e.g. by some lock that we could take when reading). There's also global flags and state updated at safepoints, and I am not sure we're good with that either if we don't let the non-Java-thread synchronize with safepoints on attach.
>>
>> Therefore I think that the way to go is to let such threads synchronize with safepoints on on_thread_attach(), and via STS. A Shenandoah-only solution that seems to do it:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8220671/webrev.06/
>>
>> Compared to my other proposals it has the advantage that it's only changing GC specific code, and not with any global/shared code. It will not introduce STS into paths that don't synchronize this anyway (e.g. CMS).
>>
>> Unless we come up with something even better, this would be the way to go for me. If you agree, I can do the G1 change in the same fashion.
>>
>> Roman
>
> As I said over in the JDK-8221086 thread, I don’t think that works.
> It is making on_thread_attach atomic wrto safepoints, but doesn’t
> include the NJT list modifications.
The problem race, in pseudo-code:
// thread creation
STS.join
attach(this_thread)
this_thread.SATB.active = old_state
STS.leave
// Asynchronous (wrto this_thread) safepoint might not update
// this_thread's SATB.active state because this_thread is not
// yet visible to the threads_do() performing the update.
lock NJTList
this_thread._next = NJTList._head
release_store(NJTList._head, this_thread)
unlock NJTList
More information about the hotspot-gc-dev
mailing list