RFR (S) 8213092: Add more runtime locks for concurrent class unloading

David Holmes david.holmes at oracle.com
Wed Oct 31 08:42:10 UTC 2018


Hi Erik,

So the magic STS Joiner ensures everything works okay. But how can I 
tell from the code that you are in fact using the STS Joiner? Can it be:

assert_locked_by_sts_joiner_or_safepoint(...);

? Is that at all possible?

I don't know when SuspendibleThreadSet came into play or exactly how it 
operates.

Thanks,
David

On 31/10/2018 6:19 PM, Erik Österlund wrote:
> Hi David,
> 
> On 2018-10-31 01:56, David Holmes wrote:
>> On 30/10/2018 11:16 PM, coleen.phillimore at oracle.com wrote:
>>> On 10/30/18 8:36 AM, David Holmes wrote:
>>>> On 30/10/2018 10:24 PM, coleen.phillimore at oracle.com wrote:
>>>>> On 10/30/18 12:17 AM, David Holmes wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> On 30/10/2018 1:45 PM, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: Add locks for calling CLDG::purge concurrently as well 
>>>>>>> and for calling SystemDictionary::do_unloading concurrently.
>>>>>>>
>>>>>>> Ran linux-x64 tier1-6 through mach5 and hotspot/jtreg/runtime 
>>>>>>> tests, which include the module tests.
>>>>>>>
>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8213092.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8213092
>>>>>>
>>>>>> So ... all the locks covered by an assert_locked_or_safepoint, or 
>>>>>> which are acquired by the VMThread at a safepoint, must never be 
>>>>>> held by a JavaThread if it could reach a safepoint whilst that 
>>>>>> lock is held - else we could deadlock. So can we check that with 
>>>>>> NoSafepointVerifiers?
>>>>>
>>>>> Actually I think this is not possible to add NSV.   You can acquire 
>>>>> the ClassLoaderDataGraph_lock and then the Module_lock. The latter 
>>>>> would check for a safepoint also for a Java thread. This is 
>>>>> currently done for Jvmti and JFR, but not in other code that I can 
>>>>> see.  I don't actually know how to fix this problem.
>>>>
>>>> This seems risky. If a JavaThread can hold the CLDG_lock while 
>>>> blocked at a safepoint (acquiring the Module_lock), then what is to 
>>>> stop the VMThread from hitting one of these sections of code 
>>>> protected by locked_or_safepoint and then proceeding into what is 
>>>> effectively a critical section (by virtue of there being a 
>>>> safepoint) when the JavaThread is itself in the midst of a critical 
>>>> section? Do we actively take steps to prevent this somehow, or to 
>>>> make it safe for the VMThread to proceed?
>>>
>>> No we don't.  I think we have this problem today (not introduced or 
>>> made worse by this patch).
>>
>> It is made worse by this patch. Before this patch you have code only 
>> executed by the VMThread at a safepoint. After the patch you have code 
>> that may be executed by the VMThread at a safepoint or by another 
>> thread under a lock (and there may or may not also be a safepoint). 
>> And this allows the VMThread to execute the same code at the same as a 
>> lock-holding thread executes it!
> 
> The scenario you describe can not happen. The code you are referring to 
> is executed either by the VM thread in a safepoint, or by a concurrent 
> GC thread outside of a safepoint, holding the STS joiner, and hence 
> effectively blocking concurrent safepoints from entering the scope. The 
> two can never run concurrent to each other.
> 
> I tried going down the path of not using the STS joiner in an earlier 
> prototype, but it required major rewrites due to too many assumptions 
> being violated. And yet I could not convince myself what I was doing was 
> safe, due to possibly strange interactions with crazy safepoint 
> operations, like redefinition. So I dropped that and decided to protect 
> these operations with the STS joiner instead.
> 
>>> I'll file an bug to fix it and hopefully add detection for this.  I 
>>> think we don't need to take CLDG_lock in a safepoint and should 
>>> prevent doing so, but things like this are more reliable to do with 
>>> computers than visual inspection.  Can you add your suggestions to my 
>>> RFE? https://bugs.openjdk.java.net/browse/JDK-8213150
>>
>> This needs to be fixed before enabling the current patch IMHO. At 
>> least the two usages you mention in your follow up email need some 
>> kind of guard to ensure a safepoint can't be reached while the lock is 
>> held. Or the VMThread must also acquire the lock(s) - though if NJTs 
>> are involved there is a risk of breakage due to lock sneaking!
> 
> These locks are taken under the STS joiner, specifically to dodge the 
> issues you refer to. The STS joiner makes sure that the locks you are 
> taking do not suffer from bad lock sneaking interactions, because they 
> do not take the locks concurrent to the VM thread. So we have the safety 
> we need. I agree that the CLDG lock does not need to be taken by the VM 
> thread in a safepoint, simply because it is unnecessary and risky. But 
> that is orthogonal to this patch.
> 
> Thanks,
> /Erik
> 
>> David
>> -----
>>
>>> The CLDG lock can be shared by non-java threads and java threads, 
>>> which is the point.  There may be other locks though.
>>
>>
>>
>>>>
>>>>>
>>>>> The locks added in this patch set though are for the 
>>>>> NonJavaThreads, who do not do safepoint checks.  The NonJavaThreads 
>>>>> that acquire these locks use the STS joiner mechanism which 
>>>>> disallows safepoints while being held (and since they are non Java 
>>>>> threads, do not check for safepoints themselves).
>>>>>
>>>>> This is how it's going to look for the ZGC caller:
>>>>>
>>>>> {
>>>>> SuspendibleThreadSetJoiner sts_joiner;
>>>>> // Unlink the classes.
>>>>> MutexLockerEx ml(ClassLoaderDataGraph_lock);
>>>>> unloading_occurred = 
>>>>> SystemDictionary::do_unloading(ZStatPhase::timer(),
>>>>> true /* do_cleaning */););
>>>>> }
>>>>
>>>> Somehow I missed the creation/invention of the STS joiner mechanism.
>>>>
>>>
>>> Me too!   It's in gc/shared but it's really runtime code, except 
>>> people in runtime didn't know about it because it's used by GC threads.
>>>
>>> Let me know if you have more questions and can review this code.
>>>
>>> Thanks,
>>> Coleen
>>>
>>>> David
>>>>
>>>>>
>>>>>>
>>>>>> Further, are these locks acquired by non-JavaThreads such that the 
>>>>>> VMThread may be delayed whilst a safepoint is active?
>>>>>
>>>>> Yes, theoretically they could delay the VMThread from getting to a 
>>>>> safepoint or doing its work while in a safepoint but the threads 
>>>>> that take out these locks only hold them for short durations.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>
>>>


More information about the hotspot-runtime-dev mailing list