RFR (S) 8213092: Add more runtime locks for concurrent class unloading
david.holmes at oracle.com
Wed Oct 31 00:56:47 UTC 2018
On 30/10/2018 11:16 PM, coleen.phillimore at oracle.com wrote:
> On 10/30/18 8:36 AM, David Holmes wrote:
>> On 30/10/2018 10:24 PM, coleen.phillimore at oracle.com wrote:
>>> On 10/30/18 12:17 AM, David Holmes wrote:
>>>> Hi Coleen,
>>>> On 30/10/2018 1:45 PM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: Add locks for calling CLDG::purge concurrently as well and
>>>>> for calling SystemDictionary::do_unloading concurrently.
>>>>> Ran linux-x64 tier1-6 through mach5 and hotspot/jtreg/runtime
>>>>> tests, which include the module tests.
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8213092.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8213092
>>>> So ... all the locks covered by an assert_locked_or_safepoint, or
>>>> which are acquired by the VMThread at a safepoint, must never be
>>>> held by a JavaThread if it could reach a safepoint whilst that lock
>>>> is held - else we could deadlock. So can we check that with
>>> Actually I think this is not possible to add NSV. You can acquire
>>> the ClassLoaderDataGraph_lock and then the Module_lock. The latter
>>> would check for a safepoint also for a Java thread. This is currently
>>> done for Jvmti and JFR, but not in other code that I can see. I
>>> don't actually know how to fix this problem.
>> This seems risky. If a JavaThread can hold the CLDG_lock while blocked
>> at a safepoint (acquiring the Module_lock), then what is to stop the
>> VMThread from hitting one of these sections of code protected by
>> locked_or_safepoint and then proceeding into what is effectively a
>> critical section (by virtue of there being a safepoint) when the
>> JavaThread is itself in the midst of a critical section? Do we
>> actively take steps to prevent this somehow, or to make it safe for
>> the VMThread to proceed?
> No we don't. I think we have this problem today (not introduced or made
> worse by this patch).
It is made worse by this patch. Before this patch you have code only
executed by the VMThread at a safepoint. After the patch you have code
that may be executed by the VMThread at a safepoint or by another thread
under a lock (and there may or may not also be a safepoint). And this
allows the VMThread to execute the same code at the same as a
lock-holding thread executes it!
> I'll file an bug to fix it and hopefully add
> detection for this. I think we don't need to take CLDG_lock in a
> safepoint and should prevent doing so, but things like this are more
> reliable to do with computers than visual inspection. Can you add your
> suggestions to my RFE? https://bugs.openjdk.java.net/browse/JDK-8213150
This needs to be fixed before enabling the current patch IMHO. At least
the two usages you mention in your follow up email need some kind of
guard to ensure a safepoint can't be reached while the lock is held. Or
the VMThread must also acquire the lock(s) - though if NJTs are involved
there is a risk of breakage due to lock sneaking!
> The CLDG lock can be shared by non-java threads and java threads, which
> is the point. There may be other locks though.
>>> The locks added in this patch set though are for the NonJavaThreads,
>>> who do not do safepoint checks. The NonJavaThreads that acquire
>>> these locks use the STS joiner mechanism which disallows safepoints
>>> while being held (and since they are non Java threads, do not check
>>> for safepoints themselves).
>>> This is how it's going to look for the ZGC caller:
>>> SuspendibleThreadSetJoiner sts_joiner;
>>> // Unlink the classes.
>>> MutexLockerEx ml(ClassLoaderDataGraph_lock);
>>> unloading_occurred = SystemDictionary::do_unloading(ZStatPhase::timer(),
>>> true /* do_cleaning */););
>> Somehow I missed the creation/invention of the STS joiner mechanism.
> Me too! It's in gc/shared but it's really runtime code, except people
> in runtime didn't know about it because it's used by GC threads.
> Let me know if you have more questions and can review this code.
>>>> Further, are these locks acquired by non-JavaThreads such that the
>>>> VMThread may be delayed whilst a safepoint is active?
>>> Yes, theoretically they could delay the VMThread from getting to a
>>> safepoint or doing its work while in a safepoint but the threads that
>>> take out these locks only hold them for short durations.
More information about the hotspot-runtime-dev