RFR(S): 8199813: SIGSEGV in ThreadsList::includes()

Gerald Thornbrugh gerald.thornbrugh at oracle.com
Tue Mar 27 14:56:29 UTC 2018


Hi Dan,

This is definitely a better approach, your changes look good to me.

Jerry

> On Mar 27, 2018, at 7:50 AM, Daniel D. Daugherty <daniel.daugherty at oracle.com> wrote:
> 
> Greetings,
> 
> With a little more code juggling I was able to remove the need for
> VM_Exit::set_shutdown_thread(). Thanks to David H. for pointing out
> how ugly that code was. Thanks to Robbin for suggesting a way to
> remove it.
> 
> Here's an incremental webrev:
> 
> http://cr.openjdk.java.net/~dcubed/8199813-webrev/1-for-jdk-hs.inc/
> 
> Here's a full webrev:
> 
> http://cr.openjdk.java.net/~dcubed/8199813-webrev/1-for-jdk-hs.full/
> 
> 
> This version has been tested with a Mach5 builds-tier1, jdk-tier[1-3],
> and hs-tier[1-3] run. No failures at all.
> 
> Off-list, Erik O. has proposed a more robust check for ensuring
> adherence to the Threads-SMR protocol that uses a closure. I'm
> currently evaluating that change. That change may result in one
> more tweak to this fix or it may be done via a follow-up bug
> since it about doubles the # of new lines of code of this patch.
> 
> Thanks, in advance, for any comments, suggestions, or feedback.
> 
> Dan
> 
> 
> On 3/26/18 1:28 PM, Daniel D. Daugherty wrote:
>> Please hold off on any more reviews. I'm looking into a simplification
>> of this fix based on some off-thread discussions...
>> 
>> Dan
>> 
>> 
>> 
>> On 3/26/18 9:20 AM, Daniel D. Daugherty wrote:
>>> On 3/25/18 11:21 PM, David Holmes wrote:
>>>> On 26/03/2018 12:20 PM, Daniel D. Daugherty wrote:
>>>>> Hi David!
>>>>> 
>>>>> Thanks for the review. Replies embedded below...
>>>>> 
>>>>> 
>>>>> On 3/25/18 6:44 PM, David Holmes wrote:
>>>>>> Hi Dan,
>>>>>> 
>>>>>> On 24/03/2018 6:55 AM, Daniel D. Daugherty wrote:
>>>>>>> Greetings,
>>>>>>> 
>>>>>>> I have a small fix for a Thread-SMR bug:
>>>>>>> 
>>>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>>> 
>>>>>>> Here's the webrev URL:
>>>>>>> 
>>>>>>> http://cr.openjdk.java.net/~dcubed/8199813-webrev/0-for-jdk-hs/
>>>>>>> 
>>>>>>> The failure mode for this bug is a SIGSEGV in ThreadsList::includes().
>>>>>>> 
>>>>>>> The cause of this failure is use of a ThreadsListHandle by a JavaThread
>>>>>>> that is not yet on Threads list. This failure mode is intermittent since
>>>>>>> the ThreadsListHandle refers to a ThreadsList that may or may not get
>>>>>>> deleted by a racing thread. In particular, it has to get deleted in such
>>>>>>> a way that use of the ThreadsList causes a crash, e.g., page is freed.
>>>>>>> 
>>>>>>> This fix has been tested on my Solaris-X64 server using my usual 24+
>>>>>>> hour Thread-SMR stress testing config. It has also been through Mach5
>>>>>>> hs-tier[1-5] testing. There have been no unexpected failures.
>>>>>>> 
>>>>>>> This fix is currently going thru a Mach5 builds-tier1, jdk-tier[1-3],
>>>>>>> and hs-tier[1-3] run after resyncing; we don't want any new asserts()
>>>>>>> popping up in the jdk/jdk CI...
>>>>>>> 
>>>>>>> Since this is a race fix, here are some details:
>>>>>>> 
>>>>>>> As is usual for race fixes, there are more comments than code here.
>>>>>>> 
>>>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>>>      Use of a ThreadsListHandle by verify_not_published() before the
>>>>>>>      JavaThread is on the Thread list; this is the minimal fix for
>>>>>>>      the failure mode.
>>>>>> 
>>>>>> Okay.
>>>>>> 
>>>>>>>      Save the shutdown thread in Threads::destroy_vm(). Also add a
>>>>>>>      comment explaining why IdealGraphPrinter::clean_up() is done
>>>>>>>      where it is.
>>>>>> 
>>>>>> Why do you need the explicit VM_Exit::set_shutdown_thread(thread) instead of doing this within VM_Exit::set_vm_exited() as previously the case?
>>>>> 
>>>>> For the Threads::destroy_vm() use case, we call
>>>>> VM_Exit::set_shutdown_thread(thread) before the call to
>>>>> 
>>>>>   L4267:   IdealGraphPrinter::clean_up();
>>>>> 
>>>>> which uses a ThreadsListHandle iterator to walk the Threads list.
>>>>> 
>>>>> The clean_up() call must be after the VMThread has done the final
>>>>> safepoint and the shutdown thread must be known before the call
>>>>> to clean_up(). The call to clean_up must happen before the call
>>>>> to VM_Exit::set_vm_exited() so I had to separate the setting of
>>>>> the shutdown thread from VM_Exit::set_vm_exited().
>>>>> 
>>>>> Hopefully this makes (some kind of) sense.
>>>> 
>>>> Yes I see now. That is indeed unfortunate.
>>> 
>>> Unfortunate and ugly. Sigh... shutdown is pain...
>>> 
>>> 
>>>> Moving the call to VM_Exit::set_vm_exited() before cleanup() might be possible but increases the risk factor in a way that is not worthwhile.
>>> 
>>> Agreed.
>>> 
>>> Dan
>>> 
>>> P.S.
>>> Looking ahead, it appears that Erik O has posted another
>>> possible solution... so we might be going a different way
>>> altogether...
>>> 
>>> 
>>>> 
>>>> Thanks,
>>>> David
>>>> 
>>>>> Dan
>>>>> 
>>>>> 
>>>>>> 
>>>>>> Otherwise no other comments.
>>>>>> 
>>>>>> Thanks,
>>>>>> David
>>>>>> 
>>>>>>> src/hotspot/os/linux/os_linux.cpp
>>>>>>>      Another use of a ThreadsListHandle before the JavaThread is on
>>>>>>>      the Threads list; found during testing of the fix by the new
>>>>>>>      assert() in the ThreadsListHandle ctr.
>>>>>>> 
>>>>>>> src/hotspot/share/runtime/threadSMR.cpp
>>>>>>>      Add an assertion to the ThreadsListHandle ctr that catches
>>>>>>>      use of a ThreadsList that Threads::threads_do() won't be
>>>>>>>      able to protect.
>>>>>>> 
>>>>>>>      Note: There is an exemption for the shutdown thread here.
>>>>>>> 
>>>>>>> src/hotspot/share/runtime/vm_operations.hpp
>>>>>>>      Expose the shutdown thread so the new assert() in the
>>>>>>>      ThreadsListHandle ctr can do its job.
>>>>>>> 
>>>>>>> src/hotspot/share/runtime/vm_operations.cpp
>>>>>>>      VM_Exit::_shutdown_thread should be volatile since it is
>>>>>>>      set by one thread and examined lock free by one or more
>>>>>>>      others.
>>>>>>> 
>>>>>>>      The callers of VM_Exit::set_vm_exited() should set the
>>>>>>>      _shutdown_thread field since they have the earliest context.
>>>>>>> 
>>>>>>>      Save the shutdown thread in VM_Exit::doit().
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>>> 
>>>>>>> Dan
>>>>>>> 
>>>>> 
>>> 
>>> 
>> 
>> 
> 



More information about the hotspot-runtime-dev mailing list