RFR(S): 8199813: SIGSEGV in ThreadsList::includes()

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Mar 28 00:02:22 UTC 2018


Thanks David!

Dan


On 3/27/18 7:47 PM, David Holmes wrote:
> Looks good Dan!
>
> Thanks,
> David
>
> On 27/03/2018 11:50 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> With a little more code juggling I was able to remove the need for
>> VM_Exit::set_shutdown_thread(). Thanks to David H. for pointing out
>> how ugly that code was. Thanks to Robbin for suggesting a way to
>> remove it.
>>
>> Here's an incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8199813-webrev/1-for-jdk-hs.inc/
>>
>> Here's a full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8199813-webrev/1-for-jdk-hs.full/
>>
>>
>> This version has been tested with a Mach5 builds-tier1, jdk-tier[1-3],
>> and hs-tier[1-3] run. No failures at all.
>>
>> Off-list, Erik O. has proposed a more robust check for ensuring
>> adherence to the Threads-SMR protocol that uses a closure. I'm
>> currently evaluating that change. That change may result in one
>> more tweak to this fix or it may be done via a follow-up bug
>> since it about doubles the # of new lines of code of this patch.
>>
>> Thanks, in advance, for any comments, suggestions, or feedback.
>>
>> Dan
>>
>>
>> On 3/26/18 1:28 PM, Daniel D. Daugherty wrote:
>>> Please hold off on any more reviews. I'm looking into a simplification
>>> of this fix based on some off-thread discussions...
>>>
>>> Dan
>>>
>>>
>>>
>>> On 3/26/18 9:20 AM, Daniel D. Daugherty wrote:
>>>> On 3/25/18 11:21 PM, David Holmes wrote:
>>>>> On 26/03/2018 12:20 PM, Daniel D. Daugherty wrote:
>>>>>> Hi David!
>>>>>>
>>>>>> Thanks for the review. Replies embedded below...
>>>>>>
>>>>>>
>>>>>> On 3/25/18 6:44 PM, David Holmes wrote:
>>>>>>> Hi Dan,
>>>>>>>
>>>>>>> On 24/03/2018 6:55 AM, Daniel D. Daugherty wrote:
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> I have a small fix for a Thread-SMR bug:
>>>>>>>>
>>>>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>>>>
>>>>>>>> Here's the webrev URL:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dcubed/8199813-webrev/0-for-jdk-hs/
>>>>>>>>
>>>>>>>> The failure mode for this bug is a SIGSEGV in 
>>>>>>>> ThreadsList::includes().
>>>>>>>>
>>>>>>>> The cause of this failure is use of a ThreadsListHandle by a 
>>>>>>>> JavaThread
>>>>>>>> that is not yet on Threads list. This failure mode is 
>>>>>>>> intermittent since
>>>>>>>> the ThreadsListHandle refers to a ThreadsList that may or may 
>>>>>>>> not get
>>>>>>>> deleted by a racing thread. In particular, it has to get 
>>>>>>>> deleted in such
>>>>>>>> a way that use of the ThreadsList causes a crash, e.g., page is 
>>>>>>>> freed.
>>>>>>>>
>>>>>>>> This fix has been tested on my Solaris-X64 server using my 
>>>>>>>> usual 24+
>>>>>>>> hour Thread-SMR stress testing config. It has also been through 
>>>>>>>> Mach5
>>>>>>>> hs-tier[1-5] testing. There have been no unexpected failures.
>>>>>>>>
>>>>>>>> This fix is currently going thru a Mach5 builds-tier1, 
>>>>>>>> jdk-tier[1-3],
>>>>>>>> and hs-tier[1-3] run after resyncing; we don't want any new 
>>>>>>>> asserts()
>>>>>>>> popping up in the jdk/jdk CI...
>>>>>>>>
>>>>>>>> Since this is a race fix, here are some details:
>>>>>>>>
>>>>>>>> As is usual for race fixes, there are more comments than code 
>>>>>>>> here.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/thread.cpp
>>>>>>>>      Use of a ThreadsListHandle by verify_not_published() 
>>>>>>>> before the
>>>>>>>>      JavaThread is on the Thread list; this is the minimal fix for
>>>>>>>>      the failure mode.
>>>>>>>
>>>>>>> Okay.
>>>>>>>
>>>>>>>>      Save the shutdown thread in Threads::destroy_vm(). Also add a
>>>>>>>>      comment explaining why IdealGraphPrinter::clean_up() is done
>>>>>>>>      where it is.
>>>>>>>
>>>>>>> Why do you need the explicit 
>>>>>>> VM_Exit::set_shutdown_thread(thread) instead of doing this 
>>>>>>> within VM_Exit::set_vm_exited() as previously the case?
>>>>>>
>>>>>> For the Threads::destroy_vm() use case, we call
>>>>>> VM_Exit::set_shutdown_thread(thread) before the call to
>>>>>>
>>>>>>   L4267:   IdealGraphPrinter::clean_up();
>>>>>>
>>>>>> which uses a ThreadsListHandle iterator to walk the Threads list.
>>>>>>
>>>>>> The clean_up() call must be after the VMThread has done the final
>>>>>> safepoint and the shutdown thread must be known before the call
>>>>>> to clean_up(). The call to clean_up must happen before the call
>>>>>> to VM_Exit::set_vm_exited() so I had to separate the setting of
>>>>>> the shutdown thread from VM_Exit::set_vm_exited().
>>>>>>
>>>>>> Hopefully this makes (some kind of) sense.
>>>>>
>>>>> Yes I see now. That is indeed unfortunate.
>>>>
>>>> Unfortunate and ugly. Sigh... shutdown is pain...
>>>>
>>>>
>>>>> Moving the call to VM_Exit::set_vm_exited() before cleanup() might 
>>>>> be possible but increases the risk factor in a way that is not 
>>>>> worthwhile.
>>>>
>>>> Agreed.
>>>>
>>>> Dan
>>>>
>>>> P.S.
>>>> Looking ahead, it appears that Erik O has posted another
>>>> possible solution... so we might be going a different way
>>>> altogether...
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Otherwise no other comments.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> src/hotspot/os/linux/os_linux.cpp
>>>>>>>>      Another use of a ThreadsListHandle before the JavaThread 
>>>>>>>> is on
>>>>>>>>      the Threads list; found during testing of the fix by the new
>>>>>>>>      assert() in the ThreadsListHandle ctr.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/threadSMR.cpp
>>>>>>>>      Add an assertion to the ThreadsListHandle ctr that catches
>>>>>>>>      use of a ThreadsList that Threads::threads_do() won't be
>>>>>>>>      able to protect.
>>>>>>>>
>>>>>>>>      Note: There is an exemption for the shutdown thread here.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/vm_operations.hpp
>>>>>>>>      Expose the shutdown thread so the new assert() in the
>>>>>>>>      ThreadsListHandle ctr can do its job.
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/vm_operations.cpp
>>>>>>>>      VM_Exit::_shutdown_thread should be volatile since it is
>>>>>>>>      set by one thread and examined lock free by one or more
>>>>>>>>      others.
>>>>>>>>
>>>>>>>>      The callers of VM_Exit::set_vm_exited() should set the
>>>>>>>>      _shutdown_thread field since they have the earliest context.
>>>>>>>>
>>>>>>>>      Save the shutdown thread in VM_Exit::doit().
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>>



More information about the hotspot-runtime-dev mailing list