RFR(S): 8199813: SIGSEGV in ThreadsList::includes()
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Mar 26 13:20:12 UTC 2018
On 3/25/18 11:21 PM, David Holmes wrote:
> On 26/03/2018 12:20 PM, Daniel D. Daugherty wrote:
>> Hi David!
>>
>> Thanks for the review. Replies embedded below...
>>
>>
>> On 3/25/18 6:44 PM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 24/03/2018 6:55 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a small fix for a Thread-SMR bug:
>>>>
>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>
>>>> Here's the webrev URL:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8199813-webrev/0-for-jdk-hs/
>>>>
>>>> The failure mode for this bug is a SIGSEGV in ThreadsList::includes().
>>>>
>>>> The cause of this failure is use of a ThreadsListHandle by a
>>>> JavaThread
>>>> that is not yet on Threads list. This failure mode is intermittent
>>>> since
>>>> the ThreadsListHandle refers to a ThreadsList that may or may not get
>>>> deleted by a racing thread. In particular, it has to get deleted in
>>>> such
>>>> a way that use of the ThreadsList causes a crash, e.g., page is freed.
>>>>
>>>> This fix has been tested on my Solaris-X64 server using my usual 24+
>>>> hour Thread-SMR stress testing config. It has also been through Mach5
>>>> hs-tier[1-5] testing. There have been no unexpected failures.
>>>>
>>>> This fix is currently going thru a Mach5 builds-tier1, jdk-tier[1-3],
>>>> and hs-tier[1-3] run after resyncing; we don't want any new asserts()
>>>> popping up in the jdk/jdk CI...
>>>>
>>>> Since this is a race fix, here are some details:
>>>>
>>>> As is usual for race fixes, there are more comments than code here.
>>>>
>>>> src/hotspot/share/runtime/thread.cpp
>>>> Use of a ThreadsListHandle by verify_not_published() before the
>>>> JavaThread is on the Thread list; this is the minimal fix for
>>>> the failure mode.
>>>
>>> Okay.
>>>
>>>> Save the shutdown thread in Threads::destroy_vm(). Also add a
>>>> comment explaining why IdealGraphPrinter::clean_up() is done
>>>> where it is.
>>>
>>> Why do you need the explicit VM_Exit::set_shutdown_thread(thread)
>>> instead of doing this within VM_Exit::set_vm_exited() as previously
>>> the case?
>>
>> For the Threads::destroy_vm() use case, we call
>> VM_Exit::set_shutdown_thread(thread) before the call to
>>
>> L4267: IdealGraphPrinter::clean_up();
>>
>> which uses a ThreadsListHandle iterator to walk the Threads list.
>>
>> The clean_up() call must be after the VMThread has done the final
>> safepoint and the shutdown thread must be known before the call
>> to clean_up(). The call to clean_up must happen before the call
>> to VM_Exit::set_vm_exited() so I had to separate the setting of
>> the shutdown thread from VM_Exit::set_vm_exited().
>>
>> Hopefully this makes (some kind of) sense.
>
> Yes I see now. That is indeed unfortunate.
Unfortunate and ugly. Sigh... shutdown is pain...
> Moving the call to VM_Exit::set_vm_exited() before cleanup() might be
> possible but increases the risk factor in a way that is not worthwhile.
Agreed.
Dan
P.S.
Looking ahead, it appears that Erik O has posted another
possible solution... so we might be going a different way
altogether...
>
> Thanks,
> David
>
>> Dan
>>
>>
>>>
>>> Otherwise no other comments.
>>>
>>> Thanks,
>>> David
>>>
>>>> src/hotspot/os/linux/os_linux.cpp
>>>> Another use of a ThreadsListHandle before the JavaThread is on
>>>> the Threads list; found during testing of the fix by the new
>>>> assert() in the ThreadsListHandle ctr.
>>>>
>>>> src/hotspot/share/runtime/threadSMR.cpp
>>>> Add an assertion to the ThreadsListHandle ctr that catches
>>>> use of a ThreadsList that Threads::threads_do() won't be
>>>> able to protect.
>>>>
>>>> Note: There is an exemption for the shutdown thread here.
>>>>
>>>> src/hotspot/share/runtime/vm_operations.hpp
>>>> Expose the shutdown thread so the new assert() in the
>>>> ThreadsListHandle ctr can do its job.
>>>>
>>>> src/hotspot/share/runtime/vm_operations.cpp
>>>> VM_Exit::_shutdown_thread should be volatile since it is
>>>> set by one thread and examined lock free by one or more
>>>> others.
>>>>
>>>> The callers of VM_Exit::set_vm_exited() should set the
>>>> _shutdown_thread field since they have the earliest context.
>>>>
>>>> Save the shutdown thread in VM_Exit::doit().
>>>>
>>>>
>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>
>>>> Dan
>>>>
>>
More information about the hotspot-runtime-dev
mailing list