RFR (S) 8212207: runtime/InternalApi/ThreadCpuTimesDeadlock.java crashes with SEGV in pthread_getcpuclockid+0x0

Wed Nov 28 14:07:06 UTC 2018

On Wed, Nov 28, 2018 at 8:59 AM David Holmes <david.holmes at oracle.com> wrote:
>
> On 28/11/2018 5:46 pm, Thomas Stüfe wrote:
> > Are you sure?
>
> No. I'm completely confusing myself - too many different "thread id's" :(

A cheerful multitude :)

What throws me off usually is not that we have two of them, but that,
on Linux, the generalized thread id (OSThread::thread_id()) is the
kernel tid. That just feels weird, since pthread_t is what we use 99%
of all cases, and the kernel tid only in those rare cases where we
access the proc file system or similar. Well I guess this is all
historical, probably pre-NPTL?

In the AIX port we did it the other way around: made pthread_t the
default "thread_id" and use special accessors for the kernel tid
(OSThread::kernel_thread_id()) for those rare cases where we really
need it.

>
> > On Linux, pthread id is set in the parent thread, after pthread_create
> > returns. Only the kernel thread id is set by the child (I find this
> > duality confusing). thread_cpu_time uses pthread_id. So, on Linux, it
> > may or may not be set before the thread stack boundaries are
> > initialized, depending on whether the native entry function ran before
> > the parent thread continued.
>
> Yes you are right. The code is still broken but for a different reason.
>
> I may just fix this all in JDK-8214097 - Rework thread initialization
> and teardown logic - by only adding to the list after the thread has run
> its own initialization logic. Then the guard I just added to the
> management code can be deleted again.
>

I agree, the way to go long term is to make sure that no uninitialized
or cleaned up threads are in the threadlist.

I also think your small is fine for now. I was just worried that we
start using Thread::stack_size() > 0 as a synonym for something like
"Thread::fully_initialized_and_still_alive()" and that once we change
it, we need to hunt down all cases of "misused Thread::stack_size()".
Therefore I would have prepared something like:

Thread::is_alive() { return _osthread != NULL && _osthread->thread_id
!= 0 && _osthread->pthread_id != 0 && _stack_base != NULL; }

and given that this is platform-specific, probably something like:

Thread::is_alive() { return _osthread != NULL &&
_osthread->is->alive() && _stack_base != NULL; }
OSThread::is_alive() (unix) { return _pthread_id != 0 && _thread_id != 0; }

and at the end of call_run(), resetting Thread::_osthread to NULL or similar.

But I am fine with your change, if you want to check it in as it is.

..Thomas

> Thanks,
> David
>
> > ..Thomas
> >
> > On Wed, Nov 28, 2018 at 8:35 AM David Holmes <david.holmes at oracle.com> wrote:
> >>
> >> #$%$#@! I missed the fact that the thread id is set _after_ we do
> >> record_stack_base_and-size(). I thought that was in call_run() not
> >> native_thread_entry().
> >>
> >> <sigh> New bug being filed.
> >>
> >> Thanks,
> >> David
> >>
> >> On 28/11/2018 5:26 pm, David Holmes wrote:
> >>> Hi Thomas,
> >>>
> >>> On 28/11/2018 4:30 pm, Thomas Stüfe wrote:
> >>>> Hi David,
> >>>>
> >>>> I admit I still do not really get it..
> >>>>
> >>>> E.g. for GC threads:
> >>>>
> >>>> We have
> >>>>
> >>>> static ConcurrentMarkSweepThread*
> >>>> ConcurrentMarkSweepThread::start(CMSCollector* collector);
> >>>>
> >>>> which creates a ConcurrentMarkSweepThread and then calls
> >>>> os::create_thread() to hook it up with an OSThread and start it.
> >>>>
> >>>> ConcurrentMarkSweepThread derives from NonJavaThread, which in its
> >>>> constructor adds the thread object to the list.
> >>>>
> >>>> So there is a time gap where the NJT is part of the list, but
> >>>> Thread::_osthread is still NULL.
> >>>>
> >>>> In ThreadTimesClosure::do_thread(), we call
> >>>> os::thread_cpu_time(thread)->fast_cpu_time(thread)->os::Linux::pthread_getcpuclockid(thread->osthread()->pthread_id(),
> >>>>
> >>>> &clockid);
> >>>>
> >>>> Should we then not crash when dereferencing the NULL
> >>>> osthread()->pthread_id()? Why do we crash inside
> >>>> pthread_getcpuclockid?
> >>>>
> >>>> If I look further into os::create_thread(), I see that there is
> >>>> another, smaller time gap where we create OSThread and anchor it into
> >>>> the Thread object:
> >>>>
> >>>>     thread->set_osthread(osthread);
> >>>>
> >>>> and then later, after pthread_create is thru, set its thread id:
> >>>>
> >>>>       // Store pthread info into the OSThread
> >>>>       osthread->set_pthread_id(tid);
> >>>>
> >>>> When OSThread is created, we call OSThread::pd_initialize() and set
> >>>> its _threadid to 0. We do this in the constructor, before anchoring
> >>>> the OSThread in its Thread.
> >>>>
> >>>> So for my understanding, we should have two situations:
> >>>>
> >>>> (1)- NJT is in list, but its _osthread member is NULL. In that case I
> >>>> would expect a different crash.
> >>>> (2)- NJT is in list, _osthread is set, but its _thread_id is NULL.
> >>>
> >>> Yes both situations may be possible. In the related bug report only the
> >>> 0 thread_id was observed.
> >>>
> >>>> (Modulo any concurrency issues, e.g. could the
> >>>> "thread->set_osthread(osthread);" be visible before OSThread is
> >>>> initialized?
> >>>>
> >>>> Depending on what is happening, a fix for (1) would probably be a
> >>>> querying for osthread==NULL, a fix for (2) would be to guard os::---
> >>>> functions - well, at least os::thread_cpu_time() - to disregard
> >>>> Threads with pthread_t == 0. Both fixes seem better to me than
> >>>> querying the stacksize() when walking the list - that seems a bit
> >>>> awkward.
> >>>
> >>> I find it awkward that the list is populated with partially constructed
> >>> threads in the first place. This particular code is only interested in
> >>> live threads that can safely be queried. The stack_size() check is a
> >>> surrogate for "this is a live thread that can be queried".
> >>>
> >>>> --
> >>>> P.s.
> >>>>
> >>>> ConcurrentGCThread::ConcurrentGCThread() :
> >>>>     _should_terminate(false), _has_terminated(false) {
> >>>> };
> >>>>
> >>>> I was surprised to see no invocation to the base class ctor in the
> >>>> initializer list. I was not aware that this was even possible. For
> >>>> code clearness, I would prefer the call to the base class ctor to be
> >>>> explicit.)
> >>>
> >>> I assume it is implicit. But yes it should be explicit.
> >>>
> >>> Cheers,
> >>> David
> >>>
> >>>> --
> >>>>
> >>>> Cheers, Thomas
> >>>>
> >>>>
> >>>> On Wed, Nov 28, 2018 at 3:58 AM David Holmes <david.holmes at oracle.com>
> >>>> wrote:
> >>>>>
> >>>>> Sorry for the delayed response.
> >>>>>
> >>>>> On 21/11/2018 3:01 pm, Kim Barrett wrote:
> >>>>>>> On Nov 20, 2018, at 3:50 AM, David Holmes <david.holmes at oracle.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> After discussions with Kim I've decided to split out the NJT list
> >>>>>>> update into a separate RFE:
> >>>>>>>
> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8214097
> >>>>>>>
> >>>>>>> So only the change in management.cpp needs reviewing and testing.
> >>>>>>>
> >>>>>>> Updated webrev:
> >>>>>>>
> >>>>>>> http://cr.openjdk.java.net/~dholmes/8212207/webrev.v2/
> >>>>>>
> >>>>>> Looks good.
> >>>>>
> >>>>> Thanks Kim.
> >>>>>
> >>>>> I've decided to stick with this simple fix for NJTs only.
> >>>>>
> >>>>> David
> >>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> David
> >>>>>>>
> >>>>>>> On 20/11/2018 10:01 am, David Holmes wrote:
> >>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8212207
> >>>>>>>> webrev: http://cr.openjdk.java.net/~dholmes/8212207/webrev/
> >>>>>>>> There is an internal management API that reports CPU times for
> >>>>>>>> NonJavaThreads (NJTs). That functionality requires a valid/live
> >>>>>>>> target thread so that we can use its pthread_t identity to obtain
> >>>>>>>> its CPU clock via pthread_getcpuclockid().
> >>>>>>>> There is an iteration mechanism for NJTs in which the NJT is
> >>>>>>>> registered during its constructor and de-registered during its
> >>>>>>>> destructor. A thread that has only been constructed has not yet
> >>>>>>>> executed and so is not a valid target for this management API.
> >>>>>>>> This seems to be the cause of failures reported in this bug (and
> >>>>>>>> JDK-8213434). Registering a NJT only when it starts executing is
> >>>>>>>> an appealing fix for this, but that impacts all current users of
> >>>>>>>> the NJT list and straight-away causes a problem with the
> >>>>>>>> BarrierSet initialization logic. So I don't attempt that.
> >>>>>>>> Instead the first part of the fix is for
> >>>>>>>> ThreadTimesClosure::do_thread to skip threads that have not yet
> >>>>>>>> executed - which we can recognize by seeing an uninitialized (i.e.
> >>>>>>>> zero) stackbase.
> >>>>>>>> A second part of the fix, which can be deferred to a separate RFE
> >>>>>>>> for NJT lifecycle management if desired, tackles the problem of
> >>>>>>>> encountering a terminated thread during iteration - which can also
> >>>>>>>> lead to SEGVs. This can arise because NJT's are not actually
> >>>>>>>> "destructed", even if they terminate, and so they never get
> >>>>>>>> removed from the NJT list. Calling destructors is problematic
> >>>>>>>> because the code using these NJTs assume they are always valid. So
> >>>>>>>> the fix in this case is to move the de-registering from the NJT
> >>>>>>>> list out of the destructor and into the Thread::call_run() method
> >>>>>>>> so it is done before a thread actually terminates. This can be
> >>>>>>>> considered a first step in cleaning up the NJT lifecycle, where
> >>>>>>>> the remaining steps touch on a lot of areas and so need to be
> >>>>>>>> handled separately e.g. see JDK-8087340 for shutting down WorkGang
> >>>>>>>> GC worker threads.
> >>>>>>>> Testing: tiers 1 -3
> >>>>>>>> I should point out that I've been unable to reproduce this failure
> >>>>>>>> locally, even after thousands of runs. I'm hoping Zhengyu can test
> >>>>>>>> this in the conditions reported in JDK-8213434.
> >>>>>>>> Thanks,
> >>>>>>>> David
> >>>>>>
> >>>>>>