Question about thread initialization
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Oct 15 15:41:21 UTC 2018
Hi David,
On Mon, Oct 15, 2018 at 7:32 AM David Holmes <david.holmes at oracle.com> wrote:
>
> On 13/10/2018 5:49 PM, Thomas Stüfe wrote:
> > On Sat, Oct 13, 2018 at 8:36 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> >>
> >> Hi David,
> >>
> >> On Sat, Oct 13, 2018 at 1:42 AM David Holmes <david.holmes at oracle.com> wrote:
> >>>
> >>> Hi Thomas,
> >>>
> >>> On 13/10/2018 3:56 AM, Thomas Stüfe wrote:
> >>>> Hi David,
> >>>>
> >>>> thank you for your brain cycles.
> >>>>
> >>>> On Fri, Oct 12, 2018 at 12:39 PM David Holmes <david.holmes at oracle.com> wrote:
> >>>>>
> >>>>> Hi Thomas,
> >>>>>
> >>>>> On 12/10/2018 6:18 PM, Thomas Stüfe wrote:
> >>>>>> Hi all,
> >>>>>>
> >>>>>> a small question.
> >>>>>>
> >>>>>> JVM_StartThread calls new JavaThread()
> >>>>>> JavaThread::JavaThread() calls os::create_thread()
> >>>>>> os::create_thread() starts the new thread and waits for the
> >>>>>> handshake, then returns
> >>>>>>
> >>>>>> Back in JVM_StartThread, we call JavaThread::prepare(), which adds the
> >>>>>> new thread to the Threads list. By that time, the new thread is
> >>>>>> already running, but how far it has gotten is unknown.
> >>>>>
> >>>>> Right - though the new thread can't enter run() until after it has been
> >>>>> prepared and "started". There are two parts to the handshake:
> >>>>>
> >>>>> Parent thread New Thread
> >>>>> start new Thread
> >>>>> wait for new thread
> >>>>> signal parent
> >>>>> wait for parent
> >>>>> prepare new thread
> >>>>> "start" new thread
> >>>>> signal new thread
> >>>>> run()
> >>>>>
> >>>>
> >>>> Ah, I see. The new thread is taken off the leash only at the end of
> >>>> JVM_StartThread(), when Thread::start() is called and the final part
> >>>> of the handshake is completed.
> >>>>
> >>>>>> The new thread's stack dimensions are set from within Thread::run()
> >>>>>> (for some reason, every child class does this on its own?) by calling
> >>>>>> Thread::record_stack_base_and_size(). So, after the handshake with its
> >>>>>> parent thread. Why?
> >>>>>
> >>>>> Good question. Undoubtedly historical. :)
> >>>>>
> >>>>>> This means we have a little race: in the Threads list there may be
> >>>>>> threads which have been just created and Thread::run() did not yet get
> >>>>>> around to set the stack size. In tests I stumbled over this, very
> >>>>>> rarely, when iterating Threads to check the stack sizes.
> >>>>>
> >>>>> Hmmm. Threads that are still _thread_new should really be invisible
> >>>>> until more fully initialized. How exactly were you iterating the
> >>>>> threads? This might be an oversight in the related APIs.
> >>>>>
> >>>>
> >>>> This was because of a little unrelated test I wrote to accumulate the
> >>>> combined thread stack sizes. They did not add up and I was confused.
> >>>>
> >>>> I am now less confused:
> >>>>
> >>>> In JVM_StartThread():
> >>>>
> >>>> {
> >>>> grab thread list lock
> >>>> new JavaThread()
> >>>> - calls pthread_create, new thread starts and waits for
> >>>> handshake. stack base, size still NULL,0.
> >>>> JavaThread::prepare()
> >>>> - thread gets added to Threads list
> >>>> a } // relinquish threads lock
> >>>> ...
> >>>> ...
> >>>> Threads::start()
> >>>> - signal new thread to run
> >>>> b - new thread completes handshake, calls Thread::run(), and all
> >>>> implementations pretty much immediately set the stackbase/size.
> >>>>
> >>>> Between (a) and (b) another thread could grab the Threads lock,
> >>>> iterate the threads and would see the new thread with uninitialized
> >>>> base/size.
> >>>>
> >>>> To prove this I did a little test right there, in JVM_StartThread, after (a):
> >>>>
> >>>> + {
> >>>> + MutexLocker mu(Threads_lock);
> >>>> + MyThreadClosure tc;
> >>>> + Threads::threads_do(&tc);
> >>>> + }
> >>>>
> >>>> the thread closure just prints the stack dimensions of the thrad. Sure
> >>>> enough, the new thread still has NULL/0 for stack base/size. In fact,
> >>>> Thread::stack_base() asserts because of this.
> >>>>
> >>>> So, could not another thread happen to do the same in this little time interval?
> >>>
> >>> There _should_ be some rules about how thread closures gather their
> >>> target threads based on the thread state. Part of that should (now that
> >>> I think about it) filter out threads that are still "new" - but whether
> >>> they do or not is a different matter. This may well just be an oversight.
> >>>
> >>
> >> I think so too.
> >>
> >>>>>> Is there any reason why we could not just call
> >>>>>> record_stack_base_and_size() before calling Thread::run(), right at
> >>>>>> the start of the native entry function (thread_native_entry in the
> >>>>>> case of Linux)?
> >>>>>
> >>>>> Have you tried it? :)
> >>>>>
> >>>>> I can't immediately see why this can't happen in the
> >>>>> thread_native_entry. It's possible there was once a dependency with
> >>>>> something in thread preparation etc, but that's just speculation on my part.
> >>>>>
> >>>>
> >>>> Funny, I wanted to test it and then it occurred to me that we do this
> >>>> all along already on AIX: At the entrace of thread_native_entry I set
> >>>> stack base and size (that was even my own code from the initial AIX
> >>>> port). Unfortunately I have lost the history for that change and do
> >>>> not know anymore why I did this.
> >>>>
> >>>> Since we never had problems on AIX I guess this is okay for other
> >>>> platforms too - as long os::current_stack_base() /
> >>>> current_stack_size() have no side effects and do not rely on anything
> >>>> but OS functionality.
> >>>
> >>> Okay so are you going to propose making such a change?
> >>>
> >>
> >> Yes. I'll prepare a webrev.
> >>
> >
> > Yikes Thread::record_stack_base_and_size() is complex. That thing's
> > name is a blatant lie, it does way more than that, especially on
> > Solaris.
>
> To be fair the problem is in the naming of os::initialize_thread() and
> the fact that on Solaris it has also been used for additional thread
> initialization when only intended for fixing up issues with the
> primordial thread's stack. And it seems only an issue with Solaris as
> everywhere else it is a empty method (and I'm unclear how it ended up as
> an <os>_<cpu>.cpp function!)
>
Okay maybe I overreacted. The function name was probably well choosen
some time in the past and then the function grew in scope.
> I would think you can move record_stack_base_and_size() in
> thread_native_entry; delete initialize_thread(Thread t) completely. Put
> the code that was in the Solaris initialize_thread into the Solaris
> naitve_thread_entry after record_stack_base_and_size().
>
Replacing os::initialize_thread() with a Solaris-only local solution
seems clean and very reasonable.
I am not so sure of moving the call to
Thread::record_stack_base_and_size() up to thread_native_entry, not in
its current form.
I worry about this part:
#if INCLUDE_NMT
// record thread's native stack, stack grows downward
MemTracker::record_thread_stack(stack_end(), stack_size());
#endif // INCLUDE_NMT
log_debug(os, thread)("Thread " UINTX_FORMAT " stack dimensions: "
PTR_FORMAT "-" PTR_FORMAT " (" SIZE_FORMAT "k).",
os::current_thread_id(), p2i(stack_base() - stack_size()),
p2i(stack_base()), stack_size()/1024);
Both NMT recording and logging are probably sensitive to the current
Thread not fully initialized or Thread::current() not set at all
(depending on how early we want to set stack_base and size). So I
still prefer to chop this function in two, move the stack_base and
size part up to the start of native_thread_entry, and leaving the rest
where it is, under a new name.
> > I feel unsure about moving this up to the start of the native entry
> > function, since it is not aware of its surrounding Thread class being
> > only half initialized.
> >
> > On AIX, we just call "set_stack_base" and "set_stack_size" directly,
> > and later, in shared code, run thru
> > Thread::record_stack_base_and_size() like everyone else. So, we just
> > pre-initialized base and size. I do not want to do this for all
> > platforms, since this is ugly.
> >
> > I wonder whether a better solution would be to change
> > Thread::record_stack_base_and_size():
> >
> > 1 move the calls to set_stack_base() and set_stack_size() to the OS
> > specific thread_native_entry(). We can call this right away in the
> > newly born thread, since it relies on nothing else.
> >
> > 2 leave the rest in place and rename the function to something like
> > "Thread::finish_initialization()"
> >
> > 3 move the call to Thread::finish_initialization() up out of the
> > <XXXThread>::run() functions to thread_native_entry() - just before it
> > calls Thread::run().
> >
> > 4 Alternative: to (3), make Thread::run() a non-virtual public
> > function - a place for os-generic and thread-class-generic common
> > stuff to-be-run-before-run(). Add a second, protected method, virtual
> > Thread::do_run() = 0, which is the real function child classes should
> > override.
> >
> > BTW I remember now why I did move the stack base/size initialization
> > on AIX up to the start of thread_native_entry. We had crashes in newly
> > spawned child threads and no usable hs-err files since to print a
> > stack trace you need to know the stack dimensions. Another reason to
> > change this.
>
> Well ... a thread should be doing very little interesting in terms of
> crashes prior to calling run() and setting up the stack information, and
> there will always be somewhere it can crash before this is set up, so
> perhaps the error reporter also needs to be bit more cautious about
> assuming things about the thread initialization state and the ability to
> use the Thread API.
>
I agree. I think on AIX we are now able to print a stack trace without
knowing the stack dimensions from Thread.
Best Regards, Thomas
> Cheers,
> David
>
> > ..Thomas
> >
> >
> >> Cheers, Thomas
> >>
> >>> Cheers,
> >>> David
> >>>
> >>>>
> >>>> Cheers, Thomas
> >>>>
> >>>>>> Am I missing something here?
> >>>>>
> >>>>> I guess we will find out :)
> >>>>>
> >>>>> Cheers,
> >>>>> David
> >>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Thomas
> >>>>>>
More information about the hotspot-runtime-dev
mailing list