Question about thread initialization

Sat Oct 13 07:49:41 UTC 2018

On Sat, Oct 13, 2018 at 8:36 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>
> Hi David,
>
> On Sat, Oct 13, 2018 at 1:42 AM David Holmes <david.holmes at oracle.com> wrote:
> >
> > Hi Thomas,
> >
> > On 13/10/2018 3:56 AM, Thomas Stüfe wrote:
> > > Hi  David,
> > >
> > > thank you for your brain cycles.
> > >
> > > On Fri, Oct 12, 2018 at 12:39 PM David Holmes <david.holmes at oracle.com> wrote:
> > >>
> > >> Hi Thomas,
> > >>
> > >> On 12/10/2018 6:18 PM, Thomas Stüfe wrote:
> > >>> Hi all,
> > >>>
> > >>> a small question.
> > >>>
> > >>> JVM_StartThread calls new JavaThread()
> > >>>     JavaThread::JavaThread() calls os::create_thread()
> > >>>       os::create_thread() starts the new thread and waits for the
> > >>> handshake, then returns
> > >>>
> > >>> Back in JVM_StartThread, we call JavaThread::prepare(), which adds the
> > >>> new thread to the Threads list. By that time, the new thread is
> > >>> already running, but how far it has gotten is unknown.
> > >>
> > >> Right - though the new thread can't enter run() until after it has been
> > >> prepared and "started". There are two parts to the handshake:
> > >>
> > >>       Parent thread             New Thread
> > >>      start new Thread
> > >>      wait for new thread
> > >>                                signal parent
> > >>                                wait for parent
> > >>      prepare new thread
> > >>      "start" new thread
> > >>      signal new thread
> > >>                                 run()
> > >>
> > >
> > > Ah, I see. The new thread is taken off the leash only at the end of
> > > JVM_StartThread(), when Thread::start() is called and the final part
> > > of the handshake is completed.
> > >
> > >>> The new thread's stack dimensions are set from within Thread::run()
> > >>> (for some reason, every child class does this on its own?) by calling
> > >>> Thread::record_stack_base_and_size(). So, after the handshake with its
> > >>> parent thread. Why?
> > >>
> > >> Good question. Undoubtedly historical. :)
> > >>
> > >>> This means we have a little race: in the Threads list there may be
> > >>> threads which have been just created and Thread::run() did not yet get
> > >>> around to set the stack size. In tests I stumbled over this, very
> > >>> rarely, when iterating Threads to check the stack sizes.
> > >>
> > >> Hmmm. Threads that are still _thread_new should really be invisible
> > >> until more fully initialized. How exactly were you iterating the
> > >> threads? This might be an oversight in the related APIs.
> > >>
> > >
> > > This was because of a little unrelated test I wrote to accumulate the
> > > combined thread stack sizes. They did not add up and I was confused.
> > >
> > > I am now less confused:
> > >
> > > In  JVM_StartThread():
> > >
> > > {
> > >    grab thread list lock
> > >    new JavaThread()
> > >       - calls pthread_create, new thread starts and waits for
> > > handshake. stack base, size still NULL,0.
> > >    JavaThread::prepare()
> > >       - thread gets added to Threads list
> > > a } // relinquish threads lock
> > > ...
> > > ...
> > > Threads::start()
> > >    - signal new thread to run
> > > b  - new thread completes handshake, calls Thread::run(), and all
> > > implementations pretty much immediately set the stackbase/size.
> > >
> > > Between (a) and (b) another thread could grab the Threads lock,
> > > iterate the threads and would see the new thread with uninitialized
> > > base/size.
> > >
> > > To prove this I did a little test right there, in JVM_StartThread, after (a):
> > >
> > > +  {
> > > +    MutexLocker mu(Threads_lock);
> > > +    MyThreadClosure tc;
> > > +    Threads::threads_do(&tc);
> > > +  }
> > >
> > > the thread closure just prints the stack dimensions of the thrad. Sure
> > > enough, the new thread still has NULL/0 for stack base/size. In fact,
> > > Thread::stack_base() asserts because of this.
> > >
> > > So, could not another thread happen to do the same in this little time interval?
> >
> > There _should_ be some rules about how thread closures gather their
> > target threads based on the thread state. Part of that should (now that
> > I think about it) filter out threads that are still "new" - but whether
> > they do or not is a different matter. This may well just be an oversight.
> >
>
> I think so too.
>
> > >>> Is there any reason why we could not just call
> > >>> record_stack_base_and_size() before calling Thread::run(), right at
> > >>> the start of the native entry function (thread_native_entry in the
> > >>> case of Linux)?
> > >>
> > >> Have you tried it? :)
> > >>
> > >> I can't immediately see why this can't happen in the
> > >> thread_native_entry. It's possible there was once a dependency with
> > >> something in thread preparation etc, but that's just speculation on my part.
> > >>
> > >
> > > Funny, I wanted to test it and then it occurred to me that we do this
> > > all along already on AIX: At the entrace of thread_native_entry I set
> > > stack base and size (that was even my own code from the initial AIX
> > > port). Unfortunately I have lost the history for that change and do
> > > not know anymore why I did this.
> > >
> > > Since we never had problems on AIX I guess this is okay for other
> > > platforms too - as long os::current_stack_base() /
> > > current_stack_size() have no side effects and do not rely on anything
> > > but OS functionality.
> >
> > Okay so are you going to propose making such a change?
> >
>
> Yes. I'll prepare a webrev.
>

Yikes Thread::record_stack_base_and_size() is complex. That thing's
name is a blatant lie, it does way more than that, especially on
Solaris.

I feel unsure about moving this up to the start of the native entry
function, since it is not aware of its surrounding Thread class being
only half initialized.

On AIX, we just call "set_stack_base" and "set_stack_size" directly,
and later, in shared code, run thru
Thread::record_stack_base_and_size() like everyone else. So, we just
pre-initialized base and size. I do not want to do this for all
platforms, since this is ugly.

I wonder whether a better solution would be to change
Thread::record_stack_base_and_size():

1 move the calls to set_stack_base() and set_stack_size() to the OS
specific thread_native_entry(). We can call this right away in the
newly born thread, since it relies on nothing else.

2 leave the rest in place and rename the function to something like
"Thread::finish_initialization()"

3 move the call to Thread::finish_initialization() up out of the
<XXXThread>::run() functions to thread_native_entry() - just before it
calls Thread::run().

4 Alternative: to (3), make Thread::run() a non-virtual public
function - a place for os-generic and thread-class-generic common
stuff to-be-run-before-run(). Add a second, protected method, virtual
Thread::do_run() = 0, which is the real function child classes should
override.

BTW I remember now why I did move the stack base/size initialization
on AIX up to the start of thread_native_entry. We had crashes in newly
spawned child threads and no usable hs-err files since to print a
stack trace you need to know the stack dimensions. Another reason to
change this.

..Thomas

> Cheers, Thomas
>
> > Cheers,
> > David
> >
> > >
> > > Cheers, Thomas
> > >
> > >>> Am I missing something here?
> > >>
> > >> I guess we will find out :)
> > >>
> > >> Cheers,
> > >> David
> > >>
> > >>> Thanks,
> > >>>
> > >>> Thomas
> > >>>