Question about thread initialization

Fri Oct 12 23:42:13 UTC 2018

Hi Thomas,

On 13/10/2018 3:56 AM, Thomas Stüfe wrote:
> Hi  David,
> 
> thank you for your brain cycles.
> 
> On Fri, Oct 12, 2018 at 12:39 PM David Holmes <david.holmes at oracle.com> wrote:
>>
>> Hi Thomas,
>>
>> On 12/10/2018 6:18 PM, Thomas Stüfe wrote:
>>> Hi all,
>>>
>>> a small question.
>>>
>>> JVM_StartThread calls new JavaThread()
>>>     JavaThread::JavaThread() calls os::create_thread()
>>>       os::create_thread() starts the new thread and waits for the
>>> handshake, then returns
>>>
>>> Back in JVM_StartThread, we call JavaThread::prepare(), which adds the
>>> new thread to the Threads list. By that time, the new thread is
>>> already running, but how far it has gotten is unknown.
>>
>> Right - though the new thread can't enter run() until after it has been
>> prepared and "started". There are two parts to the handshake:
>>
>>       Parent thread             New Thread
>>      start new Thread
>>      wait for new thread
>>                                signal parent
>>                                wait for parent
>>      prepare new thread
>>      "start" new thread
>>      signal new thread
>>                                 run()
>>
> 
> Ah, I see. The new thread is taken off the leash only at the end of
> JVM_StartThread(), when Thread::start() is called and the final part
> of the handshake is completed.
> 
>>> The new thread's stack dimensions are set from within Thread::run()
>>> (for some reason, every child class does this on its own?) by calling
>>> Thread::record_stack_base_and_size(). So, after the handshake with its
>>> parent thread. Why?
>>
>> Good question. Undoubtedly historical. :)
>>
>>> This means we have a little race: in the Threads list there may be
>>> threads which have been just created and Thread::run() did not yet get
>>> around to set the stack size. In tests I stumbled over this, very
>>> rarely, when iterating Threads to check the stack sizes.
>>
>> Hmmm. Threads that are still _thread_new should really be invisible
>> until more fully initialized. How exactly were you iterating the
>> threads? This might be an oversight in the related APIs.
>>
> 
> This was because of a little unrelated test I wrote to accumulate the
> combined thread stack sizes. They did not add up and I was confused.
> 
> I am now less confused:
> 
> In  JVM_StartThread():
> 
> {
>    grab thread list lock
>    new JavaThread()
>       - calls pthread_create, new thread starts and waits for
> handshake. stack base, size still NULL,0.
>    JavaThread::prepare()
>       - thread gets added to Threads list
> a } // relinquish threads lock
> ...
> ...
> Threads::start()
>    - signal new thread to run
> b  - new thread completes handshake, calls Thread::run(), and all
> implementations pretty much immediately set the stackbase/size.
> 
> Between (a) and (b) another thread could grab the Threads lock,
> iterate the threads and would see the new thread with uninitialized
> base/size.
> 
> To prove this I did a little test right there, in JVM_StartThread, after (a):
> 
> +  {
> +    MutexLocker mu(Threads_lock);
> +    MyThreadClosure tc;
> +    Threads::threads_do(&tc);
> +  }
> 
> the thread closure just prints the stack dimensions of the thrad. Sure
> enough, the new thread still has NULL/0 for stack base/size. In fact,
> Thread::stack_base() asserts because of this.
> 
> So, could not another thread happen to do the same in this little time interval?

There _should_ be some rules about how thread closures gather their 
target threads based on the thread state. Part of that should (now that 
I think about it) filter out threads that are still "new" - but whether 
they do or not is a different matter. This may well just be an oversight.

>>> Is there any reason why we could not just call
>>> record_stack_base_and_size() before calling Thread::run(), right at
>>> the start of the native entry function (thread_native_entry in the
>>> case of Linux)?
>>
>> Have you tried it? :)
>>
>> I can't immediately see why this can't happen in the
>> thread_native_entry. It's possible there was once a dependency with
>> something in thread preparation etc, but that's just speculation on my part.
>>
> 
> Funny, I wanted to test it and then it occurred to me that we do this
> all along already on AIX: At the entrace of thread_native_entry I set
> stack base and size (that was even my own code from the initial AIX
> port). Unfortunately I have lost the history for that change and do
> not know anymore why I did this.
> 
> Since we never had problems on AIX I guess this is okay for other
> platforms too - as long os::current_stack_base() /
> current_stack_size() have no side effects and do not rely on anything
> but OS functionality.

Okay so are you going to propose making such a change?

Cheers,
David

> 
> Cheers, Thomas
> 
>>> Am I missing something here?
>>
>> I guess we will find out :)
>>
>> Cheers,
>> David
>>
>>> Thanks,
>>>
>>> Thomas
>>>