RFR(s): 8212173: Thread._stack_base/_stack_size initialized too late for new threads

Thomas Stüfe thomas.stuefe at gmail.com
Fri Oct 26 12:35:24 UTC 2018


Hi Volker,

thanks for the review!

On Fri, Oct 26, 2018 at 11:45 AM Volker Simonis
<volker.simonis at gmail.com> wrote:
>
> HI Thomas,
>
> your change looks good. Please find some minor comments below:
>
> os_solaris.cpp/os_windows.cpp
>  - why don't you set "thread" to null after the return from call_run()
> like on the other platforms? Or maybe the comment is enough and you
> don't have to set it to null at all?

Because on those two platforms, "thread" is used some lines down in a
pointer comparison for some counter thing:

if (thread != VMThread::vm_thread()...

I am not really sure if those counters are actually needed, but I did
not dwelve into it, I wanted to keep the patch small and concise. I
think at least on windows they can probably be removed.

Thanks, Thomas

> There's no need to do a new
> webrev for this.
>
> Thanks,
> Volker
> On Thu, Oct 25, 2018 at 8:02 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> >
> > Ping...
> >
> > May I please have  a second review?
> >
> > Thanks!
> > On Mon, Oct 22, 2018 at 6:56 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> > >
> > > Dear all,
> > >
> > > may I have please reviews for the following fix:
> > >
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8212173
> > > cr: http://cr.openjdk.java.net/~stuefe/webrevs/8212173-thread-stack-and-base-initialized-too-late/webrev.02/webrev/
> > >
> > > See also prior mail discussions: [1]
> > >
> > > Wwhen iterating the thread list, threads may be visible which have
> > > stack base/size not yet set. Calling stack_base() will yield NULL,
> > > stack_size() 0 or it will assert. The reason for this is that there
> > > exist a race: Thread::_stack_base and ::_stack_size are initialized
> > > too late, when the Thread has already been added to the threads list.
> > >
> > > My solution to this problem is basically to move the initialization of
> > > Thread::_stack_base and ::_stack_size to an earlier point at which the
> > > thread has not yet been added to the threads list.
> > >
> > > Unfortunately, this is more difficult that it sounds. Retrieving
> > > thread stack base and size needs to be done from within the thread in
> > > question, and only on Linux and BSD we have the opportunity to execute
> > > code in the new child thread before its Thread object is added to the
> > > thread list. So, my patch solves the problem completely only on Linux
> > > and BSD. On Windows, Aix and Solaris it only minimizes the chance for
> > > it to happen.
> > >
> > > For more details, please see bug description and the discussion in [1].
> > >
> > > ----
> > >
> > > My changes in detail:
> > >
> > > - Thread::record_stack_base_and_size(): I removed all code which needs
> > > the surrounding Thread object being initialized, or which needs
> > > Thread::current() to be set. That makes the function safe to be called
> > > from the very start of the child thread's life (start of
> > > thread_native_entry()), where nothing is initialized yet.
> > >
> > > - I moved the call to Thread::record_stack_base_and_size() out of all
> > > <ChildThreadClass>::run() functions into the very start of the
> > > thread_native_entry() on every platform.
> > >
> > > - The parts I moved out of Thread::record_stack_base_and_size() are:
> > > NMT thread stack initialization and Logging. I needed to find a new
> > > place for them. I wanted them to run at the end of the child thread
> > > initialization, when the child Thread is already initialized and
> > > Thread::current is initialized too. So I moved them to just before
> > > <ChildThreadCLass>::run() is called.
> > >
> > > -  To prevent code duplication, I changed:
> > >
> > > public virtual Thread::run()
> > >
> > > to
> > >
> > > public (nonvirtual) Thread::call_run();
> > > protected virtual Thread::run() = 0;
> > >
> > > Thread::call_run() now is the place to put common,
> > > platform-independent and <ChildThreadClass>-independent
> > > initializations. There I put the NMT stuff and logging. After
> > > <ChildThreadClass>::run() returns, I put common cleanup checks.
> > >
> > > Other changes:
> > >
> > > - We had a function os::initialize_thread() which only existed for the
> > > sole benefit of Solaris, where if gives the platform a stab at
> > > correcting the stack base, size for a certain corner case when we run
> > > on primordial thread. On all other platform-cpu-combinations, this
> > > function was empty. I removed all empty implementations and renamed
> > > the function to the much more fitting, Solaris only
> > > "os::Solaris::correct_stack_boundaries_for_primordial_thread(Thread*
> > > thr)".
> > >
> > > - When <ChildClassThread>::run() returns, the Thread object may have
> > > been deleted by the child class run() function. To avoid errors I set
> > > the thread pointer to zero (where possible) to prevent accessing it to
> > > reference members.
> > >
> > > - Since Thread::record_stack_base_and_size() does not register the
> > > thread stack with NMT anymore, I needed to fix those places where the
> > > former is called for attached threads and add that NMT registration
> > > manually.
> > >
> > > - I made Thread::clear_thread_current() a static function since it
> > > needs no members of Thread (unlike Thread::initialize_thread_current()
> > > which needs the this pointer to hook it into TLS). That makes it
> > > possible to use clear_thread_current() without having a valid Thread
> > > object, which is the case if <ChildClassThread>::run() deleted the
> > > Thread.
> > >
> > > ----
> > >
> > > That's about it.
> > >
> > > The change ran through jdk-submit tests without errors.
> > >
> > > Thank you for reviewing.
> > >
> > > ..Thomas
> > >
> > > [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030516.html


More information about the hotspot-runtime-dev mailing list