RFR(s): 8212173: Thread._stack_base/_stack_size initialized too late for new threads
David Holmes
david.holmes at oracle.com
Tue Oct 23 02:07:40 UTC 2018
Hi Thomas,
This looks good to me - thanks. Setting the stack base and size first up
is more robust all round and as you note happens before the adding to
the threads-list in some cases.
Getting rid of the solaris-specific os::initialize_thread is a good
clean up!
I've run this through our internal tier 1-3 testing again and it all passed.
Thanks,
David
On 23/10/2018 2:56 AM, Thomas Stüfe wrote:
> Dear all,
>
> may I have please reviews for the following fix:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8212173
> cr: http://cr.openjdk.java.net/~stuefe/webrevs/8212173-thread-stack-and-base-initialized-too-late/webrev.02/webrev/
>
> See also prior mail discussions: [1]
>
> Wwhen iterating the thread list, threads may be visible which have
> stack base/size not yet set. Calling stack_base() will yield NULL,
> stack_size() 0 or it will assert. The reason for this is that there
> exist a race: Thread::_stack_base and ::_stack_size are initialized
> too late, when the Thread has already been added to the threads list.
>
> My solution to this problem is basically to move the initialization of
> Thread::_stack_base and ::_stack_size to an earlier point at which the
> thread has not yet been added to the threads list.
>
> Unfortunately, this is more difficult that it sounds. Retrieving
> thread stack base and size needs to be done from within the thread in
> question, and only on Linux and BSD we have the opportunity to execute
> code in the new child thread before its Thread object is added to the
> thread list. So, my patch solves the problem completely only on Linux
> and BSD. On Windows, Aix and Solaris it only minimizes the chance for
> it to happen.
>
> For more details, please see bug description and the discussion in [1].
>
> ----
>
> My changes in detail:
>
> - Thread::record_stack_base_and_size(): I removed all code which needs
> the surrounding Thread object being initialized, or which needs
> Thread::current() to be set. That makes the function safe to be called
> from the very start of the child thread's life (start of
> thread_native_entry()), where nothing is initialized yet.
>
> - I moved the call to Thread::record_stack_base_and_size() out of all
> <ChildThreadClass>::run() functions into the very start of the
> thread_native_entry() on every platform.
>
> - The parts I moved out of Thread::record_stack_base_and_size() are:
> NMT thread stack initialization and Logging. I needed to find a new
> place for them. I wanted them to run at the end of the child thread
> initialization, when the child Thread is already initialized and
> Thread::current is initialized too. So I moved them to just before
> <ChildThreadCLass>::run() is called.
>
> - To prevent code duplication, I changed:
>
> public virtual Thread::run()
>
> to
>
> public (nonvirtual) Thread::call_run();
> protected virtual Thread::run() = 0;
>
> Thread::call_run() now is the place to put common,
> platform-independent and <ChildThreadClass>-independent
> initializations. There I put the NMT stuff and logging. After
> <ChildThreadClass>::run() returns, I put common cleanup checks.
>
> Other changes:
>
> - We had a function os::initialize_thread() which only existed for the
> sole benefit of Solaris, where if gives the platform a stab at
> correcting the stack base, size for a certain corner case when we run
> on primordial thread. On all other platform-cpu-combinations, this
> function was empty. I removed all empty implementations and renamed
> the function to the much more fitting, Solaris only
> "os::Solaris::correct_stack_boundaries_for_primordial_thread(Thread*
> thr)".
>
> - When <ChildClassThread>::run() returns, the Thread object may have
> been deleted by the child class run() function. To avoid errors I set
> the thread pointer to zero (where possible) to prevent accessing it to
> reference members.
>
> - Since Thread::record_stack_base_and_size() does not register the
> thread stack with NMT anymore, I needed to fix those places where the
> former is called for attached threads and add that NMT registration
> manually.
>
> - I made Thread::clear_thread_current() a static function since it
> needs no members of Thread (unlike Thread::initialize_thread_current()
> which needs the this pointer to hook it into TLS). That makes it
> possible to use clear_thread_current() without having a valid Thread
> object, which is the case if <ChildClassThread>::run() deleted the
> Thread.
>
> ----
>
> That's about it.
>
> The change ran through jdk-submit tests without errors.
>
> Thank you for reviewing.
>
> ..Thomas
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030516.html
>
More information about the hotspot-runtime-dev
mailing list