RFR(s): 8212173: Thread._stack_base/_stack_size initialized too late for new threads

Thomas Stüfe thomas.stuefe at gmail.com
Thu Oct 25 18:01:35 UTC 2018


Ping...

May I please have  a second review?

Thanks!
On Mon, Oct 22, 2018 at 6:56 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>
> Dear all,
>
> may I have please reviews for the following fix:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8212173
> cr: http://cr.openjdk.java.net/~stuefe/webrevs/8212173-thread-stack-and-base-initialized-too-late/webrev.02/webrev/
>
> See also prior mail discussions: [1]
>
> Wwhen iterating the thread list, threads may be visible which have
> stack base/size not yet set. Calling stack_base() will yield NULL,
> stack_size() 0 or it will assert. The reason for this is that there
> exist a race: Thread::_stack_base and ::_stack_size are initialized
> too late, when the Thread has already been added to the threads list.
>
> My solution to this problem is basically to move the initialization of
> Thread::_stack_base and ::_stack_size to an earlier point at which the
> thread has not yet been added to the threads list.
>
> Unfortunately, this is more difficult that it sounds. Retrieving
> thread stack base and size needs to be done from within the thread in
> question, and only on Linux and BSD we have the opportunity to execute
> code in the new child thread before its Thread object is added to the
> thread list. So, my patch solves the problem completely only on Linux
> and BSD. On Windows, Aix and Solaris it only minimizes the chance for
> it to happen.
>
> For more details, please see bug description and the discussion in [1].
>
> ----
>
> My changes in detail:
>
> - Thread::record_stack_base_and_size(): I removed all code which needs
> the surrounding Thread object being initialized, or which needs
> Thread::current() to be set. That makes the function safe to be called
> from the very start of the child thread's life (start of
> thread_native_entry()), where nothing is initialized yet.
>
> - I moved the call to Thread::record_stack_base_and_size() out of all
> <ChildThreadClass>::run() functions into the very start of the
> thread_native_entry() on every platform.
>
> - The parts I moved out of Thread::record_stack_base_and_size() are:
> NMT thread stack initialization and Logging. I needed to find a new
> place for them. I wanted them to run at the end of the child thread
> initialization, when the child Thread is already initialized and
> Thread::current is initialized too. So I moved them to just before
> <ChildThreadCLass>::run() is called.
>
> -  To prevent code duplication, I changed:
>
> public virtual Thread::run()
>
> to
>
> public (nonvirtual) Thread::call_run();
> protected virtual Thread::run() = 0;
>
> Thread::call_run() now is the place to put common,
> platform-independent and <ChildThreadClass>-independent
> initializations. There I put the NMT stuff and logging. After
> <ChildThreadClass>::run() returns, I put common cleanup checks.
>
> Other changes:
>
> - We had a function os::initialize_thread() which only existed for the
> sole benefit of Solaris, where if gives the platform a stab at
> correcting the stack base, size for a certain corner case when we run
> on primordial thread. On all other platform-cpu-combinations, this
> function was empty. I removed all empty implementations and renamed
> the function to the much more fitting, Solaris only
> "os::Solaris::correct_stack_boundaries_for_primordial_thread(Thread*
> thr)".
>
> - When <ChildClassThread>::run() returns, the Thread object may have
> been deleted by the child class run() function. To avoid errors I set
> the thread pointer to zero (where possible) to prevent accessing it to
> reference members.
>
> - Since Thread::record_stack_base_and_size() does not register the
> thread stack with NMT anymore, I needed to fix those places where the
> former is called for attached threads and add that NMT registration
> manually.
>
> - I made Thread::clear_thread_current() a static function since it
> needs no members of Thread (unlike Thread::initialize_thread_current()
> which needs the this pointer to hook it into TLS). That makes it
> possible to use clear_thread_current() without having a valid Thread
> object, which is the case if <ChildClassThread>::run() deleted the
> Thread.
>
> ----
>
> That's about it.
>
> The change ran through jdk-submit tests without errors.
>
> Thank you for reviewing.
>
> ..Thomas
>
> [1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-October/030516.html


More information about the hotspot-runtime-dev mailing list