RFR: 8241296: Segfault in JNIHandleBlock::oops_do()
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Mar 20 08:35:01 UTC 2020
Hi Andrew,
Thanks for clarifying where and why this failed!
StefanK
On 2020-03-19 17:47, Andrew Haley wrote:
> Hi,
>
> On 3/19/20 3:22 PM, Stefan Karlsson wrote:
>
>> I think the fix is fine.
> OK, thanks.
>
>> Would you mind sharing some extra info? For example the stack trace
>> of the scanned thread, and / or flags used to provoke this? I would
>> like to know why we haven't seen this before.
> Sure.
>
> #0 0x00007ffff7dafb02 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib
> #1 0x00007ffff77533fb in os::PlatformEvent::park (this=0x7ffff0ab690
> #2 0x00007ffff7706805 in ParkCommon (timo=0, ev=0x7ffff0ab6900)
> #3 Monitor::ILock (this=this@entry=0x7ffff0005b30, Self=Self@entry=0
> #4 0x00007ffff7706ffa in Monitor::lock_without_safepoint_check (Self
> #5 Monitor::lock_without_safepoint_check (this=0x7ffff0005b30)
> #6 0x00007ffff77e7f71 in SafepointSynchronize::block (thread=0x7ffff
> #7 0x00007ffff77e6afa in SafepointSynchronize::block (thread=thread@
> #8 0x00007ffff78fd897 in ThreadStateTransition::transition_and_fence
> #9 JavaThread::run (this=0x7ffff0ab5800)
> #10 0x00007ffff7747d78 in java_start (thread=0x7ffff0ab5800)
> #11 0x00007ffff7da9472 in start_thread () from /lib64/libpthread.so.0
> #12 0x00007ffff7ee5063 in clone () from /lib64/libc.so.6
>
> The thread blocked in transition_and_fence() here: note this is in JDK
> 8, but it hasn't changed AFAICS:
>
> // The first routine called by a new Java thread
> void JavaThread::run() {
> // initialize thread-local alloc buffer related fields
> this->initialize_tlab();
>
> // used to test validity of stack trace backs
> this->record_base_of_stack_pointer();
>
> // Record real stack base and size.
> this->record_stack_base_and_size();
>
> // Initialize thread local storage; set before calling MutexLocker
> this->initialize_thread_local_storage();
>
> this->create_stack_guard_pages();
>
> this->cache_global_variables();
>
> // Thread is now sufficient initialized to be handled by the safepoint code as being
> // in the VM. Change thread state from _thread_new to _thread_in_vm
> =>ThreadStateTransition::transition_and_fence(this, _thread_new, _thread_in_vm);
>
> assert(JavaThread::current() == this, "sanity check");
> assert(!Thread::current()->owns_locks(), "sanity check");
>
> DTRACE_THREAD_PROBE(start, this);
>
> // This operation might block. We call that after all safepoint checks for a new thread has
> // been completed.
> this->set_active_handles(JNIHandleBlock::allocate_block());
>
> So it's pretty obvious why active_handles wasn't set yet. This code
> isn't obviously different from that in jdk/jdk, but I have not been
> able to reproduce the bug there. IMO, though, it's still a bug in
> jdk/jdk.
>
> The most likely reason we haven't seen this before is that
> JNIHandleBlock::oops_do() looks like this:
>
> void JNIHandleBlock::oops_do(OopClosure* f) {
> JNIHandleBlock* current_chain = this;
> while (current_chain != NULL) {
> ...
> }
> }
>
> A sufficiently adversarial compiler can turn this into
>
> void JNIHandleBlock::oops_do(OopClosure* f) {
> JNIHandleBlock* current_chain = this;
> do {
> ...
> } while (current_chain != NULL);
> }
>
> because "this" can never be null in a member function. GCC sometimes
> does this transformation.
>
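To illustrate the point about "this": since a member function may assume a non-null "this", any null test has to happen at the call site, before the member call is made. Below is a minimal, self-contained sketch of that pattern. HandleBlock, ThreadLike, visit_all, visit_handles and count_handles are hypothetical stand-ins invented for this example; they are not HotSpot code and not necessarily what the actual patch looks like.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a chained handle block (not the real JNIHandleBlock).
struct HandleBlock {
    static constexpr int kCapacity = 4;
    int slots[kCapacity] = {};
    int top = 0;                    // number of slots in use
    HandleBlock* next = nullptr;    // next block in the chain

    // Member function: the compiler may assume 'this' is non-null, so the
    // first iteration's null test can legally be dropped. Callers must
    // never invoke this through a null pointer.
    template <typename Visitor>
    void visit_all(Visitor&& f) {
        for (HandleBlock* b = this; b != nullptr; b = b->next) {
            for (int i = 0; i < b->top; ++i) {
                f(b->slots[i]);
            }
        }
    }
};

// Hypothetical stand-in for a thread whose handle block may not be set yet.
struct ThreadLike {
    HandleBlock* active_handles = nullptr;  // still null for a brand-new thread

    // The null test lives here, at the call site, where the pointer is an
    // ordinary value and the check cannot be optimized away.
    template <typename Visitor>
    void visit_handles(Visitor&& f) {
        if (active_handles != nullptr) {
            active_handles->visit_all(f);
        }
    }
};

// Counts the handles reachable from a thread; returns 0 if none are set up.
int count_handles(ThreadLike& t) {
    int n = 0;
    t.visit_handles([&n](int&) { ++n; });
    return n;
}
```

With this shape, scanning a thread that has not yet allocated its handle block is a no-op rather than a member call on a null pointer, regardless of how the compiler rewrites the loop inside visit_all.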