build openjdk with fsanitizer=thread

Dmitry Vyukov dvyukov at google.com
Thu Feb 27 06:21:48 UTC 2020


On Thu, Feb 27, 2020 at 4:45 AM Jie He <Jie.He at arm.com> wrote:
>
> Yes, you remind me, I ever added a suppression in supp file "call_from_lib:libjvm.so" to try to get a clear report,
> but seems it didn't succeed. There are still a lot of reports produced,  some threads have stack trace and recognize pthread function well, but others not,

Oh, if the JVM is compiled to libjvm.so, this is not going to work,
this literally tells tsan to ignore all pthread interceptors coming
from libjvm.so.
call_from_lib is meant only for non-instrumented libraries, because
compiler instrumentation is not ignored. So what we have with this
suppression is that all synchronization (pthread) is ignored but
memory accesses are not. This is intended that this will produce tons
of nonsensical reports.
And even for non-instrumented libraries call_from_lib may lead to
false positives if the non-instrumented library synchronizes any of
instrumented code.
call_from_lib is meant for very special contexts, generally don't use it.


> is it correct behavior? I think no reports should be produced if I add this strong suppression.

Yes. And it's not strong in all respects, only in some.


> Then I removed the supp item,  seems Mutex could be handled well, but still a lot of reports exist, I will check and suppress them if it's not a race.

Tsan is not supposed to produce any false positives according to C/C++
rules (if used correctly and understand synchronization, which should
be the case here as JVM seem to use pthread synchronization
primitives). Suppressions are mostly to temporarily suppress known
bugs for example to make CI green while somebody works on the bug fix.



> Thank u.
>
> -----Original Message-----
> From: Dmitry Vyukov <dvyukov at google.com>
> Sent: Thursday, February 27, 2020 12:12 AM
> To: Jie He <Jie.He at arm.com>
> Cc: Arthur Eubanks <aeubanks at google.com>; tsan-dev at openjdk.java.net; nd <nd at arm.com>; thread-sanitizer <thread-sanitizer at googlegroups.com>
> Subject: Re: build openjdk with fsanitizer=thread
>
> This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath:
> https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40c32826/src/hotspot/os/posix/os_posix.inline.hpp
>
> Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on.
> Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp?
>
> There is also something fishy in the reports  you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races...
> The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?...
>
>
>
>
> On Wed, Feb 26, 2020 at 3:11 PM Jie He <Jie.He at arm.com> wrote:
> >
> > Oh, I misunderstand the race object,
> > but _owner also appears to be protected by the member var _lock within class Mutex.
> >
> > -----Original Message-----
> > From: Dmitry Vyukov <dvyukov at google.com>
> > Sent: Wednesday, February 26, 2020 9:56 PM
> > To: Jie He <Jie.He at arm.com>
> > Cc: Arthur Eubanks <aeubanks at google.com>; tsan-dev at openjdk.java.net;
> > nd <nd at arm.com>; thread-sanitizer <thread-sanitizer at googlegroups.com>
> > Subject: Re: build openjdk with fsanitizer=thread
> >
> > On Wed, Feb 26, 2020 at 2:46 PM Jie He <Jie.He at arm.com> wrote:
> > >
> > > Is this case you mentioned "_owned_by_self"
> > >
> > > >     #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966)
> > > >     #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa)
> > > >     #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a)
> > > >     #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0)
> > > >     #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11)
> > > >     #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058)
> > > >     #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84)
> > > >     #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee)
> > > >     #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb)
> > > >     #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef)
> > > >     #10 JavaMain
> > >
> > > Seems thread::current() is a TLS variable, refer to the code:
> > >
> > > inline Thread* Thread::current_or_null() { #ifndef
> > > USE_LIBRARY_BASED_TLS_ONLY
> > >   return _thr_current;
> > > #else
> > >   if (ThreadLocalStorage::is_initialized()) {
> > >     return ThreadLocalStorage::thread();
> > >   }
> > >   return NULL;
> > > #endif
> > > }
> >
> > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race.


More information about the tsan-dev mailing list