From Jie.He at arm.com Tue Feb 25 05:20:25 2020
From: Jie.He at arm.com (Jie He)
Date: Tue, 25 Feb 2020 05:20:25 +0000
Subject: build openjdk with fsanitizer=thread
Message-ID:

Hi

I built openjdk with -fsanitize=thread enabled recently, and got a lot of warnings from tsan even for a helloworld case.

Then I investigated more than 15 of the warnings, and found they can be divided into 3 classes:

1. Benign races; commonly there is a comment to indicate why the access is safe in MP.
2. The runtime atomic implementation; on x86, atomic loads and stores are translated to PlatformLoad/PlatformStore.
3. Runtime functions protected by MutexLocker/ThreadCritical, which is ultimately implemented with pthread_mutex.

For 3, I couldn't understand why tsan doesn't recognize that the code is safe and protected by a lock. The TSAN documentation says the pthread functions are supported.
So I tried to add annotations (ANNOTATE_RWLOCK_ACQUIRED/RELEASED) to mark the lock, and then got a double-locking warning, so it seems tsan already knows that MutexLocker is a lock.

Or is it because one of the conflicting threads lost its stack? In this kind of warning, one of the two threads fails to restore its stack.
That may mean tsan only knows that the thread performs a read/write, but doesn't know that the memory operation is protected by a lock.
Are the threads whose stacks couldn't be restored JIT threads/Java threads? Do I need to fix the tsan symbolizer first for this situation?

Thanks
Jie He

From dvyukov at google.com Tue Feb 25 05:29:55 2020
From: dvyukov at google.com (Dmitry Vyukov)
Date: Tue, 25 Feb 2020 06:29:55 +0100
Subject: build openjdk with fsanitizer=thread
In-Reply-To:
References:
Message-ID:

On Tue, Feb 25, 2020 at 6:20 AM Jie He wrote:
>
> Hi
>
> I built openjdk with -fsanitize=thread enabled recently, and got a lot of warnings from tsan even for a helloworld case.
>
> Then I investigated more than 15 of the warnings, and found they can be divided into 3 classes:
>
> 1.
Benign races; commonly there is a comment to indicate why the access is safe in MP.

+thread-sanitizer mailing list

Hi Jie,

The C++ standard still calls this a data race and renders the behavior of the program undefined. Comments don't fix bugs ;)

> 2. The runtime atomic implementation; on x86, atomic loads and stores are translated to PlatformLoad/PlatformStore.

I assume here PlatformLoad/PlatformStore are implemented as plain loads and stores. These may need to be changed at least in the tsan build (but maybe in all builds; see the previous point). I am not aware of the openjdk portability requirements, but today the __atomic_load_n/__atomic_store_n intrinsics may be a good choice.

> 3. Runtime functions protected by MutexLocker/ThreadCritical, which is ultimately implemented with pthread_mutex.
>
> For 3, I couldn't understand why tsan doesn't recognize that the code is safe and protected by a lock. The TSAN documentation says the pthread functions are supported.
> So I tried to add annotations (ANNOTATE_RWLOCK_ACQUIRED/RELEASED) to mark the lock, and then got a double-locking warning, so it seems tsan already knows that MutexLocker is a lock.
>
> Or is it because one of the conflicting threads lost its stack? In this kind of warning, one of the two threads fails to restore its stack.
> That may mean tsan only knows that the thread performs a read/write, but doesn't know that the memory operation is protected by a lock.
> Are the threads whose stacks couldn't be restored JIT threads/Java threads? Do I need to fix the tsan symbolizer first for this situation?

Yes, tsan understands pthread mutexes natively, no annotations required.

You may try to increase the history_size flag to get the second stack: https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags

"failed to restore stack trace" should not lead to false positives either.

Please post several full tsan reports and links to the corresponding source code.
From Jie.He at arm.com Tue Feb 25 06:19:10 2020
From: Jie.He at arm.com (Jie He)
Date: Tue, 25 Feb 2020 06:19:10 +0000
Subject: build openjdk with fsanitizer=thread
In-Reply-To:
References:
Message-ID:

Hi Dmitry

Yes, so I don't think the first 2 classes of warnings are data races; they are out of the scope of tsan.

Currently, I'm not sure whether the second thread is a JIT thread. But it seems the second thread knows a tsan_read8 happened, at least at the IR level.

Like the following tsan report (I have changed history_size to 4):

WARNING: ThreadSanitizer: data race (pid=9726)
  Write of size 8 at 0x7b1800003ab0 by thread T1:
    #0 ChunkPool::free(Chunk*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 (libjvm.so+0x7e6fe0)
    #1 Chunk::operator delete(void*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 (libjvm.so+0x7e47f4)
    #2 Chunk::chop() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 (libjvm.so+0x7e4994)
    #3 Arena::destruct_contents() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 (libjvm.so+0x7e5274)
    #4 Arena::~Arena() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 (libjvm.so+0x7e52ea)
    #5 ResourceArea::~ResourceArea() /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 (libjvm.so+0xae9d18)
    #6 Thread::~Thread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 (libjvm.so+0x1e2214a)
    #7 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 (libjvm.so+0x1e27af4)
    #8 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 (libjvm.so+0x1e27b3c)
    #9 ThreadsSMRSupport::smr_delete(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 (libjvm.so+0x1e47408)
    #10 JavaThread::smr_delete() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 (libjvm.so+0x1e20e73)
    #11 jni_DetachCurrentThread /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 (libjvm.so+0x137ae7a)
    #12 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x67e9)

  Previous read of size 8 at 0x7b1800003ab0 by thread T14:
    [failed to restore the stack]

  Location is heap block of size 88 at 0x7b1800003a80 allocated by thread T1:
    #0 malloc (java+0x421ee7)
    #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62)
    #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d)
    #3 AllocateHeap(unsigned long, MemoryType, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 (libjvm.so+0x7bfd34)
    #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 (libjvm.so+0x7e6799)
    #5 ChunkPool::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 (libjvm.so+0x7e6799)
    #6 chunkpool_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 (libjvm.so+0x7e4441)
    #7 vm_init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 (libjvm.so+0x11c9b6a)
    #8 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 (libjvm.so+0x1e3108c)
    #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74)
    #10 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f)
    #11 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974)

  Thread T1 (tid=9728, running) created by main thread at:
    #0 pthread_create (java+0x4233d5)
    #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f)

  Thread T14 (tid=9742, running) created by thread T1 at:
    #0 pthread_create (java+0x4233d5)
    #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413)
    #2 WatcherThread::WatcherThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 (libjvm.so+0x1e25399)
    #3 WatcherThread::start() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 (libjvm.so+0x1e2598f)
    #4 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 (libjvm.so+0x1e31bb1)
    #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74)
    #6 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f)
    #7 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974)

SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 in ChunkPool::free(Chunk*)

And the openjdk code is below; you can see there is a ThreadCritical, which is implemented on top of pthread_mutex and does the lock/unlock in its ctor/dtor:

    // Return a chunk to the pool
    void free(Chunk* chunk) {
      assert(chunk->length() + Chunk::aligned_overhead_size() == _size, "bad size");
      ThreadCritical tc;
      _num_used--;
      // Add chunk to list
      chunk->set_next(_first);
92:   _first = chunk;
93:   _num_chunks++;
    }

-----Original Message-----
From: Dmitry Vyukov
Sent: Tuesday, February 25, 2020 1:30 PM
To: Jie He
Cc: tsan-dev at openjdk.java.net; nd ; thread-sanitizer
Subject: Re: build openjdk with
fsanitizer=thread

[snip]
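For reference, the history_size knob mentioned in the reply is set through the TSAN_OPTIONS environment variable at run time; the HelloWorld invocation below is illustrative:

```shell
# history_size accepts values 0..7 (default 2); each step increases the
# per-thread event history tsan keeps, which improves the odds of
# reconstructing the second ("previous read/write") stack in a report.
TSAN_OPTIONS="history_size=7" ./bin/java HelloWorld
```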
From Jie.He at arm.com Tue Feb 25 08:08:33 2020
From: Jie.He at arm.com (Jie He)
Date: Tue, 25 Feb 2020 08:08:33 +0000
Subject: build openjdk with fsanitizer=thread
In-Reply-To:
References:
Message-ID:

Hi Dmitry

Here is another case where a data race is reported between the JavaMain thread and the G1YoungRemSetSampling thread; I believe both of them are C++ threads. But tsan doesn't restore the G1 thread's stack successfully, and seems to consider that there is no lock protecting Mutex's member variable _owner.

See the following tsan report; the code can be found on GitHub at https://github.com/openjdk/jdk/tree/master/src/hotspot/share:

WARNING: ThreadSanitizer: data race (pid=9787)
  Read of size 8 at 0x7b7c00002360 by thread T1:
    #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966)
    #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa)
    #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a)
    #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0)
    #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11)
    #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058)
    #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84)
    #7 Threads::destroy_vm()
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) #10 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) Previous write of size 8 at 0x7b7c00002360 by thread T6: [failed to restore the stack] Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #10 Threads::create_vm(JavaVMInitArgs*, 
bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9789, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T6 (tid=9795, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #9 
Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 in Mutex::owned_by_self() const B.R Jie He -----Original Message----- From: tsan-dev On Behalf Of Jie He Sent: Tuesday, February 25, 2020 2:19 PM To: Dmitry Vyukov Cc: nd ; thread-sanitizer ; tsan-dev at openjdk.java.net Subject: RE: build openjdk with fsanitizer=thread Hi Dmitry Yes, so I don't think the first 2 classes of warnings are data race, they are out of scope of tsan. Currently, I'm not sure if the second thread is JIT thread. But seems the second thread knows a tsan_read8 behavior happened at least in IR level. 
[snip]
From Jie.He at arm.com Tue Feb 25 08:16:20 2020
From: Jie.He at arm.com (Jie He)
Date: Tue, 25 Feb 2020 08:16:20 +0000
Subject: build openjdk with fsanitizer=thread
In-Reply-To:
References:
Message-ID:

Adding one more report for the previous case: a false race between the JavaMain thread and the G1 sampling thread.

WARNING: ThreadSanitizer: data race (pid=9770)
  Read of size 8 at 0x7b7c00002360 by thread T1:
    #0 Mutex::assert_owner(Thread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 (libjvm.so+0x1924838)
    #1 Monitor::notify() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:166:3 (libjvm.so+0x1924c1f)
    #2 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:128:12 (libjvm.so+0x10c50d7)
    #3 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11)
    #4 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058)
    #5 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84)
    #6 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee)
    #7 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb)
    #8 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef)
    #9 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b)

  Previous write of size 8 at 0x7b7c00002360 by thread T6:
    [failed to restore the stack]

  Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1:
    #0 malloc (java+0x421ee7)
    #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&)
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9772, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) 
Thread T6 (tid=9778, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 in Mutex::assert_owner(Thread*) B.R Jie He -----Original Message----- From: Jie He Sent: Tuesday, February 25, 
2020 4:09 PM To: Jie He ; Dmitry Vyukov Cc: nd ; thread-sanitizer ; tsan-dev at openjdk.java.net Subject: RE: build openjdk with fsanitizer=thread Hi Dmitry Here is another case where a data race is reported between the JavaMain thread and the G1YoungRemSetSampling thread; I believe both of them are native C++ threads. But tsan doesn't restore the G1 thread's stack successfully, and it seems to conclude that no lock protects Mutex's member variable _owner. See the following tsan report; the code can be found on GitHub at https://github.com/openjdk/jdk/tree/master/src/hotspot/share: WARNING: ThreadSanitizer: data race (pid=9787) Read of size 8 at 0x7b7c00002360 by thread T1: #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) #10 JavaMain 
/home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) Previous write of size 8 at 0x7b7c00002360 by thread T6: [failed to restore the stack] Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 
(libjvm.so+0x1379d0f) #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9789, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T6 (tid=9795, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #11 JNI_CreateJavaVM 
/home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 in Mutex::owned_by_self() const B.R Jie He -----Original Message----- From: tsan-dev On Behalf Of Jie He Sent: Tuesday, February 25, 2020 2:19 PM To: Dmitry Vyukov Cc: nd ; thread-sanitizer ; tsan-dev at openjdk.java.net Subject: RE: build openjdk with fsanitizer=thread Hi Dmitry Yes, so I don't think the first two classes of warnings are data races; they are outside tsan's scope. Currently I'm not sure whether the second thread is a JIT thread, but it seems the second thread knows that a tsan_read8 happened, at least at the IR level, as in the following tsan report. I have changed history_size to 4: WARNING: ThreadSanitizer: data race (pid=9726) Write of size 8 at 0x7b1800003ab0 by thread T1: #0 ChunkPool::free(Chunk*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 (libjvm.so+0x7e6fe0) #1 Chunk::operator delete(void*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 (libjvm.so+0x7e47f4) #2 Chunk::chop() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 (libjvm.so+0x7e4994) #3 Arena::destruct_contents() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 (libjvm.so+0x7e5274) #4 Arena::~Arena() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 (libjvm.so+0x7e52ea) #5 ResourceArea::~ResourceArea() /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 (libjvm.so+0xae9d18) #6 Thread::~Thread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 (libjvm.so+0x1e2214a) #7 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 (libjvm.so+0x1e27af4) #8 
JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 (libjvm.so+0x1e27b3c) #9 ThreadsSMRSupport::smr_delete(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 (libjvm.so+0x1e47408) #10 JavaThread::smr_delete() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 (libjvm.so+0x1e20e73) #11 jni_DetachCurrentThread /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 (libjvm.so+0x137ae7a) #12 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x67e9) Previous read of size 8 at 0x7b1800003ab0 by thread T14: [failed to restore the stack] Location is heap block of size 88 at 0x7b1800003a80 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 AllocateHeap(unsigned long, MemoryType, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 (libjvm.so+0x7bfd34) #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 (libjvm.so+0x7e6799) #5 ChunkPool::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 (libjvm.so+0x7e6799) #6 chunkpool_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 (libjvm.so+0x7e4441) #7 vm_init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 (libjvm.so+0x11c9b6a) #8 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 
(libjvm.so+0x1e3108c) #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #10 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #11 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9728, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T14 (tid=9742, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 WatcherThread::WatcherThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 (libjvm.so+0x1e25399) #3 WatcherThread::start() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 (libjvm.so+0x1e2598f) #4 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 (libjvm.so+0x1e31bb1) #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #6 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #7 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 in ChunkPool::free(Chunk*)

And the OpenJDK code is below; you can see there is a ThreadCritical, which wraps a pthread_mutex and acquires/releases the lock in its constructor/destructor:

// Return a chunk to the pool
void free(Chunk* chunk) {
  assert(chunk->length() + Chunk::aligned_overhead_size() == _size, "bad size");
  ThreadCritical tc;
  _num_used--;

  // Add chunk to list
  chunk->set_next(_first);
92:  _first = chunk;
93:  _num_chunks++;
}

-----Original Message----- From: Dmitry Vyukov Sent: Tuesday, February 25, 2020 1:30 PM To: Jie He Cc: tsan-dev at openjdk.java.net; nd ; thread-sanitizer Subject: Re: build openjdk with fsanitizer=thread On Tue, Feb 25, 2020 at 6:20 AM Jie He wrote: > > Hi > > I built openjdk with enabling fanitizer=thread recently, and got a lot of warning by tsan even a helloworld case. > > Then I investigated around more than 15 warnings, found they could be divided into 3 classes: > > > 1. Benign races, commonly, there is a comment to indicate why it is safe in MP. +thread-sanitizer mailing list Hi Jie, C++ standard still calls this data race and renders behavior of the program as undefined. Comments don't fix bugs ;) > 2. Runtime atomic implementation, in x86, the atomic load and store will be translated to platformload/store. I assume here platformload/store are implemented as plain loads and stores. These may need to be changed at least in tsan build (but maybe in all builds, because see the previous point). I am not aware of the openjdk portability requirements, but today the __atomic_load/store_n intrinsics may be a good choice. > 3. Runtime function implement protected by MutexLocker/ThreadCritical, which finally implemented by pthread_mutex. > > For 3, I couldn't understand why tsan couldn't recognize that it's safe and protected by a lock. In TSAN document, it said pthread functions are supported. > So I tried to add annotation(ANNOTATION_RWCLOCK_ACQUIRED/RELEASED) to mark, then got the warning, double of lock, it seems tsan knows MutexLocker is a lock. > > Or because one of the conflicting threads lost its stack, in this kind of warning, there is one out of the two threads fails to restore its stack. 
> It may result that tsan only knows the thread calls read/write, but doesn't know the memory operation is protected by a lock. > These threads couldn't restore the stack are JIT threads/Java threads? I need to fix the tsan symbolizer function first for this situation? Yes, tsan understands pthread mutex natively, no annotations required. You may try to increase history_size flag to get the second stack: https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags "failed to restore stack trace" should not lead to false positives either. Please post several full tsan reports and links to the corresponding source code. From aeubanks at google.com Tue Feb 25 18:47:34 2020 From: aeubanks at google.com (Arthur Eubanks) Date: Tue, 25 Feb 2020 10:47:34 -0800 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: It's going to be a lot of work to get TSAN working with the JVM code. There is lots of synchronization done in JVM that's not visible to TSAN, often through OS-specific mechanisms. You'd have to add TSAN callbacks in the code (what we've done in various places in this project) or change Hotspot to use something that TSAN instruments (as Dmitry suggested). And again as Dmitry pointed out, lots of it is technically incorrect C++ code that happens to work on current C++ compilers (in fact we've run into actual production issues with this where we have a custom memcpy that doesn't always do word-atomic moves even when copying an entire word). IMO it's definitely possible to get Hotspot into a state that makes TSAN mostly happy (maybe some branching code paths that only execute when TSAN is turned on), but it'd likely require moving to at least C++11 for portability reasons, then a loooot of investigation into Hotspot's synchronization and TSAN's callbacks. You might be interested in the tsanExternalDecls.hpp file and the corresponding code in LLVM. 
On Tue, Feb 25, 2020 at 12:16 AM Jie He wrote: > Add more report for the previous case, false race between javamain thread > and g1 sampling thread. > > WARNING: ThreadSanitizer: data race (pid=9770) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::assert_owner(Thread*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 > (libjvm.so+0x1924838) > #1 Monitor::notify() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:166:3 > (libjvm.so+0x1924c1f) > #2 G1YoungRemSetSamplingThread::stop_service() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:128:12 > (libjvm.so+0x10c50d7) > #3 ConcurrentGCThread::stop() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 > (libjvm.so+0xc9fa11) > #4 G1CollectedHeap::stop() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 > (libjvm.so+0xfae058) > #5 before_exit(JavaThread*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 > (libjvm.so+0x1215b84) > #6 Threads::destroy_vm() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 > (libjvm.so+0x1e326ee) > #7 jni_DestroyJavaVM_inner(JavaVM_*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 > (libjvm.so+0x137a5eb) > #8 jni_DestroyJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 > (libjvm.so+0x137a3ef) > #9 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 > (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by > thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 > (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, 
MemoryType, NativeCallStack const&, > AllocFailStrategy::AllocFailEnum) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 > (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned long, bool, MemoryType) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 > (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 > (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 > (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 > (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 > (libjvm.so+0x1e88c55) > #8 universe_init() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 > (libjvm.so+0x1e8872b) > #9 init_globals() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 > (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 > (libjvm.so+0x1e312a1) > #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #13 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > Thread T1 (tid=9772, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 > (libjli.so+0xb53f) > > Thread T6 (tid=9778, running) created by 
thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 > (libjvm.so+0x19e4413) > #2 ConcurrentGCThread::create_and_start(ThreadPriority) > /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 > (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 > (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 > (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 > (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 > (libjvm.so+0x1e88c55) > #7 universe_init() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 > (libjvm.so+0x1e8872b) > #8 init_globals() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 > (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 > (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #12 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 > in Mutex::assert_owner(Thread*) > > > B.R > Jie He > > -----Original 
Message----- > From: Jie He > Sent: Tuesday, February 25, 2020 4:09 PM > To: Jie He ; Dmitry Vyukov > Cc: nd ; thread-sanitizer ; > tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > another case that data race exists between javamain thread and > G1YoungRemSetSampling Thread, I believe both of them are C++ function > threads. But tsan doesn't restore G1 thread stack successfully, and seems > to consider there is no lock to protect Mutex's member var _owner. > > See the following reports by tsan, code could be found in github > https://github.com/openjdk/jdk/tree/master/src/hotspot/share: > > WARNING: ThreadSanitizer: data race (pid=9787) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::owned_by_self() const > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 > (libjvm.so+0x1925966) > #1 assert_lock_strong(Mutex const*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 > (libjvm.so+0x19272fa) > #2 MutexLocker::~MutexLocker() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 > (libjvm.so+0x33da8a) > #3 G1YoungRemSetSamplingThread::stop_service() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 > (libjvm.so+0x10c50e0) > #4 ConcurrentGCThread::stop() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 > (libjvm.so+0xc9fa11) > #5 G1CollectedHeap::stop() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 > (libjvm.so+0xfae058) > #6 before_exit(JavaThread*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 > (libjvm.so+0x1215b84) > #7 Threads::destroy_vm() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 > (libjvm.so+0x1e326ee) > #8 jni_DestroyJavaVM_inner(JavaVM_*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 > 
(libjvm.so+0x137a5eb) > #9 jni_DestroyJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 > (libjvm.so+0x137a3ef) > #10 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 > (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by > thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 > (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, > AllocFailStrategy::AllocFailEnum) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 > (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned long, bool, MemoryType) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 > (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 > (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 > (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 > (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 > (libjvm.so+0x1e88c55) > #8 universe_init() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 > (libjvm.so+0x1e8872b) > #9 init_globals() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 > (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 > (libjvm.so+0x1e312a1) > #11 
JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #13 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > Thread T1 (tid=9789, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 > (libjli.so+0xb53f) > > Thread T6 (tid=9795, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 > (libjvm.so+0x19e4413) > #2 ConcurrentGCThread::create_and_start(ThreadPriority) > /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 > (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 > (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 > (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() > /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 > (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 > (libjvm.so+0x1e88c55) > #7 universe_init() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 > (libjvm.so+0x1e8872b) > #8 init_globals() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 > (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) > 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 > (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #12 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 > in Mutex::owned_by_self() const > > > B.R > Jie He > > -----Original Message----- > From: tsan-dev On Behalf Of Jie He > Sent: Tuesday, February 25, 2020 2:19 PM > To: Dmitry Vyukov > Cc: nd ; thread-sanitizer ; > tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > Yes, so I don't think the first 2 classes of warnings are data race, they > are out of scope of tsan. > > Currently, I'm not sure if the second thread is JIT thread. > But seems the second thread knows a tsan_read8 behavior happened at least > in IR level. 
> > like the following tsan reports, I have changed the history_size to 4: > > WARNING: ThreadSanitizer: data race (pid=9726) > Write of size 8 at 0x7b1800003ab0 by thread T1: > #0 ChunkPool::free(Chunk*) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 > (libjvm.so+0x7e6fe0) > #1 Chunk::operator delete(void*) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 > (libjvm.so+0x7e47f4) > #2 Chunk::chop() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 > (libjvm.so+0x7e4994) > #3 Arena::destruct_contents() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 > (libjvm.so+0x7e5274) > #4 Arena::~Arena() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 > (libjvm.so+0x7e52ea) > #5 ResourceArea::~ResourceArea() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 > (libjvm.so+0xae9d18) > #6 Thread::~Thread() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 > (libjvm.so+0x1e2214a) > #7 JavaThread::~JavaThread() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 > (libjvm.so+0x1e27af4) > #8 JavaThread::~JavaThread() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 > (libjvm.so+0x1e27b3c) > #9 ThreadsSMRSupport::smr_delete(JavaThread*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 > (libjvm.so+0x1e47408) > #10 JavaThread::smr_delete() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 > (libjvm.so+0x1e20e73) > #11 jni_DetachCurrentThread > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 > (libjvm.so+0x137ae7a) > #12 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 > (libjli.so+0x67e9) > > Previous read of size 8 at 0x7b1800003ab0 by thread T14: > [failed to restore the stack] > > Location is heap block of size 88 at 
0x7b1800003a80 allocated by thread > T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 > (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, > AllocFailStrategy::AllocFailEnum) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 > (libjvm.so+0x7bfc2d) > #3 AllocateHeap(unsigned long, MemoryType, > AllocFailStrategy::AllocFailEnum) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 > (libjvm.so+0x7bfd34) > #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 > (libjvm.so+0x7e6799) > #5 ChunkPool::initialize() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 > (libjvm.so+0x7e6799) > #6 chunkpool_init() > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 > (libjvm.so+0x7e4441) > #7 vm_init_globals() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 > (libjvm.so+0x11c9b6a) > #8 Threads::create_vm(JavaVMInitArgs*, bool*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 > (libjvm.so+0x1e3108c) > #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #10 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #11 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > Thread T1 (tid=9728, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 > (libjli.so+0xb53f) > > Thread T14 (tid=9742, running) created by thread T1 at: 
> #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) > /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 > (libjvm.so+0x19e4413) > #2 WatcherThread::WatcherThread() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 > (libjvm.so+0x1e25399) > #3 WatcherThread::start() > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 > (libjvm.so+0x1e2598f) > #4 Threads::create_vm(JavaVMInitArgs*, bool*) > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 > (libjvm.so+0x1e31bb1) > #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 > (libjvm.so+0x1379e74) > #6 JNI_CreateJavaVM > /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 > (libjvm.so+0x1379d0f) > #7 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 > (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 in > ChunkPool::free(Chunk*) > > > And the openjdk code is below, you could see there is a ThreadCritical > which derives from pthread_mutex and implements lock/unlock in the > ctor/dtor: > > // Return a chunk to the pool > void free(Chunk* chunk) { > assert(chunk->length() + Chunk::aligned_overhead_size() == _size, > "bad size"); > ThreadCritical tc; > _num_used--; > > // Add chunk to list > chunk->set_next(_first); > 92: _first = chunk; > 93: _num_chunks++; > } > > > -----Original Message----- > From: Dmitry Vyukov > Sent: Tuesday, February 25, 2020 1:30 PM > To: Jie He > Cc: tsan-dev at openjdk.java.net; nd ; thread-sanitizer < > thread-sanitizer at googlegroups.com> > Subject: Re: build openjdk with fsanitizer=thread > > On Tue, Feb 25, 2020 at 6:20 AM Jie He wrote: > > > > Hi > > > > I built openjdk with enabling fanitizer=thread recently, and got a lot > 
of warning by tsan even a helloworld case. > > > > Then I investigated around more than 15 warnings, found they could be > divided into 3 classes: > > > > > > 1. Benign races, commonly, there is a comment to indicate why it is > safe in MP. > > +thread-sanitizer mailing list > > Hi Jie, > > C++ standard still calls this data race and renders behavior of the > program as undefined. Comments don't fix bugs ;) > > > 2. Runtime atomic implementation, in x86, the atomic load and store > will be translated to platformload/store. > > I assume here platformload/store are implemented as plain loads and > stores. These may need to be changed at least in tsan build (but maybe in > all builds, because see the previous point). I am not aware of the openjdk > portability requirements, but today the __atomic_load/store_n intrinsics > may be a good choice. > > > 3. Runtime function implement protected by > MutexLocker/ThreadCritical, which finally implemented by pthread_mutex. > > > > For 3, I couldn't understand why tsan couldn't recognize that it's safe > and protected by a lock. In TSAN document, it said pthread functions are > supported. > > So I tried to add annotation(ANNOTATION_RWCLOCK_ACQUIRED/RELEASED) to > mark, then got the warning, double of lock, it seems tsan knows MutexLocker > is a lock. > > > > Or because one of the conflicting threads lost its stack, in this kind > of warning, there is one out of the two threads fails to restore its stack. > > It may result that tsan only knows the thread calls read/write, but > doesn't know the memory operation is protected by a lock. > > These threads couldn't restore the stack are JIT threads/Java threads? I > need to fix the tsan symbolizer function first for this situation? > > Yes, tsan understands pthread mutex natively, no annotations required. 
> You may try to increase history_size flag to get the second stack:
> https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags
> "failed to restore stack trace" should not lead to false positives either.
>
> Please post several full tsan reports and links to the corresponding
> source code.

From Jie.He at arm.com Wed Feb 26 04:05:07 2020
From: Jie.He at arm.com (Jie He)
Date: Wed, 26 Feb 2020 04:05:07 +0000
Subject: build openjdk with fsanitizer=thread
In-Reply-To: References: Message-ID:

Hi Arthur

Yes, I agree with you, getting openjdk clean under tsan is still far away. There is a lot of work to do. But first I'm just wondering, in the current case, why couldn't tsan restore the stack?

The thread looks like a normal thread created by pthread_create with a thread func thread_native_entry and a pointer parameter 'thread'; then, in thread_native_entry, it launches the real thread body by calling thread->call_run().

The process is clear and entirely in a C++ environment. I'm not familiar with the llvm symbolizer in tsan, but I think this thread should be handled like any other normal thread.

Thanks
Jie He

From: Arthur Eubanks
Sent: Wednesday, February 26, 2020 2:48 AM
To: Jie He
Cc: Dmitry Vyukov ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer
Subject: Re: build openjdk with fsanitizer=thread

It's going to be a lot of work to get TSAN working with the JVM code. There is lots of synchronization done in the JVM that's not visible to TSAN, often through OS-specific mechanisms. You'd have to add TSAN callbacks in the code (what we've done in various places in this project) or change Hotspot to use something that TSAN instruments (as Dmitry suggested). And again, as Dmitry pointed out, lots of it is technically incorrect C++ code that happens to work on current C++ compilers (in fact we've run into actual production issues with this, where we have a custom memcpy that doesn't always do word-atomic moves even when copying an entire word).
IMO it's definitely possible to get Hotspot into a state that makes TSAN mostly happy (maybe some branching code paths that only execute when TSAN is turned on), but it'd likely require moving to at least C++11 for portability reasons, then a loooot of investigation into Hotspot's synchronization and TSAN's callbacks. You might be interested in the tsanExternalDecls.hpp file and the corresponding code in LLVM. On Tue, Feb 25, 2020 at 12:16 AM Jie He > wrote: Add more report for the previous case, false race between javamain thread and g1 sampling thread. WARNING: ThreadSanitizer: data race (pid=9770) Read of size 8 at 0x7b7c00002360 by thread T1: #0 Mutex::assert_owner(Thread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 (libjvm.so+0x1924838) #1 Monitor::notify() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:166:3 (libjvm.so+0x1924c1f) #2 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:128:12 (libjvm.so+0x10c50d7) #3 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) #4 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) #5 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) #6 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) #7 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) #8 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) #9 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) Previous write of size 8 at 0x7b7c00002360 by thread T6: 
[failed to restore the stack] Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9772, running) 
created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T6 (tid=9778, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) SUMMARY: ThreadSanitizer: 
data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 in Mutex::assert_owner(Thread*)

B.R
Jie He

-----Original Message-----
From: Jie He >
Sent: Tuesday, February 25, 2020 4:09 PM
To: Jie He >; Dmitry Vyukov >
Cc: nd >; thread-sanitizer >; tsan-dev at openjdk.java.net
Subject: RE: build openjdk with fsanitizer=thread

Hi Dmitry

Here is another case where a data race exists between the javamain thread and the G1YoungRemSetSampling thread; I believe both of them are C++ function threads. But tsan doesn't restore the G1 thread's stack successfully, and seems to consider that there is no lock protecting Mutex's member var _owner.

See the following reports by tsan; the code can be found on github at https://github.com/openjdk/jdk/tree/master/src/hotspot/share:

WARNING: ThreadSanitizer: data race (pid=9787) Read of size 8 at 0x7b7c00002360 by thread T1: #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) #8 jni_DestroyJavaVM_inner(JavaVM_*)
/home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) #10 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) Previous write of size 8 at 0x7b7c00002360 by thread T6: [failed to restore the stack] Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) #11 
JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9789, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T6 (tid=9795, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) 
#10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974)

SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 in Mutex::owned_by_self() const

B.R
Jie He

-----Original Message-----
From: tsan-dev > On Behalf Of Jie He
Sent: Tuesday, February 25, 2020 2:19 PM
To: Dmitry Vyukov >
Cc: nd >; thread-sanitizer >; tsan-dev at openjdk.java.net
Subject: RE: build openjdk with fsanitizer=thread

Hi Dmitry

Yes, so I don't think the first 2 classes of warnings are data races; they are out of scope of tsan. Currently, I'm not sure whether the second thread is a JIT thread, but it seems the second thread knows that a tsan_read8 happened, at least at the IR level,
like the following tsan reports, I have changed the history_size to 4: WARNING: ThreadSanitizer: data race (pid=9726) Write of size 8 at 0x7b1800003ab0 by thread T1: #0 ChunkPool::free(Chunk*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 (libjvm.so+0x7e6fe0) #1 Chunk::operator delete(void*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 (libjvm.so+0x7e47f4) #2 Chunk::chop() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 (libjvm.so+0x7e4994) #3 Arena::destruct_contents() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 (libjvm.so+0x7e5274) #4 Arena::~Arena() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 (libjvm.so+0x7e52ea) #5 ResourceArea::~ResourceArea() /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 (libjvm.so+0xae9d18) #6 Thread::~Thread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 (libjvm.so+0x1e2214a) #7 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 (libjvm.so+0x1e27af4) #8 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 (libjvm.so+0x1e27b3c) #9 ThreadsSMRSupport::smr_delete(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 (libjvm.so+0x1e47408) #10 JavaThread::smr_delete() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 (libjvm.so+0x1e20e73) #11 jni_DetachCurrentThread /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 (libjvm.so+0x137ae7a) #12 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x67e9) Previous read of size 8 at 0x7b1800003ab0 by thread T14: [failed to restore the stack] Location is heap block of size 88 at 0x7b1800003a80 allocated by thread T1: #0 malloc (java+0x421ee7) #1 os::malloc(unsigned long, 
MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) #3 AllocateHeap(unsigned long, MemoryType, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 (libjvm.so+0x7bfd34) #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 (libjvm.so+0x7e6799) #5 ChunkPool::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 (libjvm.so+0x7e6799) #6 chunkpool_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 (libjvm.so+0x7e4441) #7 vm_init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 (libjvm.so+0x11c9b6a) #8 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 (libjvm.so+0x1e3108c) #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #10 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #11 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) Thread T1 (tid=9728, running) created by main thread at: #0 pthread_create (java+0x4233d5) #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) Thread T14 (tid=9742, running) created by thread T1 at: #0 pthread_create (java+0x4233d5) #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 
(libjvm.so+0x19e4413) #2 WatcherThread::WatcherThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 (libjvm.so+0x1e25399) #3 WatcherThread::start() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 (libjvm.so+0x1e2598f) #4 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 (libjvm.so+0x1e31bb1) #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) #6 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) #7 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974)

SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 in ChunkPool::free(Chunk*)

And the openjdk code is below; you can see there is a ThreadCritical, which is built on a pthread_mutex and locks/unlocks it in its ctor/dtor:

    // Return a chunk to the pool
    void free(Chunk* chunk) {
      assert(chunk->length() + Chunk::aligned_overhead_size() == _size, "bad size");
      ThreadCritical tc;
      _num_used--;

      // Add chunk to list
      chunk->set_next(_first);
92:   _first = chunk;
93:   _num_chunks++;
    }

-----Original Message-----
From: Dmitry Vyukov >
Sent: Tuesday, February 25, 2020 1:30 PM
To: Jie He >
Cc: tsan-dev at openjdk.java.net; nd >; thread-sanitizer >
Subject: Re: build openjdk with fsanitizer=thread

On Tue, Feb 25, 2020 at 6:20 AM Jie He > wrote:
>
> Hi
>
> I built openjdk with enabling fanitizer=thread recently, and got a lot of warning by tsan even a helloworld case.
>
> Then I investigated around more than 15 warnings, found they could be divided into 3 classes:
>
>
> 1. Benign races, commonly, there is a comment to indicate why it is safe in MP.
+thread-sanitizer mailing list Hi Jie, C++ standard still calls this data race and renders behavior of the program as undefined. Comments don't fix bugs ;) > 2. Runtime atomic implementation, in x86, the atomic load and store will be translated to platformload/store. I assume here platformload/store are implemented as plain loads and stores. These may need to be changed at least in tsan build (but maybe in all builds, because see the previous point). I am not aware of the openjdk portability requirements, but today the __atomic_load/store_n intrinsics may be a good choice. > 3. Runtime function implement protected by MutexLocker/ThreadCritical, which finally implemented by pthread_mutex. > > For 3, I couldn't understand why tsan couldn't recognize that it's safe and protected by a lock. In TSAN document, it said pthread functions are supported. > So I tried to add annotation(ANNOTATION_RWCLOCK_ACQUIRED/RELEASED) to mark, then got the warning, double of lock, it seems tsan knows MutexLocker is a lock. > > Or because one of the conflicting threads lost its stack, in this kind of warning, there is one out of the two threads fails to restore its stack. > It may result that tsan only knows the thread calls read/write, but doesn't know the memory operation is protected by a lock. > These threads couldn't restore the stack are JIT threads/Java threads? I need to fix the tsan symbolizer function first for this situation? Yes, tsan understands pthread mutex natively, no annotations required. You may try to increase history_size flag to get the second stack: https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags "failed to restore stack trace" should not lead to false positives either. Please post several full tsan reports and links to the corresponding source code. 
From dvyukov at google.com Wed Feb 26 11:39:05 2020
From: dvyukov at google.com (Dmitry Vyukov)
Date: Wed, 26 Feb 2020 12:39:05 +0100
Subject: build openjdk with fsanitizer=thread
In-Reply-To: References: Message-ID:

Using history_size=7 may help with the [failed to restore the stack] thing. It's unrelated to the way threads are created, and unrelated to llvm-symbolizer. It's just that memorizing all info about every previous memory access that ever happened in the program, with full stacks, is generally impossible; there are implementation limits.

TSan understands lock-free synchronization and atomic operations. Those should not cause false positives and are in scope of tsan checking. For example, it is effective at detecting missed memory barriers (even when checking is done on strongly ordered x86).

Re the race in Mutex/_owner: this looks like a real bug. owned_by_self does an unsynchronized read concurrently with writes by other threads (if the mutex is not owned by the current thread). C++ declares this undefined behavior.

Re the ChunkPool::free race: hard to say without the second stack. Maybe tsan does not understand that ThreadCritical thing. What is it? I can't find the impl. FTR the other sources are here:
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/mutex.cpp
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/arena.cpp

On Wed, Feb 26, 2020 at 5:05 AM Jie He wrote:
>
> Hi Arthur
> Yes, I agree with you, openjdk with tsan is too far away. There is a lot of work to do.
> But firstly I'm just wondering in the current case, why couldn't tsan restore the stack?
>
> The thread looks like a normal thread created by pthread_create with a thread func thread_native_entry and a pointer parameter 'thread',
>
> then in function thread_native_entry, it launches the real thread body by calling thread->call_run().
>
> The process is clear and all in a C++ env. I'm not familiar with the llvm symbolizer in tsan, but I think it should be handled like other normal threads.
> > > > Thanks > > Jie He > > From: Arthur Eubanks > Sent: Wednesday, February 26, 2020 2:48 AM > To: Jie He > Cc: Dmitry Vyukov ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > > > It's going to be a lot of work to get TSAN working with the JVM code. There is lots of synchronization done in JVM that's not visible to TSAN, often through OS-specific mechanisms. You'd have to add TSAN callbacks in the code (what we've done in various places in this project) or change Hotspot to use something that TSAN instruments (as Dmitry suggested). And again as Dmitry pointed out, lots of it is technically incorrect C++ code that happens to work on current C++ compilers (in fact we've run into actual production issues with this where we have a custom memcpy that doesn't always do word-atomic moves even when copying an entire word). > > > > IMO it's definitely possible to get Hotspot into a state that makes TSAN mostly happy (maybe some branching code paths that only execute when TSAN is turned on), but it'd likely require moving to at least C++11 for portability reasons, then a loooot of investigation into Hotspot's synchronization and TSAN's callbacks. You might be interested in the tsanExternalDecls.hpp file and the corresponding code in LLVM. > > > > > > > > On Tue, Feb 25, 2020 at 12:16 AM Jie He wrote: > > Add more report for the previous case, false race between javamain thread and g1 sampling thread. 
> > WARNING: ThreadSanitizer: data race (pid=9770) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::assert_owner(Thread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 (libjvm.so+0x1924838) > #1 Monitor::notify() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:166:3 (libjvm.so+0x1924c1f) > #2 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:128:12 (libjvm.so+0x10c50d7) > #3 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > #4 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > #5 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > #6 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > #7 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > #8 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > #9 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned 
long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > Thread T1 (tid=9772, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T6 (tid=9778, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 
ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #12 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 in Mutex::assert_owner(Thread*) > > > B.R > Jie He > > -----Original Message----- > From: Jie He > Sent: Tuesday, February 25, 2020 4:09 PM > To: Jie He ; Dmitry Vyukov > Cc: nd ; thread-sanitizer ; tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > another case that data race exists 
between javamain thread and G1YoungRemSetSampling Thread, I believe both of them are C++ function threads. But tsan doesn't restore G1 thread stack successfully, and seems to consider there is no lock to protect Mutex's member var _owner. > > See the following reports by tsan, code could be found in github https://github.com/openjdk/jdk/tree/master/src/hotspot/share: > > WARNING: ThreadSanitizer: data race (pid=9787) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > #10 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed 
to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #13 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > 
Thread T1 (tid=9789, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T6 (tid=9795, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #12 InitializeJVM 
/home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 in Mutex::owned_by_self() const > > > B.R > Jie He > > -----Original Message----- > From: tsan-dev On Behalf Of Jie He > Sent: Tuesday, February 25, 2020 2:19 PM > To: Dmitry Vyukov > Cc: nd ; thread-sanitizer ; tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > Yes, so I don't think the first 2 classes of warnings are data race, they are out of scope of tsan. > > Currently, I'm not sure if the second thread is JIT thread. > But seems the second thread knows a tsan_read8 behavior happened at least in IR level. > > like the following tsan reports, I have changed the history_size to 4: > > WARNING: ThreadSanitizer: data race (pid=9726) > Write of size 8 at 0x7b1800003ab0 by thread T1: > #0 ChunkPool::free(Chunk*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 (libjvm.so+0x7e6fe0) > #1 Chunk::operator delete(void*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 (libjvm.so+0x7e47f4) > #2 Chunk::chop() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 (libjvm.so+0x7e4994) > #3 Arena::destruct_contents() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 (libjvm.so+0x7e5274) > #4 Arena::~Arena() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 (libjvm.so+0x7e52ea) > #5 ResourceArea::~ResourceArea() /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 (libjvm.so+0xae9d18) > #6 Thread::~Thread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 (libjvm.so+0x1e2214a) > #7 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 (libjvm.so+0x1e27af4) > #8 JavaThread::~JavaThread() 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 (libjvm.so+0x1e27b3c) > #9 ThreadsSMRSupport::smr_delete(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 (libjvm.so+0x1e47408) > #10 JavaThread::smr_delete() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 (libjvm.so+0x1e20e73) > #11 jni_DetachCurrentThread /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 (libjvm.so+0x137ae7a) > #12 JavaMain /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:560:5 (libjli.so+0x67e9) > > Previous read of size 8 at 0x7b1800003ab0 by thread T14: > [failed to restore the stack] > > Location is heap block of size 88 at 0x7b1800003a80 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 AllocateHeap(unsigned long, MemoryType, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 (libjvm.so+0x7bfd34) > #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 (libjvm.so+0x7e6799) > #5 ChunkPool::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 (libjvm.so+0x7e6799) > #6 chunkpool_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 (libjvm.so+0x7e4441) > #7 vm_init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 (libjvm.so+0x11c9b6a) > #8 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 
(libjvm.so+0x1e3108c) > #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #10 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #11 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > Thread T1 (tid=9728, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T14 (tid=9742, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 WatcherThread::WatcherThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 (libjvm.so+0x1e25399) > #3 WatcherThread::start() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 (libjvm.so+0x1e2598f) > #4 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 (libjvm.so+0x1e31bb1) > #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #6 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #7 InitializeJVM /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java.c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 in ChunkPool::free(Chunk*) > > > And the openjdk code is below, you could see there is a ThreadCritical which derives from pthread_mutex and implements lock/unlock in the ctor/dtor: > > // Return a chunk 
to the pool > void free(Chunk* chunk) { > assert(chunk->length() + Chunk::aligned_overhead_size() == _size, "bad size"); > ThreadCritical tc; > _num_used--; > > // Add chunk to list > chunk->set_next(_first); > 92: _first = chunk; > 93: _num_chunks++; > } > > > -----Original Message----- > From: Dmitry Vyukov > Sent: Tuesday, February 25, 2020 1:30 PM > To: Jie He > Cc: tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Tue, Feb 25, 2020 at 6:20 AM Jie He wrote: > > > > Hi > > > > I built openjdk with enabling fanitizer=thread recently, and got a lot of warning by tsan even a helloworld case. > > > > Then I investigated around more than 15 warnings, found they could be divided into 3 classes: > > > > > > 1. Benign races, commonly, there is a comment to indicate why it is safe in MP. > > +thread-sanitizer mailing list > > Hi Jie, > > C++ standard still calls this data race and renders behavior of the > program as undefined. Comments don't fix bugs ;) > > > 2. Runtime atomic implementation, in x86, the atomic load and store will be translated to platformload/store. > > I assume here platformload/store are implemented as plain loads and stores. These may need to be changed at least in tsan build (but maybe in all builds, because see the previous point). I am not aware of the openjdk portability requirements, but today the __atomic_load/store_n intrinsics may be a good choice. > > > 3. Runtime function implement protected by MutexLocker/ThreadCritical, which finally implemented by pthread_mutex. > > > > For 3, I couldn't understand why tsan couldn't recognize that it's safe and protected by a lock. In TSAN document, it said pthread functions are supported. > > So I tried to add annotation(ANNOTATION_RWCLOCK_ACQUIRED/RELEASED) to mark, then got the warning, double of lock, it seems tsan knows MutexLocker is a lock. 
> >
> > Or because one of the conflicting threads lost its stack, in this kind of warning, there is one out of the two threads fails to restore its stack.
> > It may result that tsan only knows the thread calls read/write, but doesn't know the memory operation is protected by a lock.
> > These threads couldn't restore the stack are JIT threads/Java threads? I need to fix the tsan symbolizer function first for this situation?
>
> Yes, tsan understands pthread mutex natively, no annotations required.
> You may try to increase history_size flag to get the second stack:
> https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags
> "failed to restore stack trace" should not lead to false positives either.
>
> Please post several full tsan reports and links to the corresponding source code.

From Jie.He at arm.com Wed Feb 26 13:46:12 2020
From: Jie.He at arm.com (Jie He)
Date: Wed, 26 Feb 2020 13:46:12 +0000
Subject: build openjdk with fsanitizer=thread
In-Reply-To:
References:
Message-ID:

Is this the case you mentioned, "owned_by_self"?

> #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966)
> #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa)
> #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a)
> #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0)
> #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11)
> #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058)
> #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21
(libjvm.so+0x1215b84)
> #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee)
> #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb)
> #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef)
> #10 JavaMain

It seems Thread::current() is a TLS variable; see the code:

inline Thread* Thread::current_or_null() {
#ifndef USE_LIBRARY_BASED_TLS_ONLY
  return _thr_current;
#else
  if (ThreadLocalStorage::is_initialized()) {
    return ThreadLocalStorage::thread();
  }
  return NULL;
#endif
}

-------------------------------------------------------------------------------------------------------------------------------------------------------

For the ChunkPool case, the ThreadCritical implementation is in threadCritical_linux.cpp (jdk/src/hotspot/os/linux). It seems a recursive lock is implemented there. I searched all the places that access _num_chunks; every one of them is protected by a ThreadCritical.

Finally, I extended the history size with

export TSAN_OPTIONS="suppressions=/home/wave/workspace/jdk_master/tsan.supp halt_on_error=1 history_size=7"

but tsan still couldn't restore the stack for the second thread.

Thanks
Jie He

-----Original Message-----
From: Dmitry Vyukov
Sent: Wednesday, February 26, 2020 7:39 PM
To: Jie He
Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer
Subject: Re: build openjdk with fsanitizer=thread

Using history_size=7 may help with the [failed to restore the stack] issue. It's unrelated to the way threads are created or to llvm-symbolizer. It's just that memorizing all info about every memory access that ever happened in the program, with full stacks, is generally impossible; there are implementation limits.

Tsan understands lock-free synchronization and atomic operations. That should not cause false positives and is in scope of tsan checking.
For example, it is effective at detecting missed memory barriers (even if checking is done on strong x86).

Re the race in Mutex/_owner: looks like a real bug; owned_by_self does an unsynchronized read concurrently with writes by other threads (if the mutex is not owned by the current thread). C++ declares this as undefined behavior.

Re the ChunkPool::free race: hard to say without the second stack. Maybe tsan does not understand that ThreadCritical thing. What is it? I can't find the impl.

FTR the other sources are here:
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/mutex.cpp
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/memory/arena.cpp

On Wed, Feb 26, 2020 at 5:05 AM Jie He wrote:
>
> Hi Arthur
> Yes, I agree with you; openjdk with tsan is still far away. There is a lot of work to do.
> But first I'm just wondering, in the current case, why couldn't tsan restore the stack?
>
> The thread looks like a normal thread created by pthread_create with a
> thread func thread_native_entry and a pointer parameter 'thread',
>
> then in function thread_native_entry, it launches the real thread body by calling thread->call_run().
>
> The process is clear and all in a C++ environment. I'm not familiar with the llvm symbolizer in tsan, but I think it should be handled like any other normal thread.
>
>
>
> Thanks
>
> Jie He
>
> From: Arthur Eubanks
> Sent: Wednesday, February 26, 2020 2:48 AM
> To: Jie He
> Cc: Dmitry Vyukov ; tsan-dev at openjdk.java.net; nd
> ; thread-sanitizer
> Subject: Re: build openjdk with fsanitizer=thread
>
>
>
> It's going to be a lot of work to get TSAN working with the JVM code. There is lots of synchronization done in the JVM that's not visible to TSAN, often through OS-specific mechanisms. You'd have to add TSAN callbacks in the code (what we've done in various places in this project) or change Hotspot to use something that TSAN instruments (as Dmitry suggested).
And again as Dmitry pointed out, lots of it is technically incorrect C++ code that happens to work on current C++ compilers (in fact we've run into actual production issues with this where we have a custom memcpy that doesn't always do word-atomic moves even when copying an entire word). > > > > IMO it's definitely possible to get Hotspot into a state that makes TSAN mostly happy (maybe some branching code paths that only execute when TSAN is turned on), but it'd likely require moving to at least C++11 for portability reasons, then a loooot of investigation into Hotspot's synchronization and TSAN's callbacks. You might be interested in the tsanExternalDecls.hpp file and the corresponding code in LLVM. > > > > > > > > On Tue, Feb 25, 2020 at 12:16 AM Jie He wrote: > > Add more report for the previous case, false race between javamain thread and g1 sampling thread. > > WARNING: ThreadSanitizer: data race (pid=9770) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::assert_owner(Thread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:343:3 (libjvm.so+0x1924838) > #1 Monitor::notify() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:166:3 (libjvm.so+0x1924c1f) > #2 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:128:12 (libjvm.so+0x10c50d7) > #3 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > #4 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > #5 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > #6 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > #7 jni_DestroyJavaVM_inner(JavaVM_*) 
/home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > #8 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > #9 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:560:5 (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 
(libjvm.so+0x1e312a1) > #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #13 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > Thread T1 (tid=9772, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_ > md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T6 (tid=9778, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #12 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:34 > 3:3 in Mutex::assert_owner(Thread*) > > > B.R > Jie He > > -----Original Message----- > From: Jie He > Sent: Tuesday, February 25, 2020 4:09 PM > To: Jie He ; Dmitry Vyukov > Cc: nd ; thread-sanitizer > ; tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > another case that data race exists between javamain thread and G1YoungRemSetSampling Thread, I believe both of them are C++ function threads. But tsan doesn't restore G1 thread stack successfully, and seems to consider there is no lock to protect Mutex's member var _owner. 
> > See the following reports by tsan, code could be found in github https://github.com/openjdk/jdk/tree/master/src/hotspot/share: > > WARNING: ThreadSanitizer: data race (pid=9787) > Read of size 8 at 0x7b7c00002360 by thread T1: > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > #10 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:560:5 (libjli.so+0x681b) > > Previous write of size 8 at 0x7b7c00002360 by thread T6: > [failed to restore the stack] > > Location is heap block of size 3168 at 0x7b7c00001c00 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 Thread::allocate(unsigned long, bool, MemoryType) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:178:43 (libjvm.so+0x1e20a54) > #4 Thread::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.hpp:211:52 (libjvm.so+0x86cf52) > #5 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:32 (libjvm.so+0xfaca14) > #6 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #7 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #8 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #9 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #10 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #11 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #12 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #13 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > Thread T1 (tid=9789, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_ > 
md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T6 (tid=9795, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 ConcurrentGCThread::create_and_start(ThreadPriority) /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:37:7 (libjvm.so+0xc9f7d4) > #3 G1YoungRemSetSamplingThread::G1YoungRemSetSamplingThread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:47:3 (libjvm.so+0x10c49a5) > #4 G1CollectedHeap::initialize_young_gen_sampling_thread() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1642:36 (libjvm.so+0xfaca3a) > #5 G1CollectedHeap::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1823:11 (libjvm.so+0xfadd31) > #6 Universe::initialize_heap() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:719:33 (libjvm.so+0x1e88c55) > #7 universe_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/universe.cpp:653:17 (libjvm.so+0x1e8872b) > #8 init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:117:17 (libjvm.so+0x11c9bc1) > #9 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3882:17 (libjvm.so+0x1e312a1) > #10 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #11 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #12 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:30 > 1:10 in Mutex::owned_by_self() const 
> > > B.R > Jie He > > -----Original Message----- > From: tsan-dev On Behalf Of Jie He > Sent: Tuesday, February 25, 2020 2:19 PM > To: Dmitry Vyukov > Cc: nd ; thread-sanitizer > ; tsan-dev at openjdk.java.net > Subject: RE: build openjdk with fsanitizer=thread > > Hi Dmitry > > Yes, so I don't think the first 2 classes of warnings are data races; they are out of tsan's scope. > > Currently, I'm not sure whether the second thread is a JIT thread. > But it seems the second thread at least knows, at the IR level, that a tsan_read8 happened. > > Like the following tsan report, for which I have changed history_size to 4: > > WARNING: ThreadSanitizer: data race (pid=9726) > Write of size 8 at 0x7b1800003ab0 by thread T1: > #0 ChunkPool::free(Chunk*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93:16 (libjvm.so+0x7e6fe0) > #1 Chunk::operator delete(void*) /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:207:54 (libjvm.so+0x7e47f4) > #2 Chunk::chop() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:225:5 (libjvm.so+0x7e4994) > #3 Arena::destruct_contents() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:319:11 (libjvm.so+0x7e5274) > #4 Arena::~Arena() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:283:3 (libjvm.so+0x7e52ea) > #5 ResourceArea::~ResourceArea() /home/wave/workspace/jdk_master/src/hotspot/share/memory/resourceArea.hpp:44:7 (libjvm.so+0xae9d18) > #6 Thread::~Thread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:449:3 (libjvm.so+0x1e2214a) > #7 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1903:1 (libjvm.so+0x1e27af4) > #8 JavaThread::~JavaThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1856:27 (libjvm.so+0x1e27b3c) > #9 ThreadsSMRSupport::smr_delete(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/threadSMR.cpp:1027:3 (libjvm.so+0x1e47408) > #10 
JavaThread::smr_delete() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:208:5 (libjvm.so+0x1e20e73) > #11 jni_DetachCurrentThread /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4173:11 (libjvm.so+0x137ae7a) > #12 JavaMain > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:560:5 (libjli.so+0x67e9) > > Previous read of size 8 at 0x7b1800003ab0 by thread T14: > [failed to restore the stack] > > Location is heap block of size 88 at 0x7b1800003a80 allocated by thread T1: > #0 malloc (java+0x421ee7) > #1 os::malloc(unsigned long, MemoryType, NativeCallStack const&) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/os.cpp:714:18 (libjvm.so+0x19d7e62) > #2 AllocateHeap(unsigned long, MemoryType, NativeCallStack const&, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:42:21 (libjvm.so+0x7bfc2d) > #3 AllocateHeap(unsigned long, MemoryType, AllocFailStrategy::AllocFailEnum) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.cpp:52:10 (libjvm.so+0x7bfd34) > #4 CHeapObj<(MemoryType)8>::operator new(unsigned long) /home/wave/workspace/jdk_master/src/hotspot/share/memory/allocation.hpp:193:19 (libjvm.so+0x7e6799) > #5 ChunkPool::initialize() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:135 (libjvm.so+0x7e6799) > #6 chunkpool_init() /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:154:3 (libjvm.so+0x7e4441) > #7 vm_init_globals() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/init.cpp:102:3 (libjvm.so+0x11c9b6a) > #8 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:3846:3 (libjvm.so+0x1e3108c) > #9 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #10 JNI_CreateJavaVM 
/home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #11 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > Thread T1 (tid=9728, running) created by main thread at: > #0 pthread_create (java+0x4233d5) > #1 CallJavaMainInNewThread > /home/wave/workspace/jdk_master/src/java.base/unix/native/libjli/java_ > md_solinux.c:754:9 (libjli.so+0xb53f) > > Thread T14 (tid=9742, running) created by thread T1 at: > #0 pthread_create (java+0x4233d5) > #1 os::create_thread(Thread*, os::ThreadType, unsigned long) /home/wave/workspace/jdk_master/src/hotspot/os/linux/os_linux.cpp:926:15 (libjvm.so+0x19e4413) > #2 WatcherThread::WatcherThread() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1375:7 (libjvm.so+0x1e25399) > #3 WatcherThread::start() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:1514:9 (libjvm.so+0x1e2598f) > #4 Threads::create_vm(JavaVMInitArgs*, bool*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4105:7 (libjvm.so+0x1e31bb1) > #5 JNI_CreateJavaVM_inner(JavaVM_**, void**, void*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3852:12 (libjvm.so+0x1379e74) > #6 JNI_CreateJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3935:14 (libjvm.so+0x1379d0f) > #7 InitializeJVM > /home/wave/workspace/jdk_master/src/java.base/share/native/libjli/java > .c:1538:9 (libjli.so+0x6974) > > SUMMARY: ThreadSanitizer: data race > /home/wave/workspace/jdk_master/src/hotspot/share/memory/arena.cpp:93: > 16 in ChunkPool::free(Chunk*) > > > And the openjdk code is below, you could see there is a ThreadCritical which derives from pthread_mutex and implements lock/unlock in the ctor/dtor: > > // Return a chunk to the pool > void free(Chunk* chunk) { > assert(chunk->length() + Chunk::aligned_overhead_size() == _size, "bad size"); > ThreadCritical tc; > _num_used--; > > // Add chunk 
to list > chunk->set_next(_first); > 92: _first = chunk; > 93: _num_chunks++; > } > > > -----Original Message----- > From: Dmitry Vyukov > Sent: Tuesday, February 25, 2020 1:30 PM > To: Jie He > Cc: tsan-dev at openjdk.java.net; nd ; thread-sanitizer > > Subject: Re: build openjdk with fsanitizer=thread > > On Tue, Feb 25, 2020 at 6:20 AM Jie He wrote: > > > > Hi > > > > I built openjdk with enabling fanitizer=thread recently, and got a lot of warning by tsan even a helloworld case. > > > > Then I investigated around more than 15 warnings, found they could be divided into 3 classes: > > > > > > 1. Benign races, commonly, there is a comment to indicate why it is safe in MP. > > +thread-sanitizer mailing list > > Hi Jie, > > C++ standard still calls this data race and renders behavior of the > program as undefined. Comments don't fix bugs ;) > > > 2. Runtime atomic implementation, in x86, the atomic load and store will be translated to platformload/store. > > I assume here platformload/store are implemented as plain loads and stores. These may need to be changed at least in tsan build (but maybe in all builds, because see the previous point). I am not aware of the openjdk portability requirements, but today the __atomic_load/store_n intrinsics may be a good choice. > > > 3. Runtime function implement protected by MutexLocker/ThreadCritical, which finally implemented by pthread_mutex. > > > > For 3, I couldn't understand why tsan couldn't recognize that it's safe and protected by a lock. In TSAN document, it said pthread functions are supported. > > So I tried to add annotation(ANNOTATION_RWCLOCK_ACQUIRED/RELEASED) to mark, then got the warning, double of lock, it seems tsan knows MutexLocker is a lock. > > > > Or because one of the conflicting threads lost its stack, in this kind of warning, there is one out of the two threads fails to restore its stack. 
> > It may result that tsan only knows the thread calls read/write, but doesn't know the memory operation is protected by a lock. > > These threads couldn't restore the stack are JIT threads/Java threads? I need to fix the tsan symbolizer function first for this situation? > > Yes, tsan understands pthread mutex natively, no annotations required. > You may try to increase history_size flag to get the second stack: > https://github.com/google/sanitizers/wiki/ThreadSanitizerFlags > "failed to restore stack trace" should not lead to false positives either. > > Please post several full tsan reports and links to the corresponding source code. From dvyukov at google.com Wed Feb 26 13:55:40 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Wed, 26 Feb 2020 14:55:40 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > Is this case you mentioned "_owned_by_self" > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > #7 Threads::destroy_vm() 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > #10 JavaMain > > Seems thread::current() is a TLS variable, refer to the code: > > inline Thread* Thread::current_or_null() { > #ifndef USE_LIBRARY_BASED_TLS_ONLY > return _thr_current; > #else > if (ThreadLocalStorage::is_initialized()) { > return ThreadLocalStorage::thread(); > } > return NULL; > #endif > } The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From dvyukov at google.com Wed Feb 26 14:01:51 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Wed, 26 Feb 2020 15:01:51 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > Is this case you mentioned "_owned_by_self" > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 
(libjvm.so+0xfae058) > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > #10 JavaMain > > Seems thread::current() is a TLS variable; refer to the code: > > inline Thread* Thread::current_or_null() { > #ifndef USE_LIBRARY_BASED_TLS_ONLY > return _thr_current; > #else > if (ThreadLocalStorage::is_initialized()) { > return ThreadLocalStorage::thread(); > } > return NULL; > #endif > } > > ------------------------------------------------------------------------------------------------------------------------------------------------------- > For the chunk_pool case, > > the ThreadCritical implementation code is in threadCritical_linux.cpp (\jdk\src\hotspot\os\linux). > It seems they implemented a recursive lock here. This should be understood by tsan then. FTR the code is here: https://github.com/openjdk/jdk/blob/master/src/hotspot/os/linux/threadCritical_linux.cpp > I searched the places where _num_chunks is accessed; all of them have a ThreadCritical to protect it. Hard to say without the second stack... Maybe not what we think it is. E.g. unsafe publication of the ChunkPool, or maybe the race is on the Chunk object itself... From Jie.He at arm.com Wed Feb 26 14:11:42 2020 From: Jie.He at arm.com (Jie He) Date: Wed, 26 Feb 2020 14:11:42 +0000 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: Oh, I misunderstood the race object, but _owner also appears to be protected by the member var _lock within class Mutex. 
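The pattern being debated here can be reduced to a few lines. The sketch below is not HotSpot code: the class and all of its names are invented, and std::mutex stands in for the pthread-based PlatformMutex. It shows why an owner field of this kind is a data race under the C++ memory model even though every write to it happens while the lock is held, and how the __atomic_load/store_n intrinsics suggested earlier in the thread express the same idiom without undefined behavior.

```cpp
#include <cassert>
#include <mutex>

// Minimal sketch of owner-tracking in a lock (invented names, not HotSpot
// code). std::mutex stands in for the pthread-based PlatformMutex.
class SketchMutex {
  std::mutex _lock;
  void*      _owner = nullptr;  // which thread currently holds _lock
 public:
  void lock(void* self) {
    _lock.lock();
    // _owner is written only while holding _lock; the relaxed atomic store
    // makes the unsynchronized read in owned_by_self() well-defined.
    __atomic_store_n(&_owner, self, __ATOMIC_RELAXED);
  }
  void unlock() {
    __atomic_store_n(&_owner, static_cast<void*>(nullptr), __ATOMIC_RELAXED);
    _lock.unlock();
  }
  // A plain `return _owner == self;` here is the version tsan reports:
  // the caller may not hold _lock, so that read races with the stores above.
  bool owned_by_self(void* self) const {
    return __atomic_load_n(&_owner, __ATOMIC_RELAXED) == self;
  }
};
```

A thread only ever compares the owner field against itself, which is why the plain-load version looks benign in practice; the relaxed atomic accesses keep exactly that behavior while staying within the C/C++ rules, so the read is no longer a data race.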
-----Original Message----- From: Dmitry Vyukov Sent: Wednesday, February 26, 2020 9:56 PM To: Jie He Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer Subject: Re: build openjdk with fsanitizer=thread On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > Is this case you mentioned "_owned_by_self" > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > #10 JavaMain > > Seems thread::current() is a TLS variable, refer to the code: > > inline Thread* Thread::current_or_null() { #ifndef > USE_LIBRARY_BASED_TLS_ONLY > return _thr_current; > #else > if (ThreadLocalStorage::is_initialized()) { > return ThreadLocalStorage::thread(); > } > return NULL; > #endif > 
} The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From dvyukov at google.com Wed Feb 26 16:12:21 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Wed, 26 Feb 2020 17:12:21 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40c32826/src/hotspot/os/posix/os_posix.inline.hpp Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > Oh, I misunderstand the race object, > but _owner also appears to be protected by the member var _lock within class Mutex. 
> > -----Original Message----- > From: Dmitry Vyukov > Sent: Wednesday, February 26, 2020 9:56 PM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > Is this case you mentioned "_owned_by_self" > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > #10 JavaMain > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > inline Thread* Thread::current_or_null() { #ifndef > > USE_LIBRARY_BASED_TLS_ONLY > > return _thr_current; > > #else > > if (ThreadLocalStorage::is_initialized()) { > > 
return ThreadLocalStorage::thread(); > > } > > return NULL; > > #endif > > } The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From Jie.He at arm.com Thu Feb 27 03:45:05 2020 From: Jie.He at arm.com (Jie He) Date: Thu, 27 Feb 2020 03:45:05 +0000 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: Yes, you remind me: I once added a suppression to the supp file, "call_from_lib:libjvm.so", to try to get a clear report, but it didn't seem to succeed. There are still a lot of reports produced; some threads have stack traces and pthread functions are recognized well, but others are not. Is that correct behavior? I think no reports should be produced if I add this strong suppression. Then I removed the supp item; Mutex now seems to be handled well, but a lot of reports still exist. I will check them and suppress those that are not races. Thank you. -----Original Message----- From: Dmitry Vyukov Sent: Thursday, February 27, 2020 12:12 AM To: Jie He Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer Subject: Re: build openjdk with fsanitizer=thread This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40c32826/src/hotspot/os/posix/os_posix.inline.hpp Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). 
The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > Oh, I misunderstand the race object, > but _owner also appears to be protected by the member var _lock within class Mutex. > > -----Original Message----- > From: Dmitry Vyukov > Sent: Wednesday, February 26, 2020 9:56 PM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > Is this case you mentioned "_owned_by_self" > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) 
> > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > #10 JavaMain > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > inline Thread* Thread::current_or_null() { #ifndef > > USE_LIBRARY_BASED_TLS_ONLY > > return _thr_current; > > #else > > if (ThreadLocalStorage::is_initialized()) { > > return ThreadLocalStorage::thread(); > > } > > return NULL; > > #endif > > } > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From dvyukov at google.com Thu Feb 27 06:21:48 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Thu, 27 Feb 2020 07:21:48 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: On Thu, Feb 27, 2020 at 4:45 AM Jie He wrote: > > Yes, you remind me, I ever added a suppression in supp file "call_from_lib:libjvm.so" to try to get a clear report, > but seems it didn't succeed. There are still a lot of reports produced, some threads have stack trace and recognize pthread function well, but others not, Oh, if the JVM is compiled to libjvm.so, this is not going to work: this literally tells tsan to ignore all pthread interceptors coming from libjvm.so. call_from_lib is meant only for non-instrumented libraries, because compiler instrumentation is not ignored. So what we have with this suppression is that all synchronization (pthread) is ignored but memory accesses are not. It is expected that this will produce tons of nonsensical reports. And even for non-instrumented libraries call_from_lib may lead to false positives if the non-instrumented library synchronizes any instrumented code. call_from_lib is meant for very special contexts; generally, don't use it. 
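For reference, a tsan suppressions file is plain text with one matcher per line; the library name and report pattern below are invented examples, not taken from this thread:

```
# tsan.supp (hypothetical example)
# called_from_lib: only for a library built WITHOUT tsan instrumentation,
# e.g. a closed-source binary blob; never for an instrumented library
# such as libjvm.so.
called_from_lib:libvendorblob.so
# race:<pattern>: temporarily hide a known report while a fix is in progress.
race:ChunkPool::free
```

The file is typically supplied through the environment, e.g. TSAN_OPTIONS="suppressions=tsan.supp history_size=4", before launching the program.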
> is it correct behavior? I think no reports should be produced if I add this strong suppression. Yes. And it's not strong in all respects, only in some. > Then I removed the supp item, seems Mutex could be handled well, but still a lot of reports exist, I will check and suppress them if it's not a race. Tsan is not supposed to produce any false positives according to C/C++ rules (if it is used correctly and understands the synchronization, which should be the case here as the JVM seems to use pthread synchronization primitives). Suppressions are mostly for temporarily suppressing known bugs, for example to make CI green while somebody works on the bug fix. > Thank u. > > -----Original Message----- > From: Dmitry Vyukov > Sent: Thursday, February 27, 2020 12:12 AM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: > https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40c32826/src/hotspot/os/posix/os_posix.inline.hpp > > Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. > Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? > > There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... > The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... 
> > > > > On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > > > Oh, I misunderstand the race object, > > but _owner also appears to be protected by the member var _lock within class Mutex. > > > > -----Original Message----- > > From: Dmitry Vyukov > > Sent: Wednesday, February 26, 2020 9:56 PM > > To: Jie He > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > nd ; thread-sanitizer > > Subject: Re: build openjdk with fsanitizer=thread > > > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > > > Is this case you mentioned "_owned_by_self" > > > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > > #10 JavaMain > > > > > > 
Seems thread::current() is a TLS variable, refer to the code: > > > > > > inline Thread* Thread::current_or_null() { #ifndef > > > USE_LIBRARY_BASED_TLS_ONLY > > > return _thr_current; > > > #else > > > if (ThreadLocalStorage::is_initialized()) { > > > return ThreadLocalStorage::thread(); > > > } > > > return NULL; > > > #endif > > > } > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From Jie.He at arm.com Thu Feb 27 06:56:43 2020 From: Jie.He at arm.com (Jie He) Date: Thu, 27 Feb 2020 06:56:43 +0000 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. You said tsan is not supposed to produce any false positives according to C/C++ rules, but the papers about dynamic data race detection, like DJIT+ and LockSet, all say that false positives are unavoidable, even in pure happens-before mode, because the algorithm doesn't really understand the synchronization semantics. And the paper "ThreadSanitizer: data race detection in practice" also lists some false positive cases, even for a simple producer-consumer model. Thanks again for your explanation and patience. -----Original Message----- From: Dmitry Vyukov Sent: Thursday, February 27, 2020 2:22 PM To: Jie He Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer Subject: Re: build openjdk with fsanitizer=thread On Thu, Feb 27, 2020 at 4:45 AM Jie He wrote: > > Yes, you remind me, I ever added a suppression in supp file > "call_from_lib:libjvm.so" to try to get a clear report, but seems it > didn't succeed. 
There are still a lot of reports produced; some > threads have stack traces and pthread functions are recognized well, but > others not, Oh, if the JVM is compiled to libjvm.so, this is not going to work: this literally tells tsan to ignore all pthread interceptors coming from libjvm.so. call_from_lib is meant only for non-instrumented libraries, because compiler instrumentation is not ignored. So what we get with this suppression is that all synchronization (pthread) is ignored but memory accesses are not. It is expected that this produces tons of nonsensical reports. And even for non-instrumented libraries call_from_lib may lead to false positives if the non-instrumented library synchronizes any instrumented code. call_from_lib is meant for very special contexts; generally, don't use it. > is it correct behavior? I think no reports should be produced if I add this strong suppression. Yes. And it's not strong in all respects, only in some. > Then I removed the supp item; it seems Mutex is handled well, but a lot of reports still exist. I will check and suppress them if they are not races. Tsan is not supposed to produce any false positives according to C/C++ rules (if it is used correctly and understands the synchronization, which should be the case here, as the JVM seems to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs, for example to make CI green while somebody works on the bug fix. > Thank u. > > -----Original Message----- > From: Dmitry Vyukov > Sent: Thursday, February 27, 2020 12:12 AM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > This is puzzling.
It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: > https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40 > c32826/src/hotspot/os/posix/os_posix.inline.hpp > > Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. > Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? > > There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... > The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... > > > > > On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > > > Oh, I misunderstand the race object, but _owner also appears to be > > protected by the member var _lock within class Mutex. 
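The pattern being debated above (a field guarded by a pthread-backed Mutex, which TSan's interceptors understand natively with no annotations) can be sketched outside HotSpot. This is a minimal illustration, not HotSpot code; the names `counter`, `worker`, and `run_counter` are made up for the example:

```cpp
// Minimal sketch: two threads update a shared field under a
// pthread_mutex_t, the synchronization TSan intercepts natively.
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;  // stands in for a lock-protected field

static void* worker(void*) {
  for (int i = 0; i < 1000; i++) {
    pthread_mutex_lock(&lock);    // TSan records the acquire here
    counter++;                    // access is ordered by the lock
    pthread_mutex_unlock(&lock);  // ...and the release here
  }
  return nullptr;
}

long run_counter() {
  counter = 0;
  pthread_t a, b;
  pthread_create(&a, nullptr, worker, nullptr);
  pthread_create(&b, nullptr, worker, nullptr);
  pthread_join(a, nullptr);
  pthread_join(b, nullptr);
  return counter;
}
```

Built with `-fsanitize=thread`, code of this shape should produce no report; if it does, something systematic (like a `called_from_lib` suppression covering the instrumented library) is interfering with the interceptors.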
> > > > -----Original Message----- > > From: Dmitry Vyukov > > Sent: Wednesday, February 26, 2020 9:56 PM > > To: Jie He > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > nd ; thread-sanitizer > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > > > Is this case you mentioned "_owned_by_self" > > > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > > #10 JavaMain > > > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > > > inline Thread* Thread::current_or_null() { #ifndef > > > USE_LIBRARY_BASED_TLS_ONLY > > > return _thr_current; 
> > > #else > > > if (ThreadLocalStorage::is_initialized()) { > > > return ThreadLocalStorage::thread(); > > > } > > > return NULL; > > > #endif > > > } > > > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From dvyukov at google.com Thu Feb 27 08:41:18 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Thu, 27 Feb 2020 09:41:18 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: On Thu, Feb 27, 2020 at 7:56 AM Jie He wrote: > > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > You said tsan is not supposed to produce any false positives according to c/c++ rules, but the papers about dynamic data detection like djit+, lockset, they all said false positive is unavoidable, even using pure happens-before mode because the algorithm doesn't understand the synchronization semantic indeed. And also the paper "ThreadSanitizer - data race detection in practice" lists some false positive cases, even a simple producer-consumer model. > > And thanks your explanation and patience again. Lockset, yes: it inherently does not understand lock-free synchronization and atomic operations. The ThreadSanitizer paper is about a (very) old version of ThreadSanitizer that was Valgrind-based and is long dead. That old TSan could work in two modes: lock-set and happens-before. The lock-set mode indeed had lots of false positives; that's probably what the paper refers to. The happens-before mode was better, but had a limitation related to Valgrind's binary instrumentation -- it's not possible to see/intercept atomic operations at the binary level. So it required annotations for atomic operations. The current TSan is happens-before-based only and uses compiler instrumentation, so it understands std::atomic and the __atomic_ builtins (and C atomics as a consequence, I think). So it does not have these classes of false positives. I lied a bit about the complete absence of false positives: TSan still does not handle stand-alone memory barriers (std::atomic_thread_fence); there is no fundamental limitation, it's just not implemented. It does understand atomic loads and stores with fine-grained memory ordering constraints (acquire/release/relaxed). But what I meant is: don't write off everything you see as a false positive right away and suppress it; then there is little point in applying TSan in the first place. It is capable of finding very tricky race conditions in tricky code, e.g. it's used extensively on V8 and to some degree enabled the switch to a concurrent GC by catching a number of races between the runtime and mutators. > -----Original Message----- > From: Dmitry Vyukov > Sent: Thursday, February 27, 2020 2:22 PM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Thu, Feb 27, 2020 at 4:45 AM Jie He wrote: > > > > Yes, you remind me, I ever added a suppression in supp file > > "call_from_lib:libjvm.so" to try to get a clear report, but seems it > > didn't succeed. There are still a lot of reports produced, some > > threads have stack trace and recognize pthread function well, but > > others not, > > Oh, if the JVM is compiled to libjvm.so, this is not going to work, this literally tells tsan to ignore all pthread interceptors coming from libjvm.so. > call_from_lib is meant only for non-instrumented libraries, because compiler instrumentation is not ignored. So what we have with this suppression is that all synchronization (pthread) is ignored but memory accesses are not.
This is intended that this will produce tons of nonsensical reports. > And even for non-instrumented libraries call_from_lib may lead to false positives if the non-instrumented library synchronizes any of instrumented code. > call_from_lib is meant for very special contexts, generally don't use it. > > > > is it correct behavior? I think no reports should be produced if I add this strong suppression. > > Yes. And it's not strong in all respects, only in some. > > > > Then I removed the supp item, seems Mutex could be handled well, but still a lot of reports exist, I will check and suppress them if it's not a race. > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > > > > Thank u. > > > > -----Original Message----- > > From: Dmitry Vyukov > > Sent: Thursday, February 27, 2020 12:12 AM > > To: Jie He > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > nd ; thread-sanitizer > > Subject: Re: build openjdk with fsanitizer=thread > > > > This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: > > https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f40 > > c32826/src/hotspot/os/posix/os_posix.inline.hpp > > > > Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. > > Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? > > > > There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). 
The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... > > The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... > > > > > > > > > > On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > > > > > Oh, I misunderstand the race object, but _owner also appears to be > > > protected by the member var _lock within class Mutex. > > > > > > -----Original Message----- > > > From: Dmitry Vyukov > > > Sent: Wednesday, February 26, 2020 9:56 PM > > > To: Jie He > > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > > nd ; thread-sanitizer > > > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > > > > > Is this case you mentioned "_owned_by_self" > > > > > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > > 
> #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > > > #10 JavaMain > > > > > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > > > > > inline Thread* Thread::current_or_null() { #ifndef > > > > USE_LIBRARY_BASED_TLS_ONLY > > > > return _thr_current; > > > > #else > > > > if (ThreadLocalStorage::is_initialized()) { > > > > return ThreadLocalStorage::thread(); > > > > } > > > > return NULL; > > > > #endif > > > > } > > > > > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From Jie.He at arm.com Thu Feb 27 09:01:15 2020 From: Jie.He at arm.com (Jie He) Date: Thu, 27 Feb 2020 09:01:15 +0000 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: So the basic understanding of data race detection I got from these papers is that there are few false positives and few real races. Is that still correct for TSAN2? And as I understand it, the false-positive case "message queue" in section 6.4.2 of the old paper is no longer considered a false positive by TSAN2, even when atomic operations are used, right?
thanks -----Original Message----- From: Dmitry Vyukov Sent: Thursday, February 27, 2020 4:41 PM To: Jie He Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer Subject: Re: build openjdk with fsanitizer=thread On Thu, Feb 27, 2020 at 7:56 AM Jie He wrote: > > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > You said tsan is not supposed to produce any false positives according to c/c++ rules, but the papers about dynamic data detection like djit+, lockset, they all said false positive is unavoidable, even using pure happens-before mode because the algorithm doesn't understand the synchronization semantic indeed. And also the paper "ThreadSanitizer ? data race detection in practice" lists some false positive cases, even a simple producer-consumer model. > > And thanks your explanation and patience again. Lockset, yes, it inherently does not understand lock-free synchronization and atomic operations. The ThreadSanitizer paper is about (very) old version of ThreadSanitizer that was valgrind-based and is long dead. That old TSan could work in 2 modes: lock-set and happens-before. Lock-set mode indeed had lots of false positives, that's probably what the paper refers to. The happens-before was better but had a limitation related to Valgrind binary instrumentation -- it's not possible to see/intercept atomic operations on binary level. So it requires annotations for atomic operations. The current TSan is only happens-before-based and is based on compiler instrumentation, so it understands std::atomic and __atomic_ builtins (and C atomics as consequence I think). So it does not have these classes of false positives. 
I lied a bit about the complete absence of false positives. TSan still does not handle stand-alone memory barriers (std::atomic_thread_fence), there are no fundamental limitations but it's not implemented. It still understands atomic loads and stores with fine-grained memory ordering constraints (acquire/release/relaxed). But what I meant is: don't write off everything you see as a false positive right away and suppress it. Then there is also little point in applying TSan in the first place. It is capable of finding very tricky race conditions in tricky code, e.g. it's used extensively on V8 and to some degree enabled switch a concurrent GC by catching a number of races between runtime/mutators. > -----Original Message----- > From: Dmitry Vyukov > Sent: Thursday, February 27, 2020 2:22 PM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Thu, Feb 27, 2020 at 4:45 AM Jie He wrote: > > > > Yes, you remind me, I ever added a suppression in supp file > > "call_from_lib:libjvm.so" to try to get a clear report, but seems it > > didn't succeed. There are still a lot of reports produced, some > > threads have stack trace and recognize pthread function well, but > > others not, > > Oh, if the JVM is compiled to libjvm.so, this is not going to work, this literally tells tsan to ignore all pthread interceptors coming from libjvm.so. > call_from_lib is meant only for non-instrumented libraries, because compiler instrumentation is not ignored. So what we have with this suppression is that all synchronization (pthread) is ignored but memory accesses are not. This is intended that this will produce tons of nonsensical reports. > And even for non-instrumented libraries call_from_lib may lead to false positives if the non-instrumented library synchronizes any of instrumented code. > call_from_lib is meant for very special contexts, generally don't use it. 
> > > > is it correct behavior? I think no reports should be produced if I add this strong suppression. > > Yes. And it's not strong in all respects, only in some. > > > > Then I removed the supp item, seems Mutex could be handled well, but still a lot of reports exist, I will check and suppress them if it's not a race. > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > > > > Thank u. > > > > -----Original Message----- > > From: Dmitry Vyukov > > Sent: Thursday, February 27, 2020 12:12 AM > > To: Jie He > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > nd ; thread-sanitizer > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: > > https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f > > 40 c32826/src/hotspot/os/posix/os_posix.inline.hpp > > > > Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. > > Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? > > > > There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? I don't see any reasonable execution where write in ctor does not race, but write in dtor races... 
> > The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... > > > > > > > > > > On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > > > > > Oh, I misunderstand the race object, but _owner also appears to be > > > protected by the member var _lock within class Mutex. > > > > > > -----Original Message----- > > > From: Dmitry Vyukov > > > Sent: Wednesday, February 26, 2020 9:56 PM > > > To: Jie He > > > Cc: Arthur Eubanks ; > > > tsan-dev at openjdk.java.net; nd ; thread-sanitizer > > > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > > > > > Is this case you mentioned "_owned_by_self" > > > > > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > > > #7 Threads::destroy_vm() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) 
/home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > > > #10 JavaMain > > > > > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > > > > > inline Thread* Thread::current_or_null() { #ifndef > > > > USE_LIBRARY_BASED_TLS_ONLY > > > > return _thr_current; > > > > #else > > > > if (ThreadLocalStorage::is_initialized()) { > > > > return ThreadLocalStorage::thread(); > > > > } > > > > return NULL; > > > > #endif > > > > } > > > > > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race. From dvyukov at google.com Thu Feb 27 09:07:22 2020 From: dvyukov at google.com (Dmitry Vyukov) Date: Thu, 27 Feb 2020 10:07:22 +0100 Subject: build openjdk with fsanitizer=thread In-Reply-To: References: Message-ID: On Thu, Feb 27, 2020 at 10:01 AM Jie He wrote: > > So the basic thinking about data race detection I got from these papers before is that little false positives, little real races, > Is it still correct for TSAN2? We apply tsan extensively to very large code bases (100MLOC) with almost zero annotations and zero false positives. It has found tens of thousands of bugs at least (impossible to keep track of exact numbers). > And as I understand, the false positive case "message queue" in old paper 6.4.2 is not considered as a false positive any more by TSAN2 even using atomic operations, right? Yes, atomic operations, publication, privatization, etc all that works in the new tsan (well, in any happens-before detector). 
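The publication pattern referred to above (the old paper's "message queue" case) can be sketched with std::atomic: the release store and acquire load form exactly the happens-before edge a modern happens-before detector tracks, so the plain read of the payload is not flagged. A minimal sketch, not taken from the paper; `payload`, `ready`, and `publish_and_consume` are invented names:

```cpp
#include <atomic>
#include <thread>

static int payload = 0;                 // plain, non-atomic data
static std::atomic<bool> ready{false};  // publication flag

int publish_and_consume() {
  payload = 0;
  ready.store(false);
  std::thread producer([] {
    payload = 42;                                  // plain write
    ready.store(true, std::memory_order_release);  // publish it
  });
  // The acquire load synchronizes with the release store, so the
  // subsequent plain read of payload is ordered (no data race).
  while (!ready.load(std::memory_order_acquire)) {
  }
  int got = payload;
  producer.join();
  return got;
}
```

An old lock-set analysis would report the unprotected `payload` accesses; a happens-before detector that sees the atomic operations (as compiler-instrumented TSan does) accepts this as correctly synchronized.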
> thanks > > -----Original Message----- > From: Dmitry Vyukov > Sent: Thursday, February 27, 2020 4:41 PM > To: Jie He > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; nd ; thread-sanitizer > Subject: Re: build openjdk with fsanitizer=thread > > On Thu, Feb 27, 2020 at 7:56 AM Jie He wrote: > > > > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > > > You said tsan is not supposed to produce any false positives according to c/c++ rules, but the papers about dynamic data detection like djit+, lockset, they all said false positive is unavoidable, even using pure happens-before mode because the algorithm doesn't understand the synchronization semantic indeed. And also the paper "ThreadSanitizer ? data race detection in practice" lists some false positive cases, even a simple producer-consumer model. > > > > And thanks your explanation and patience again. > > Lockset, yes, it inherently does not understand lock-free synchronization and atomic operations. > The ThreadSanitizer paper is about (very) old version of ThreadSanitizer that was valgrind-based and is long dead. That old TSan could work in 2 modes: lock-set and happens-before. Lock-set mode indeed had lots of false positives, that's probably what the paper refers to. The happens-before was better but had a limitation related to Valgrind binary instrumentation -- it's not possible to see/intercept atomic operations on binary level. So it requires annotations for atomic operations. > The current TSan is only happens-before-based and is based on compiler instrumentation, so it understands std::atomic and __atomic_ builtins (and C atomics as consequence I think). So it does not have these classes of false positives. 
> > I lied a bit about the complete absence of false positives. TSan still does not handle stand-alone memory barriers (std::atomic_thread_fence), there are no fundamental limitations but it's not implemented. It still understands atomic loads and stores with fine-grained memory ordering constraints (acquire/release/relaxed). > But what I meant is: don't write off everything you see as a false positive right away and suppress it. Then there is also little point in applying TSan in the first place. It is capable of finding very tricky race conditions in tricky code, e.g. it's used extensively on > V8 and to some degree enabled switch a concurrent GC by catching a number of races between runtime/mutators. > > > > > > -----Original Message----- > > From: Dmitry Vyukov > > Sent: Thursday, February 27, 2020 2:22 PM > > To: Jie He > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > nd ; thread-sanitizer > > Subject: Re: build openjdk with fsanitizer=thread > > > > On Thu, Feb 27, 2020 at 4:45 AM Jie He wrote: > > > > > > Yes, you remind me, I ever added a suppression in supp file > > > "call_from_lib:libjvm.so" to try to get a clear report, but seems it > > > didn't succeed. There are still a lot of reports produced, some > > > threads have stack trace and recognize pthread function well, but > > > others not, > > > > Oh, if the JVM is compiled to libjvm.so, this is not going to work, this literally tells tsan to ignore all pthread interceptors coming from libjvm.so. > > call_from_lib is meant only for non-instrumented libraries, because compiler instrumentation is not ignored. So what we have with this suppression is that all synchronization (pthread) is ignored but memory accesses are not. This is intended that this will produce tons of nonsensical reports. > > And even for non-instrumented libraries call_from_lib may lead to false positives if the non-instrumented library synchronizes any of instrumented code. 
> > call_from_lib is meant for very special contexts, generally don't use it. > > > > > > > is it correct behavior? I think no reports should be produced if I add this strong suppression. > > > > Yes. And it's not strong in all respects, only in some. > > > > > > > Then I removed the supp item, seems Mutex could be handled well, but still a lot of reports exist, I will check and suppress them if it's not a race. > > > > Tsan is not supposed to produce any false positives according to C/C++ rules (if used correctly and understand synchronization, which should be the case here as JVM seem to use pthread synchronization primitives). Suppressions are mostly to temporarily suppress known bugs for example to make CI green while somebody works on the bug fix. > > > > > > > > > Thank u. > > > > > > -----Original Message----- > > > From: Dmitry Vyukov > > > Sent: Thursday, February 27, 2020 12:12 AM > > > To: Jie He > > > Cc: Arthur Eubanks ; tsan-dev at openjdk.java.net; > > > nd ; thread-sanitizer > > > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > > > This is puzzling. It seems to be protected by the PlatformMutex, which is pthread_mutex_t underneath: > > > https://github.com/openjdk/jdk/blob/0da0333a06aef32ce7af3448dfa38c1f > > > 40 c32826/src/hotspot/os/posix/os_posix.inline.hpp > > > > > > Tsan does understand pthread_mutex_t, that's a very widely used functionality. So there is something fishy going on. > > > Do we know that it intercepts/understands at least some of pthread mutexes? Or there is some systematic problem which results in tsan missing just all of pthread? By any chance don't you have something like "called_from_lib:pthread" in your tsan.supp? > > > > > > There is also something fishy in the reports you posted. First one happens inside of MutexLocker::~MutexLocker(). The question is: why the race wasn't reported in the MutexLocker constructor where we write to owner_ as well? 
I don't see any reasonable execution where write in ctor does not race, but write in dtor races... > > > The same in the second report: the race is in notify method, but we also called lock before which write to owner_. Why that did not race?... > > > > > > > > > > > > > > > On Wed, Feb 26, 2020 at 3:11 PM Jie He wrote: > > > > > > > > Oh, I misunderstand the race object, but _owner also appears to be > > > > protected by the member var _lock within class Mutex. > > > > > > > > -----Original Message----- > > > > From: Dmitry Vyukov > > > > Sent: Wednesday, February 26, 2020 9:56 PM > > > > To: Jie He > > > > Cc: Arthur Eubanks ; > > > > tsan-dev at openjdk.java.net; nd ; thread-sanitizer > > > > > > > > Subject: Re: build openjdk with fsanitizer=thread > > > > > > > > On Wed, Feb 26, 2020 at 2:46 PM Jie He wrote: > > > > > > > > > > Is this case you mentioned "_owned_by_self" > > > > > > > > > > > #0 Mutex::owned_by_self() const /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutex.cpp:301:10 (libjvm.so+0x1925966) > > > > > > #1 assert_lock_strong(Mutex const*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.cpp:187:13 (libjvm.so+0x19272fa) > > > > > > #2 MutexLocker::~MutexLocker() /home/wave/workspace/jdk_master/src/hotspot/share/runtime/mutexLocker.hpp:237:7 (libjvm.so+0x33da8a) > > > > > > #3 G1YoungRemSetSamplingThread::stop_service() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1YoungRemSetSamplingThread.cpp:129:1 (libjvm.so+0x10c50e0) > > > > > > #4 ConcurrentGCThread::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/shared/concurrentGCThread.cpp:65:3 (libjvm.so+0xc9fa11) > > > > > > #5 G1CollectedHeap::stop() /home/wave/workspace/jdk_master/src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1867:31 (libjvm.so+0xfae058) > > > > > > #6 before_exit(JavaThread*) /home/wave/workspace/jdk_master/src/hotspot/share/runtime/java.cpp:461:21 (libjvm.so+0x1215b84) > > > > > > #7 Threads::destroy_vm() 
/home/wave/workspace/jdk_master/src/hotspot/share/runtime/thread.cpp:4418:3 (libjvm.so+0x1e326ee) > > > > > > #8 jni_DestroyJavaVM_inner(JavaVM_*) /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:3989:7 (libjvm.so+0x137a5eb) > > > > > > #9 jni_DestroyJavaVM /home/wave/workspace/jdk_master/src/hotspot/share/prims/jni.cpp:4007:14 (libjvm.so+0x137a3ef) > > > > > > #10 JavaMain > > > > > > > > > > Seems thread::current() is a TLS variable, refer to the code: > > > > > > > > > > inline Thread* Thread::current_or_null() { #ifndef > > > > > USE_LIBRARY_BASED_TLS_ONLY > > > > > return _thr_current; > > > > > #else > > > > > if (ThreadLocalStorage::is_initialized()) { > > > > > return ThreadLocalStorage::thread(); > > > > > } > > > > > return NULL; > > > > > #endif > > > > > } > > > > > > > > The race is on Mutex::owner_ based on the fact that another thread does a write, and on allocation stack, and the race just seems to be visible in the code and is a common race.
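For the reported race on the Mutex owner field, the fix direction suggested earlier in the thread (replacing plain platform loads/stores with the `__atomic` builtins, at least in the tsan build) could look roughly like this. This is a hypothetical sketch, not the HotSpot patch; `g_owner`, `set_owner`, `get_owner`, and `owned_by` are invented names standing in for the real `Mutex::_owner` accessors:

```cpp
// Stand-in for a field like Mutex::_owner that one thread writes while
// another thread reads it (e.g. from owned_by_self()).
static void* g_owner = nullptr;

// Plain load/store here would be a C++-level data race that TSan flags.
// Relaxed __atomic builtins remove the race without adding ordering or
// fences (on x86 they compile to the same plain mov instructions).
void set_owner(void* t) {
  __atomic_store_n(&g_owner, t, __ATOMIC_RELAXED);
}

void* get_owner() {
  return __atomic_load_n(&g_owner, __ATOMIC_RELAXED);
}

bool owned_by(void* t) { return get_owner() == t; }
```

Note that relaxed atomics only make the individual accesses well-defined; any ordering guarantees the surrounding code needs must still come from the lock itself or from stronger memory orders.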