RFR: 8266963: Reentrance condition for safepoint/handshake

Patricio Chilano Mateo patricio.chilano.mateo at oracle.com
Mon May 17 14:57:13 UTC 2021


On 5/15/21 5:04 AM, Yude Lin wrote:
> On Fri, 14 May 2021 19:49:10 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:
>
>> We cannot do a return here because a safepoint could be already in progress after transitioning out of the blocked state. The handshake would then execute concurrently with the safepoint operation which is not allowed.
> I didn't get this part. Being able to return means that we are already in another enclosing SafepointMechanism::process_if_requested_slow(). This enclosing SafepointMechanism::process_if_requested_slow() should make sure we are processing the handshake safely, right?
The issue is that inside the handshake closure you would transition to 
the blocked state in ThreadBlockInVM(), which allows a safepoint to 
proceed. If in ~ThreadBlockInVM() we don't stop for the safepoint and 
simply return from SafepointMechanism::process_if_requested_slow(), you 
would have a safepoint and a handshake executing at the same time.
If we want to keep the ThreadBlockInVM in 
GenerateOopMap::do_interpretation() we need to either avoid calling it 
while inside a handshake closure (by moving the check further up, as you 
try to do in your other version), or move the check further down, to 
after honoring the safepoint in SafepointSynchronize::block() (as with 
the old flag we used to have). The latter implies answering the question 
of whether it is even safe to allow a safepoint in the middle of a 
handshake and then resume the handshake, which again I don't think it 
is. Avoiding polling is a straightforward solution for this issue and is 
in line with the NoSafepointVerifier checks that we use in 
handshake.cpp.
Or, as Robbin pointed out, we could back out 8262443 altogether.  : )
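
To make the two outcomes concrete, here is a toy model of the scenario 
above. It is only a sketch and not HotSpot code: ToyThread, Policy, 
toy_poll(), toy_closure() and run_handshake() are hypothetical names 
invented for illustration, and the "safepoint" is reduced to a flag. It 
assumes a safepoint is requested exactly while the thread is blocked.

```cpp
#include <cassert>

// Toy model only. None of these names exist in HotSpot.
struct ToyThread {
    bool in_handshake = false;       // currently running a handshake closure
    bool safepoint_pending = false;  // a safepoint began while we were blocked
    bool overlap_seen = false;       // handshake kept running during a safepoint
    int  polls_inside_handshake = 0; // polls reached from inside the closure
};

enum class Policy {
    EarlyReturn,        // poll is reached reentrantly and just returns
    NoPollInHandshake   // closure never enters the blocked state, never polls
};

// Stand-in for the slow-path poll reached when leaving the blocked state.
void toy_poll(ToyThread& t) {
    if (t.in_handshake) {
        ++t.polls_inside_handshake;
        // Reentrant call: returning here does not honor the pending
        // safepoint, so the closure resumes while the safepoint runs.
        if (t.safepoint_pending) {
            t.overlap_seen = true;
        }
        return;
    }
    // Outside a handshake the thread would stop here until the safepoint
    // ends; the toy model just clears the flag.
    t.safepoint_pending = false;
}

// Stand-in for a handshake closure that reaches the yielding loop.
void toy_closure(ToyThread& t, Policy p) {
    if (p == Policy::NoPollInHandshake && t.in_handshake) {
        return;  // check moved "further up": no blocked transition, no poll
    }
    t.safepoint_pending = true;  // blocked state lets a safepoint proceed
    toy_poll(t);                 // leaving the blocked state polls
}

// Runs one handshake under the given policy and reports what happened.
ToyThread run_handshake(Policy p) {
    ToyThread t;
    t.in_handshake = true;
    toy_closure(t, p);
    t.in_handshake = false;
    return t;
}
```

In this model, run_handshake(Policy::EarlyReturn) ends with overlap_seen 
set, while run_handshake(Policy::NoPollInHandshake) never polls and never 
overlaps. The third option (stopping for the safepoint and then resuming 
the handshake) is deliberately left out, since whether that is safe is 
exactly the open question.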

> It's indeed a release build. But the fastdebug build miraculously runs without any hang or crash. Maybe it took a different path.
>
> While I was figuring out why the debug build won't crash, I found that the condition `i != 0` in do_interpretation (if I understand correctly, it's just a spin count) could be hiding the reentrance problem. If I make the change
>
> diff --git a/src/hotspot/share/oops/generateOopMap.cpp b/src/hotspot/share/oops/generateOopMap.cpp
> index 06ae6b0dbaf..8048aa92fc6 100644
> --- a/src/hotspot/share/oops/generateOopMap.cpp
> +++ b/src/hotspot/share/oops/generateOopMap.cpp
> @@ -911,7 +911,7 @@ void GenerateOopMap::do_interpretation(Thread* thread)
>   {
>     int i = 0;
>     do {
> -    if (i != 0 && thread->is_Java_thread()) {
> +    if (thread->is_Java_thread()) {
>         JavaThread* jt = thread->as_Java_thread();
>         if (jt->thread_state() == _thread_in_vm) {
>           // Since this JavaThread has looped at least once and is _thread_in_vm,
>
>
>
> I get
>
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  Internal Error (/home/yude.lyd/jdk-master/src/hotspot/share/runtime/mutex.cpp:407), pid=122250, tid=123348
> #  assert(false) failed: Attempting to acquire lock tty_lock/3 out of order with lock stack_watermark_lock/2 -- possible deadlock
> #
> # JRE version: OpenJDK Runtime Environment (17.0) (fastdebug build 17-internal+0-adhoc.yudelyd.jdk-master)
> # Java VM: OpenJDK 64-Bit Server VM (fastdebug 17-internal+0-adhoc.yudelyd.jdk-master, mixed mode, sharing, compressed oops, compressed class ptrs, shenandoah gc, linux-amd64)
> # Problematic frame:
> # V  [libjvm.so+0x1485e50]  Mutex::check_rank(Thread*)+0x120
> #
> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # If you would like to submit a bug report, please visit:
> #   https://bugreport.java.com/bugreport/crash.jsp
> #
Yes, this is a similar issue. We also cannot poll for safepoints while 
processing a watermark. The rank check failure is just a secondary 
issue, caused by trying to grab tty_lock while crashing. So backing out 
8262443 would solve that too.
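
For readers unfamiliar with the assert in the log, here is a rough 
sketch of what a lock rank check does. It is not the code in mutex.cpp: 
ToyLock and ToyLockStack are hypothetical, and the rule used (a new lock 
must have a strictly lower rank than every lock already held) is an 
assumption chosen to match the message above, where acquiring tty_lock/3 
while holding stack_watermark_lock/2 is rejected.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only. These types are not part of HotSpot.
struct ToyLock {
    std::string name;
    int rank;
};

class ToyLockStack {
    std::vector<ToyLock> held_;  // locks currently held by one thread
public:
    // Returns true and records the lock if the acquisition respects the
    // assumed ordering rule; returns false where the real VM would assert.
    bool try_acquire(const ToyLock& lock) {
        for (const ToyLock& h : held_) {
            if (lock.rank >= h.rank) {
                return false;  // out of order: possible deadlock
            }
        }
        held_.push_back(lock);
        return true;
    }

    // Releases the most recently acquired lock, if any.
    void release_last() {
        if (!held_.empty()) {
            held_.pop_back();
        }
    }
};
```

With this rule, taking stack_watermark_lock (rank 2) and then tty_lock 
(rank 3) fails, while the opposite order succeeds. In the crash above 
the thread only reaches for tty_lock because it is already printing an 
error, which is why the rank failure is a symptom and not the cause.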

Thanks,
Patricio
> ---------------  S U M M A R Y ------------
>
> Command Line: -Xmx24g -Xms24g -XX:ParallelGCThreads=16 -XX:+UseShenandoahGC -XX:-TieredCompilation -Xlog:gc*=debug,handshake=trace:file=510s.log:tid:filesize=200m SPECjvm2008.jar -ict -coe -i 5 derby
>
> XXXX, Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz, 96 cores, 503G, XXXX
> Time: Sat May 15 15:32:28 2021 CST elapsed time: 101.087584 seconds (0d 0h 1m 41s)
>
> ---------------  T H R E A D  ---------------
>
> Current thread (0x00007f5ca802ecc0):  JavaThread "BenchmarkThread derby 51" [_thread_in_vm, id=123348, stack(0x00007f5b8b4f5000,0x00007f5b8b5f6000)]
>
> Stack: [0x00007f5b8b4f5000,0x00007f5b8b5f6000],  sp=0x00007f5b8b5f1f20,  free space=1011k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x1485e50]  Mutex::check_rank(Thread*)+0x120
> V  [libjvm.so+0x14867f1]  Mutex::lock_without_safepoint_check(Thread*)+0x51
> V  [libjvm.so+0x1541090]  defaultStream::hold(long)+0xa0
> V  [libjvm.so+0x154124a]  defaultStream::write(char const*, unsigned long)+0x2a
> V  [libjvm.so+0x153dc30]  outputStream::do_vsnprintf_and_write_with_automatic_buffer(char const*, __va_list_tag*, bool)+0xf0
> V  [libjvm.so+0x153e97f]  outputStream::print_cr(char const*, ...)+0x1bf
> V  [libjvm.so+0x1981063]  JavaThread::check_possible_safepoint()+0x63
> V  [libjvm.so+0xd2f838]  GenerateOopMap::do_interpretation(Thread*)+0x238
> V  [libjvm.so+0xd2fe0e]  GenerateOopMap::compute_map(Thread*)+0x3ae
> V  [libjvm.so+0x150d243]  OopMapForCacheEntry::compute_map(Thread*)+0x163
> V  [libjvm.so+0x150eef5]  OopMapCacheEntry::fill(methodHandle const&, int)+0xf5
> V  [libjvm.so+0x150fa40]  OopMapCache::compute_one_oop_map(methodHandle const&, int, InterpreterOopMap*)+0x60
> V  [libjvm.so+0x141d876]  Method::mask_for(int, InterpreterOopMap*)+0x96
> V  [libjvm.so+0xbd6381]  frame::oops_interpreted_do(OopClosure*, RegisterMap const*, bool) const+0x4c1
> V  [libjvm.so+0x185073e]  StackWatermarkFramesIterator::process_one(void*)+0x20e
> V  [libjvm.so+0x18515f8]  StackWatermark::process_one()+0x88
> V  [libjvm.so+0x18526e9]  StackWatermarkSet::on_iteration(JavaThread*, frame const&)+0x89
> V  [libjvm.so+0xbd98ca]  frame::sender(RegisterMap*) const+0x7a
> V  [libjvm.so+0x167f988]  check_compiled_frame(JavaThread*)+0x88
> V  [libjvm.so+0x168397a]  OptoRuntime::new_instance_C(Klass*, JavaThread*)+0xfa
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/4028



More information about the hotspot-runtime-dev mailing list