RFR 8150689: Thread dump report "waiting to re-lock in wait()" shows incorrectly

Daniel D. Daugherty daniel.daugherty at oracle.com
Thu Nov 22 00:18:27 UTC 2018


On 11/21/18 5:45 PM, Patricio Chilano wrote:
> Hi Dan,
>
> On 11/21/18 2:52 PM, Daniel D. Daugherty wrote:
>> On 11/20/18 12:51 PM, Patricio Chilano wrote:
>>> Hi all,
>>>
>>> Could you review this small fix for thread dump reports?
>>> Now the "waiting to re-lock in wait()" message is shown in the "at 
>>> java.lang.Object.wait(Native Method)" frame where the re-locking is 
>>> actually occurring. In the frame where the lock was first taken we 
>>> always show "locked". Please check if there is some scenario where 
>>> you think the lock info on the stack is not being printed as you 
>>> would expect.
>>>
>>> I also noticed the _thread_state attribute was printed twice so I 
>>> removed one instance. I don't know if there was a specific reason to 
>>> print it twice so I can add it back.
>>>
>>> Webrev URL: http://cr.openjdk.java.net/~pchilanomate/8150689.01/webrev
>>
>> src/hotspot/share/runtime/safepoint.cpp
>>     new L1122:   _thread->print_thread_state_on(st);
>>         L1123:   st->print_cr("Thread: " INTPTR_FORMAT
>>         L1124:               "  [0x%2x] State: %s _has_called_back %d 
>> _at_poll_safepoint %d",
>>         L1125:                p2i(_thread), 
>> _thread->osthread()->thread_id(), s, _has_called_back,
>>         L1126:                _at_poll_safepoint);
>>     old L1127:   _thread->print_thread_state_on(st);
>>
>>         The reason for moving the line isn't jumping out at me (yet).
>>         Update: see next file's comments.
>>
>> src/hotspot/share/runtime/thread.cpp
>>         L2945: #ifndef PRODUCT
>>     old L2946:   print_thread_state_on(st);
>>         L2947:   _safepoint_state->print_on(st);
>>         L2948: #endif // PRODUCT
>>
>>         Okay so now I understand the change to safepoint.cpp. I agree
>>         with deleting L2946 from thread.cpp because it would result in
>>         two print_thread_state_on() calls in non-PRODUCT bits.
>>
>>         However, I don't agree with moving the print_thread_state_on()
>>         call in safepoint.cpp. If you do that then you'll change the
>>         output order for the "Thread: ..." line to be after the
>>         print_thread_state_on() output for all build configs.
> We could keep it in the same order in ThreadSafepointState::print_on() 
> and if JavaThreadState and ThreadStatus need to be printed together 
> maybe we could add a flag to decide whether to print the 
> JavaThreadState in ThreadSafepointState::print_on(). (?)
>
>> So I just took a look at the output for baseline release bits
>>         on my MBP and I'm surprised to see this:
>>
>> "main" #1 prio=5 os_prio=31 cpu=121.30ms elapsed=1.41s 
>> tid=0x00007f8abb808800 nid=0x2003 waiting on condition 
>> [0x0000000100aff000]
>>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>>    JavaThread state: _thread_blocked
>> Thread: 0x00007f8abb808800  [0x2003] State: _at_safepoint 
>> _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>     at java.lang.Thread.sleep(java.base at 12-internal/Native Method)
>>     at Sleeper.main(Sleeper.java:5)
>>
>>         So even in baseline 'release' bits, I'm seeing two of these:
>>
>>            JavaThread state: _thread_blocked
>>
>>         I don't know what to think here. Can you explain why release
>>         bits are also showing double print_thread_state_on() output?
>>         Maybe I'm not understanding about the thread.cpp code.
> Odd, I did the same test in my MBP for a "release" build but I don't 
> see the extra output. I also tested it for Linux and it doesn't show 
> the extra output either. Can you confirm this?

I just reran my Sleeper.java and did a CTRL-\ thread dump on
both Linux and MacOS with release bits. This time I did not
get the extra output with either...

I just scrolled back thru my window to see what I did... I
set JAVA_HOME to the release bits and then I pasted the full
path to the fastdebug bits on my MBP... So instead of using
$JAVA_HOME/bin/java Sleeper... I ran the wrong version...

Sorry for the confusion.

Dan


>
>> So I went back to JDK8 release bits on my Solaris box:
>>
>> "main" #1 prio=5 os_prio=64 tid=0x000000000041e800 nid=0x2 waiting on 
>> condition [0xffff80ffbf1ae000]
>>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>>         at java.lang.Thread.sleep(Native Method)
>>         at Sleeper.main(Sleeper.java:5)
>>
>>         So there's no "JavaThread state: _thread_blocked" line at all.
>>         I'm guessing we added that to 'release' bits output after JDK8.
>>
>>         And here's the JDK8 fastdebug bits on my Solaris box:
>>
>> "main" #1 prio=5 os_prio=64 tid=0x0000000000421800 nid=0x2 waiting on 
>> condition [0xffff80ffbf1ae000]
>>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>>    JavaThread state: _thread_blocked
>> Thread: 0x0000000000421800  [0x 2] State: _at_safepoint 
>> _has_called_back 0 _at_poll_safepoint 0
>>    JavaThread state: _thread_blocked
>>         at java.lang.Thread.sleep(Native Method)
>>         at Sleeper.main(Sleeper.java:5)
>>
>>         Short version: I agree that there are two "JavaThread state: 
>> _thread_blocked"
>>         and we should fix that. I have no preference on whether the 
>> line is before or
>>         after the "Thread: ..." line. Others may have an opinion.
>>
>> src/hotspot/share/runtime/vframe.cpp
>>     I agree with the comment rewrite on L174-6.
> I actually kept thinking about the comment I added in lines L174-L176, 
> but now I think it's not completely true. Because although that window 
> of times exists there is no safepoint check inside it so the VMthread 
> would never be able to stop the JavaThread during that time. Maybe I 
> should remove that comment?
>
>>     I like this new code:
>>
>>     L182: 
>> if(java_lang_Thread::get_thread_status(thread()->threadObj()) == 
>> java_lang_Thread::BLOCKED_ON_MONITOR_ENTER){
>>     L183:             wait_state = "waiting to re-lock in wait()";
>>     L184:           }
>>         Needs space between 'if' and '(' and between ')' and '{'.
> Done!
>>
>>     I agree with deleting old L232->250. That's the code that was
>>     causing the confusing "waiting to re-lock in wait()" output
>>     in the frame(s) after frame 0.
>>
>> Thumbs up!
>>
>> As David says, having some example current and new output in the bug 
>> report
>> would help.
> Done!
>
>
> Thanks!
> Patricio
>
>> Dan
>>
>>
>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8150689
>>>
>>> Thanks,
>>> Patricio
>>
>
>



More information about the hotspot-runtime-dev mailing list