RFR 8150689: Thread dump report "waiting to re-lock in wait()" shows incorrectly

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Nov 21 20:01:44 UTC 2018


On 11/20/18 5:34 PM, David Holmes wrote:
> Hi Patricio,
>
> On 21/11/2018 3:51 am, Patricio Chilano wrote:
>> Hi all,
>>
>> Could you review this small fix for thread dump reports?
>> Now the "waiting to re-lock in wait()" message is shown in the "at 
>> java.lang.Object.wait(Native Method)" frame where the re-locking is 
>> actually occurring. In the frame where the lock was first taken we 
>> always show "locked". Please check if there is some scenario where 
>> you think the lock info on the stack is not being printed as you 
>> would expect.
>
> Can you please update the bug report with examples of how the stack 
> report now looks compared to those used to discuss the bug.
>
> I applied the patch and ran my examples and they looked good!
>
>> I also noticed the _thread_state attribute was printed twice so I 
>> removed one instance. I don't know if there was a specific reason to 
>> print it twice so I can add it back.
>
> ThreadSafepointState::print_on is called from both the stack dump code 
> (where you see _thread_state twice) and safepoint logging. It's useful 
> to also see the thread state during safepoint logging so please undo 
> this change.

I should have read thru David's review before posting mine...

Can we think of a way to get this line just once in both contexts?


> src/hotspot/share/runtime/vframe.cpp
>
> 182 if(java_lang_Thread::get_thread_status(thread()->threadObj()) == 
> java_lang_Thread::BLOCKED_ON_MONITOR_ENTER){
>
> Nit: need space after "if", need space after ")"
> Nit please line-break after == and indent appropriately
>
> This seems such an obvious check I have to really question why we did 
> not do this before,

Agreed. I have no idea why we didn't think of this back with JDK-8130448.
That was my bug so it my screw up. Sorry about that.

> and had the comment that the two states could not be distinguished. :)

I researched where that comment came from and here's what I found:

> src/share/vm/runtime/vframe.cpp
>
> 170       // It would be nice to distinguish between "waiting on" and
> 171       // "waited on". Currently, "waiting on" here with a
> 172       // java.lang.Thread.State == "WAITING (on object monitor)"
> 173       // earlier in the output means that the monitor has not yet 
> been
> 174       // notified and java.lang.Thread.State == "BLOCKED (on object
> 175       // monitor)" earlier in the output means that the monitor has
> 176       // been notified and the thread is blocked on reentry.
>
> That's a very long sentence that can be quite difficult to parse and 
> could probably be reworded to describe how the output should be 
> interpreted eg:
>
> // If earlier in the output we reported java.lang.Thread.State ==
> // "WAITING (on object monitor)" and now we report "waited on", then
> // we are still waiting for notification [or timeout?]. Otherwise if
> // we earlier reported java.lang.Thread.State == "BLOCKED (on object
> // monitor)", then we are actually waiting to reacquire the monitor
> // lock. At this level we can't distinguish the two cases to report
> // "waited on" rather than "waiting on" for the second case.
>
> I wonder if we can prod deeper into VM state to try and make the 
> distinction? 

So I originally had L170-L176 and you suggested the rewrite that
is below it. Looks like I pretty much went with your rewrite, but
you did also ask this question:

 > I wonder if we can prod deeper into VM state to try and make the 
distinction?

and it looks like I never chased down the answer to that query.
Fortunately, Patricio has figured it out.

Dan


>
> I think what you have done simplifies things nicely, and as I said my 
> two main examples now work fine.
>
>> Webrev URL: http://cr.openjdk.java.net/~pchilanomate/8150689.01/webrev
>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8150689
>
> Please always advise what testing you did. In this case I think we 
> have a problem with a test, but it seems to be a pre-existing problem:
>
> serviceability/tmtools/jstack/WaitNotifyThreadTest.java
>
> Seems to expect to find a "waiting to re-lock" line but in fact I 
> can't see anything in the test that actually creates a situation where 
> the thread would be waiting to re-lock! Further the test is supposed 
> to print out:
>
> System.out.println("Correct monitor info found");
>
> yet there is no such output. This indicates that the test is not 
> working as expected and is not actually finding any locks. I think the 
> logic for checking the method "name" is broken e.g.
>
>      if (mi.getName().startsWith("WaitThread.run")) {
>
> the actual name is WaitNotifyThreadTest$WaitThread.run so it won't match!
>
> Also:
>
> if (mi.getName().startsWith(OBJECT_WAIT) && mi.getCompilationUnit() == 
> null /*native method*/) {
>
> won't match anything either as far as I can see!
>
> <sigh> I filed a bug for that test: JDK-8214148
>
> Thanks,
> David
>
>> Thanks,
>> Patricio



More information about the hotspot-runtime-dev mailing list