RFR: 8210832: Remove sneaky locking in class Monitor

Tue Jan 29 16:00:16 UTC 2019

Hi Patricio,

I'm stripping this down to just the possible issue in mutex.cpp...

On 1/28/19 6:13 PM, Patricio Chilano wrote:
> On 1/28/19 4:29 PM, Daniel D. Daugherty wrote:
>> On 1/28/19 2:18 PM, Patricio Chilano wrote:
>>> Full: http://cr.openjdk.java.net/~pchilanomate/8210832/v02/webrev
>> src/hotspot/share/runtime/mutex.cpp
>> L205:         jt->java_suspend_self();
>>     L206:       }
>>     L207:     }
>>     L208:
>>     L209:     if (in_flight_monitor != NULL) {
>>     L210:       // Conceptually reestablish ownership of the lock.
>>     L211:       assert(_owner == NULL, "should be NULL: owner=" 
>> INTPTR_FORMAT, p2i(_owner));
>>     L212:       set_owner(self);
>>     L213:     } else {
>>     L214:       lock();
>>     L215:     }
>>         The lock reacquire on L214 used to be done after the
>>         java_suspend_self() on L205 which is inside the block
>>         context for the ThreadLockBlockInVM and OSThreadWaitState
>>         helps. If the lock() blocks due to a racing thread, then
>>         the calling JavaThread won't have the right thread state
>>         of OS thread wait state, etc...
>> So after java_suspend_self(), you have to re-lock() so that
>>         a potential block on that re-lock has the right states. This
>>         also means that this line has to be deleted:
>>
>>         L204         in_flight_monitor = NULL;
>>
>>         so that ThreadLockBlockInVM destructor can do the right thing
>>         if the thread is suspended and then relocks. 
> Yes, I see your point. The problem is that after executing the TLBIVM 
> destructor (which executes after the OSThreadWaitState destructor with 
> the current order)  there is always the possibility that we had to 
> release the lock, and so afterwards we will have to re-acquire it with 
> a different state. One simple way of solving this could be moving the 
> OSThreadWaitState object outside the TLBIVM block. Based on David's 
> comment about OSThreadWaitState I don't think changing the order 
> should break things, since it seems more like a debugging tool. What 
> do you think then of doing something like this: (I also included a 
> re-lock after java_suspend_self() and removed in_flight_monitor=NULL 
> as you suggested.)
>
> diff --git a/src/hotspot/share/runtime/mutex.cpp 
> b/src/hotspot/share/runtime/mutex.cpp
> --- a/src/hotspot/share/runtime/mutex.cpp
> +++ b/src/hotspot/share/runtime/mutex.cpp
> @@ -182,9 +186,9 @@
>      JavaThread *jt = (JavaThread *)self;
>      Monitor* in_flight_monitor = NULL;
>
> +    OSThreadWaitState osts(self->osthread(), false /* not 
> Object.wait() */);
>      {
>        ThreadLockBlockInVM tlbivm(jt, &in_flight_monitor);
> -      OSThreadWaitState osts(self->osthread(), false /* not 
> Object.wait() */);
>        if (as_suspend_equivalent) {
>          jt->set_suspend_equivalent();
>          // cleared by handle_special_suspend_equivalent_condition() or
> @@ -201,8 +205,8 @@
>          // want to hold the lock while suspended because that
>          // would surprise the thread that suspended us.
>          _lock.unlock();
> -        in_flight_monitor = NULL;
>          jt->java_suspend_self();
> +        _lock.lock();
>        }
>      }
>
>> Also after lock() on L214, you never call set_owner(self)
>>         so the ownership is not complete for that relocated code.
>>         You won't need the set_owner(self) call if you move the
>>         lock() on L214 back to after java_suspend_self().
> That lock() is actually Monitor::lock(). But I agree is confusing, 
> every time I look at it I say the same. Maybe I can rewrite it as 
> Monitor::lock() ?

The fact that the lock() on L214 is really Monitor::lock()
changes everything with respect to my comment. I'm sorry
I missed that crucial detail. Please ignore the above from
me and undo the changes that I suggested.

Here's a new set of comments:

     L195:       in_flight_monitor = this;
         Please consider adding a comment:

                 in_flight_monitor = this;  // save for ~ThreadLockBlockInVM

     L204:         in_flight_monitor = NULL;
         Please consider adding a comment:

                   in_flight_monitor = this;  // ~ThreadLockBlockInVM 
does not need to unlock

     L209:     if (in_flight_monitor != NULL) {
         Please consider adding a comment below L209:

                 // Not unlocked by ~ThreadLockBlockInVM or self-suspend 
above.

     L214:       lock();
         I missed the fact that this is Monitor::lock(). However, the
         problem with Monitor::lock() here is that if this call blocks,
         then we don't have a ThreadLockBlockInVM to change our state
         so we won't be recognized as blocking.

         Please consider changing to:

                 Monitor::lock(self);

         that will give us full locking semantics.

Okay, I think this new set of comments is on the right track, but
obviously I need other reviewers to point out my silliness... :-)

I am a bit worried about whether our use of a ThreadLockBlockInVM
in wait() and another in lock(self) on this code path could cause
any confusion by observing threads. I don't think so, but I haven't
traced it all out from beginning to end yet.

Dan