Multiple Java threads seem to have locked the same object monitor from the thread dumps

David Holmes david.holmes at oracle.com
Thu Jul 12 14:34:46 PDT 2012


On 12/07/2012 10:48 PM, Krystal Mok wrote:
 > Wonder what would go wrong if we set the current pending monitor
 > before the safepoint callback in ObjectMonitor::enter()?

:) I wonder too. It might be that nothing goes wrong. Then again it 
might not go wrong till the changes ship in product code and a 
particular customer updates to start using it and they start getting 
"blue moon" crashes.

The state transition logic in the VM - and moving things around those 
transitions - can be very fragile, and very subtle. I'm much more 
inclined to investigate why we can't actually check the owner to confirm 
it is indeed "locked".

BTW do you have a reproducer for this?

Cheers,
David
-----

> Hi David,
>
> Comments inline:
>
> On Thu, Jul 12, 2012 at 7:53 PM, David Holmes <david.holmes at oracle.com
> <mailto:david.holmes at oracle.com>> wrote:
>
>     Hi Kris,
>
>     The scenario you outline seems plausible. There are numerous
>     inherent races when it comes to this kind of observation of thread
>     state because the changes of state are not atomically performed (and
>     can not be). The inference logic for "have I locked this monitor" is
>     quite simple and I did fix one bug in this area about 6 years ago.
>
> javaVFrame::locked_monitors() seems to be affected as well. It's using
> the same logic to check whether a monitor is locked by a specific thread
> or not.
>
>       if (monitor->owner() != NULL) {
>          // First, assume we have the monitor locked. If we haven't found an
>          // owned monitor before and this is the first frame, then we
>     need to
>          // see if we have completed the lock or we are blocked trying to
>          // acquire it - we can only be blocked if the monitor is inflated
>
>          const char *lock_state = "locked"; // assume we have the
>     monitor locked
>          if (!found_first_monitor && frame_count == 0) {
>            markOop mark = monitor->owner()->mark();
>            if (mark->has_monitor() &&
>                mark->monitor() == thread()->current_pending___monitor()) {
>              lock_state = "waiting to lock";
>            }
>          }
>
>          found_first_monitor = true;
>          print_locked_object_class___name(st, monitor->owner(), lock_state);
>        }
>     }
>
>     Looking at this now the obvious (and hence undoubtedly wrong :) )
>     solution would seem to be add a check for the owner() against the
>     thread().
>
> Or perhaps something like:
>
> if (monitor->owner() != NULL) {
>
>    // First, assume we have the monitor locked. If we haven't found an
>    // owned monitor before and this is the first frame, then we need to
>    // see if we have completed the lock or we are blocked trying to
>    // acquire it - we can only be blocked if the monitor is inflated
>
>    const char *lock_state = "locked"; // assume we have the monitor locked
>    if (!found_first_monitor && frame_count == 0) {
>      markOop mark = monitor->owner()->mark();
>      if (mark->has_monitor()) {
>        void* owner = mark->monitor()->owner();
>        if (owner != thread() && thread()->is_lock_owned((address) owner)) {
>          lock_state = "waiting to lock";
>        }
>      }
>    }
>
>    found_first_monitor = true;
>    print_locked_object_class_name(st, monitor->owner(), lock_state);
> }
>
> And optionally factoring the predicate to a new function:
>
> bool ObjectMonitor::is_owned_by(Thread* thread) {
>    return _owner == thread || thread->is_lock_owned((address) _owner);
> }
>
> The difference of this new function with ObjectMonitor::check(TRAPS) is
> that this one doesn't throw IMSX.
>
> Wonder what would go wrong if we set the current pending monitor before
> the safepoint callback in ObjectMonitor::enter()?
>
> - Kris
>
>     David
>     -----
>
>
>     On 12/07/2012 8:55 PM, Krystal Mok wrote:
>
>         Hi,
>
>         Someone sent me a couple of thread dumps, and what's strange in
>         them is
>         that there are multiple threads claiming to have locked the same
>         object
>         monitor.
>         There's at most only one of them actually in the
>         (java.lang.Thread.State-level) RUNNABLE state, which is correct. The
>         ones that claim to have locked but are in BLOCKED state apparent
>         haven't
>         really acquired the lock yet.
>         Excerpts of the dumps are available here: [1]
>
>         There are two other discussion threads on this topic [2][3].
>
>         Reading through the related code [4], the only scenario that could
>         result in such thread dump seems to be:
>         1. The VM is going into a safepoint caused by the VM_PrintThreads VM
>         operation;
>         2. A thread is (re-)entering a synchronized block with
>         ObjectMonitor::enter().
>             It has already set the correct thread state at Java-level and
>         OSThread-level, and then it gets blocked in the safepoint callback
>         before setting the current pending monitor;
>         3. VM_PrintThreads eventually calls
>         javaVFrame::print_lock_info___on(),
>         where it checks if it's the first frame and the monitor matches the
>         thread's current pending one.
>             Because the current pending monitor wasn't set yet, this
>         check fails,
>         and the lock state message defaults to "locked".
>
>         I wonder if such behavior implies we should set the pending monitor
>         before the safepoint callback.
>
>         And, could there be other scenarios that would produce such
>         thread dumps?
>
>         Regards,
>         Kris
>
>         [1]: https://gist.github.com/__3097193#file_thread_dumps.md
>         <https://gist.github.com/3097193#file_thread_dumps.md>
>         [2]:
>         http://stackoverflow.com/__questions/9536975/multiple-__java-threads-seemingly-__locking-same-monitor
>         <http://stackoverflow.com/questions/9536975/multiple-java-threads-seemingly-locking-same-monitor>
>         [3]:
>         https://forums.oracle.com/__forums/thread.jspa?messageID=__9952796
>         <https://forums.oracle.com/forums/thread.jspa?messageID=9952796>
>         [4]: https://gist.github.com/__3097193#file_notes.md
>         <https://gist.github.com/3097193#file_notes.md>
>
>


More information about the hotspot-runtime-dev mailing list