Safepointing & Locking issue

David Holmes David.Holmes at oracle.com
Thu Jul 8 00:51:00 PDT 2010


Hi Nick,

By default monitor deflation only occurs as a side-effect of a 
safepoint, so if a lot of safepoints are occurring there is an underlying 
reason for that. Running the latest VM with -XX:+PrintSafepointStatistics 
may shed some light on what is triggering them.
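
For example, something along these lines (the count flag is optional and 
just controls how often the statistics are flushed; the rest of the 
command line is only a placeholder):

   java -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 \
        <your usual flags> <your main class>

Each safepoint is then reported together with the VM operation that 
triggered it, so you can see whether these are periodic cleanup 
safepoints, bias revocations, deopts, and so on.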

As to the inflate-deflate-inflate cycle, you may indeed be seeing a 
pathological case, or it may be that an overly eager cleanup of the 
inflated monitor leads to the re-inflation.

Cheers,
David Holmes

Nicolas Michael said the following on 07/08/10 17:21:
> Hi all,
> 
> Yesterday I was investigating a performance problem with one of our
> applications, where I observed some "strange" JVM behavior (which might be as
> expected, but I'm not sure...).
> 
> It's a heavily multi-threaded Java application with about 300 threads, running
> on a 4-CPU Solaris x86 box using a 32-bit HotSpot Client VM build 16.3-b01 (JDK
> 1.6.0_20-b02). It seems to make extensive use of synchronization and suffers
> badly from lock contention. Thread #10 (the VM Thread) eats close to one CPU,
> mostly doing lock deflations during safepoints, in stacks like this:
> 
>               libjvm.so`__1cNObjectMonitorHis_busy6kM_i_+0x26
>               libjvm.so`__1cSObjectSynchronizerVdeflate_idle_monitors6F_v_+0x210
>               libjvm.so`__1cUSafepointSynchronizeQdo_cleanup_tasks6F_v_+0x19
>               libjvm.so`__1cUSafepointSynchronizeFbegin6F_v_+0x3d0
>               libjvm.so`__1cIVMThreadEloop6M_v_+0x26c
>               libjvm.so`__1cIVMThreadDrun6M_v_+0x88
>               libjvm.so`java_start+0xf9
>               libc.so.1`_thr_setup+0x4e
>               libc.so.1`_lwp_start
>               287
> 
> From a safepointing point of view, the process isn't doing much useful work --
> it only gets short bursts of work, interrupted by safepoints that take longer
> than the actual application runtime -- at a rate of 1709 safepoints per second
> (see below):
> 
> Application time: 0.0000450 seconds
> Total time for which application threads were stopped: 0.0002397 seconds
> Application time: 0.0001607 seconds
> Total time for which application threads were stopped: 0.0002603 seconds
> Application time: 0.0000469 seconds
> Total time for which application threads were stopped: 0.0011530 seconds
> ..
> 
> From the perfdata statistics, I see
> sun.rt._sync_ContendedLockAttempts    increasing by   12 / sec
> sun.rt._sync_Deflations               increasing by  477 / sec
> sun.rt._sync_Inflations               increasing by  477 / sec
> sun.rt._sync_Notifications            increasing by  791 / sec
> sun.rt._sync_Parks                    increasing by  795 / sec
> sun.rt.safepoints                     increasing by 1709 / sec
> sun.rt.safepointTime is 0.82s and sun.rt.safepointSyncTime 0.15s per second.
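> 
> If I read these numbers right, 0.82s of safepoint time per second of wall
> clock means we are sitting inside safepoints for something like 80% of the
> time, which would be consistent with thread #10 eating close to one CPU on
> the safepoint cleanup work.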
> 
> I guess we're using too much synchronization, Object.wait(ms) and notify() (or
> even notifyAll()) ... ;-) but I'm wondering whether it is "normal" to have that
> many safepoints and lock inflations/deflations because of that?
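> 
> As far as I understand, even an uncontended Object.wait() forces the lock to
> be inflated, because the wait set is managed by the heavy-weight
> ObjectMonitor. So a hand-off as trivial as the following sketch (made-up
> names, just to illustrate the kind of pattern I mean) would already show up
> in the inflation trace:
> 
>     // Hypothetical sketch of a wait/notify hand-off. Calling wait() on
>     // 'lock' inflates it to a full ObjectMonitor, since the wait set
>     // lives on the heavy-weight monitor.
>     class HandOff {
>         private final Object lock = new Object();
>         private boolean ready = false;
> 
>         void produce() {
>             synchronized (lock) {
>                 ready = true;
>                 lock.notifyAll();   // wakes every thread waiting on 'lock'
>             }
>         }
> 
>         void awaitReady(long timeoutMs) throws InterruptedException {
>             synchronized (lock) {
>                 while (!ready) {
>                     lock.wait(timeoutMs);   // forces inflation of 'lock'
>                 }
>             }
>         }
>     }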
> 
> My understanding was that safepoints are mainly caused by GC (not the case here)
> or deoptimization (should be visible with +LogCompilation, right? but nothing
> there either). Or bias revocation. Indeed, we do have some bias revocations
> (like this, see below), but this is at a much lower rate (only ~ 3.4 times per
> second):
> 
> <writer thread='10'/>
> Revoking bias with potentially per-thread safepoint:
> Revoking bias of object 0x6df887d0 , mark 0x08ee3805 , type
> org.jacorb.poa.RequestQueue , prototype header 0x00000005 , allow rebias 0 ,
> requesting thread 0x09065800
>   Revoked bias of currently-unlocked object
> <writer thread='149'/>
> Revoking bias by walking my own stack:
> Revoking bias of object 0x6df887c8 , mark 0x09065805 , type java.lang.Object ,
> prototype header 0x00000005 , allow rebias 0 , requesting thread 0x09065800
>   Revoked bias of currently-locked object
> Inflating object 0x6df887c8 , mark 0x0906042a , type java.lang.Object
> 
> Also, disabling biased locking has no effect on the magnitude of safepointing,
> so I concluded that safepointing must then be caused by lock inflation/deflation
> (which seems to be supported by the stack above).
> 
> AFAIK a lock is inflated when the lightweight locking algorithm can't acquire
> the lock immediately ("contention"). It will then fall back to heavy-weight
> locking (an OS mutex?) and notify (wake up) all waiting threads (?). We have
> about 40 lock inflations per contended lock attempt (see the perfdata
> statistics above) -- does that mean there were (on average) 40 threads waiting
> on that contended lock?
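> 
> Just to make sure I've got the mechanism right: if I understand it correctly,
> already something as simple as the following (again a made-up example, not our
> actual code) should trigger inflation every time the two threads collide on
> the monitor:
> 
>     // Hypothetical illustration of contended locking: when a second thread
>     // hits a stack-locked (lightweight) monitor, the lock is inflated to a
>     // heavy-weight ObjectMonitor so the contender can block on it.
>     class Contention {
>         private static final Object lock = new Object();
>         private static long counter;
> 
>         public static void main(String[] args) throws InterruptedException {
>             Runnable r = new Runnable() {
>                 public void run() {
>                     for (int i = 0; i < 1000000; i++) {
>                         synchronized (lock) {   // collisions here inflate 'lock'
>                             counter++;
>                         }
>                     }
>                 }
>             };
>             Thread t1 = new Thread(r);
>             Thread t2 = new Thread(r);
>             t1.start(); t2.start();
>             t1.join(); t2.join();
>             System.out.println(counter);
>         }
>     }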
> 
> Since we have the same number of deflations as inflations, I would assume that
> each heavy-weight lock is reverted to a lightweight lock again shortly after
> being "contended" (during a safepoint in deflate_idle_monitors(), as in the
> stack above??). But why? Couldn't it stay a heavy-weight lock, instead of being
> deflated only to be inflated again shortly after?
> 
> In the example below, lock 0x96568a60 is being inflated, deflated, inflated
> again and then deflated again:
> 
> <writer thread='143'/>
> Inflating object 0x96568a60 , mark 0x08677ffe , type org.jacorb.poa.RequestProcessor
> <writer thread='10'/>
> Deflating object 0x96568a60 , mark 0x08677ffe , type org.jacorb.poa.RequestProcessor
> <writer thread='143'/>
> Inflating object 0x96568a60 , mark 0x0918f4fe , type org.jacorb.poa.RequestProcessor
> <writer thread='240'/>
> Inflating object 0x79bceee8 , mark 0x08c0b662 , type java.lang.Object
> <writer thread='243'/>
> Inflating object 0x79bceeb0 , mark 0x0914e9ee , type org.jacorb.orb.ReplyReceiver
> <writer thread='10'/>
> Deflating object 0x79bceee8 , mark 0x08c0b662 , type java.lang.Object
> Deflating object 0x79bceeb0 , mark 0x0914e9ee , type org.jacorb.orb.ReplyReceiver
> Deflating object 0x96568a60 , mark 0x0918f4fe , type org.jacorb.poa.RequestProcessor
> 
> I was just curious whether this is the intended behavior in case of lock
> contention, or whether we're seeing something pathologically strange here.
> 
> Btw, I've also run the workload with HotSpot 19.0-b03 from the latest JDK 7
> build, with the same results.
> 
> Any comments appreciated!
> 
> Thanks,
> Nick.
> 

