Safepointing & Locking issue
Nicolas Michael
mailing at nmichael.de
Thu Jul 8 06:12:51 PDT 2010
Hi David and Dave,
thanks a lot for your hints and background details on this issue.
Adding -XX:+PrintSafepointStatistics to the options, as suggested by David, shows
about 1200 DeoptimizeFrame entries per second in the LogFile:
         vmop                 [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
..
168.976: DeoptimizeFrame      [ 217 0 0 ] [ 0 0 0 0 0 ] 0
168.976: DeoptimizeFrame      [ 217 0 0 ] [ 0 0 0 0 0 ] 0
168.978: DeoptimizeFrame      [ 217 0 0 ] [ 0 0 0 0 0 ] 0
168.978: DeoptimizeFrame      [ 217 0 0 ] [ 0 0 0 0 0 ] 0
168.978: DeoptimizeFrame      [ 217 0 0 ] [ 0 0 0 0 0 ] 0
..
The output from the various switches (such as -XX:+LogCompilation) is a bit
interleaved, but around this point in time (second 168ff.) there is no other
compilation-related output nearby. So is there any way to find out why
a frame is being deoptimized? This would then lead to a safepoint, right?
Oh, and yes, this JVM does have a JVMTI agent attached to it. But as far as I
have been able to find out so far, it should only register for GC start/end events
(which are very rare in this process).
With DTrace I just saw -- thanks to the really cool hotspot probes (some of them
seem to be new; it's the first time I've seen that level of detail!) -- that
about 1000 exceptions occur per second ("ExceptionOccurred-entry"), but less
than one "ThrowNew-entry" per second. I'm not sure how ThrowNew relates to
ExceptionOccurred, what exactly they are counting, and whether 1000+
ExceptionOccurred calls per second is "a lot" (it seems quite high ;-) ). Could
exceptions also cause deoptimization?
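To make that last question more concrete, here is a made-up sketch (not code from
our application) of the kind of exception-as-control-flow pattern I have in mind.
My understanding is that if the exception path was never taken while a method was
being profiled, the compiled code may treat it as an uncommon trap and deoptimize
the first time the path is actually hit -- please correct me if that's wrong:

// Hypothetical example only -- not from our application.
public class ExceptionHeavyPattern {

    // Uses an exception for "expected" control flow: malformed input is common.
    static int parseOrDefault(String s, int dflt) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {   // the exceptional path
            return dflt;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Warm-up: only well-formed input, so the catch block looks untaken to the JIT.
        for (int i = 0; i < 200000; i++) {
            parseOrDefault(Integer.toString(i), -1);
        }
        // Steady state: malformed input on every call -> frequent exceptions.
        while (true) {
            parseOrDefault("not-a-number", -1);
            Thread.sleep(1);                  // roughly 1000 exceptions per second
        }
    }
}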
Unfortunately we don't have a stand-alone, easy-to-reproduce test case for this.
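The closest I can offer right now is a rough sketch of the kind of wait/notify
usage I suspect is driving the inflations: many threads blocking on a shared
queue monitor with wait(ms), and a producer calling notifyAll() for each small
task. This is hypothetical code written for illustration, not extracted from
the application:

// Hypothetical illustration only -- not our real code.
import java.util.LinkedList;
import java.util.Queue;

public class WaitNotifyChurn {
    private static final Queue<Runnable> queue = new LinkedList<Runnable>();

    public static void main(String[] args) throws InterruptedException {
        // Many consumer threads block in Object.wait(ms) on the shared queue monitor.
        for (int i = 0; i < 300; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    while (true) {
                        Runnable task;
                        synchronized (queue) {
                            while (queue.isEmpty()) {
                                try {
                                    queue.wait(100);      // wait(ms) forces monitor inflation
                                } catch (InterruptedException e) {
                                    return;
                                }
                            }
                            task = queue.poll();
                        }
                        task.run();
                    }
                }
            });
            t.setDaemon(true);
            t.start();
        }

        // A single producer wakes all waiters for every tiny task.
        while (true) {
            synchronized (queue) {
                queue.add(new Runnable() {
                    public void run() { /* tiny amount of work */ }
                });
                queue.notifyAll();                          // wakes up to 300 threads per task
            }
            Thread.sleep(1);
        }
    }
}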
These are all the _sync_* counters that I get:
sun.rt._sync_ContendedLockAttempts = 1625
sun.rt._sync_Deflations = 66821
sun.rt._sync_EmptyNotifications = 0
sun.rt._sync_FailedSpins = 0
sun.rt._sync_FutileWakeups = 56
sun.rt._sync_Inflations = 66870
sun.rt._sync_MonExtant = 6784
sun.rt._sync_MonInCirculation = 0
sun.rt._sync_MonScavenged = 0
sun.rt._sync_Notifications = 116109
sun.rt._sync_Parks = 117295
sun.rt._sync_PrivateA = 0
sun.rt._sync_PrivateB = 0
sun.rt._sync_SlowEnter = 0
sun.rt._sync_SlowExit = 0
sun.rt._sync_SlowNotify = 0
sun.rt._sync_SlowNotifyAll = 0
sun.rt._sync_SuccessfulSpins = 0
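(For completeness: I have not actually run this, but something along these lines
should allow sampling those counters periodically. It's a rough sketch assuming
the unsupported sun.jvmstat.monitor API from tools.jar; the VmIdentifier format
and exact method signatures are from memory and may need adjusting.)

// Untested sketch -- relies on the unsupported sun.jvmstat.monitor API (tools.jar).
import sun.jvmstat.monitor.Monitor;
import sun.jvmstat.monitor.MonitoredHost;
import sun.jvmstat.monitor.MonitoredVm;
import sun.jvmstat.monitor.VmIdentifier;

public class SyncCounterSampler {
    public static void main(String[] args) throws Exception {
        String pid = args[0];                          // pid of the target JVM
        VmIdentifier vmId = new VmIdentifier(pid);     // identifier format is an assumption
        MonitoredHost host = MonitoredHost.getMonitoredHost(vmId);
        MonitoredVm vm = host.getMonitoredVm(vmId);
        while (true) {
            // Dump all sun.rt._sync_* counters once per second.
            for (Object o : vm.findByPattern("sun\\.rt\\._sync_.*")) {
                Monitor m = (Monitor) o;
                System.out.println(m.getName() + " = " + m.getValue());
            }
            System.out.println("----");
            Thread.sleep(1000);
        }
    }
}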
Thanks for pointing me to synchronizer.cpp; your comments there sound well worth
reading!
Thanks,
Nick.
> -----Original Message-----
> From: ext Dave Dice [mailto:dave.dice at oracle.com]
> Sent: Thursday, July 8, 2010 13:57
> To: Nicolas Michael
> Cc: hotspot-runtime-dev at openjdk.java.net
> Subject: Re: Safepointing & Locking issue
>
>
> On 2010-7-8, at 3:21 AM, Nicolas Michael wrote:
>
> > Hi all,
> >
> > Yesterday I was investigating a performance problem with one of our
> > applications, where I observed some "strange" JVM behavior (which
> > might be as expected, but I'm not sure...).
> >
> > It's a heavily multi-threaded Java application with about 300 threads,
> > running on a 4-CPU Solaris x86 box using a 32-bit HotSpot Client VM
> > build 16.3-b01 (JDK 1.6.0_20-b02). It seems to make extensive use of
> > synchronization and suffers badly from lock contention. Thread #10
> > (the VM Thread) eats close to one CPU, mostly doing lock deflations
> > during safepoints in stacks like this:
> >
> > libjvm.so`__1cNObjectMonitorHis_busy6kM_i_+0x26
> > libjvm.so`__1cSObjectSynchronizerVdeflate_idle_monitors6F_v_+0x210
> > libjvm.so`__1cUSafepointSynchronizeQdo_cleanup_tasks6F_v_+0x19
> > libjvm.so`__1cUSafepointSynchronizeFbegin6F_v_+0x3d0
> > libjvm.so`__1cIVMThreadEloop6M_v_+0x26c
> > libjvm.so`__1cIVMThreadDrun6M_v_+0x88
> > libjvm.so`java_start+0xf9
> > libc.so.1`_thr_setup+0x4e
> > libc.so.1`_lwp_start
> > 287
> >
> > From a safepointing point of view, the process isn't doing much useful
> > work -- it only gets short periods of work interrupted by safepoints
> > that take longer than the actual application runtime -- at a rate of
> > 1709 safepoints per second (see below):
> >
> > Application time: 0.0000450 seconds
> > Total time for which application threads were stopped: 0.0002397 seconds
> > Application time: 0.0001607 seconds
> > Total time for which application threads were stopped: 0.0002603 seconds
> > Application time: 0.0000469 seconds
> > Total time for which application threads were stopped: 0.0011530 seconds
> > ..
> >
> > From the perfdata statistics, I see
> > sun.rt._sync_ContendedLockAttempts increasing by 12 / sec
> > sun.rt._sync_Deflations increasing by 477 / sec
> > sun.rt._sync_Inflations increasing by 477 / sec
> > sun.rt._sync_Notifications increasing by 791 / sec
> > sun.rt._sync_Parks increasing by 795 / sec
> > sun.rt.safepoints increasing by 1709 / sec
> > sun.rt.safepointTime is 0.82s and sun.rt.safepointSyncTime 0.15s per second.
> >
> > I guess we're using too much synchronization, Object.wait(ms) and
> > notify() (or even notifyAll()) ... ;-) but I'm wondering whether it is
> > "normal" to have that many safepoints and lock inflations/deflations
> > because of that?
> >
> > My understanding was that safepoints are mainly caused by GC (not the
> > case here) or deoptimization (which should be visible with
> > +LogCompilation, right? but there's nothing there either). Or bias
> > revocation. Indeed, we do have some bias revocations (like this, see
> > below), but at a much lower rate (only ~3.4 times per second):
>
> Hi Nick,
>
> Do you have a test case that's easily packaged up?
>
> Inflation can be induced by a number of behaviors, such as
> wait(), contention, deoptimization, bias revocation, JNI
> monitor access, some hashCode actions, etc. By default,
> however, we don't trigger stop-the-world safepoints to
> deflate. Absent safepoints, the ultimate bound on the
> number of objectmonitors in play is just the number of live
> and dead objects in the heap. That is, there's no natural
> back-pressure on the number of monitors. If the number of
> monitors becomes too large the JVM can spend inordinate
> amounts of time in deflate_idle_monitors() scanning, trying
> to scavenge and deflate monitors, so there's both a time
> problem and space problem (arguably the space problem drives
> the time problem, but they manifest differently for customers).
>
> There's a day-one assumption in various pieces of code that
> the object:monitor relationship is stable except over
> potential safepoints. That simplifies many operations but
> is the source of the issue you're seeing. Ultimately, it's
> probably better to deflate on the fly and decouple safepoints
> from deflation. (The old ExactVM implementation did just
> that). It's relatively straight-forward, but touches a good
> deal of code. If you're curious I've left more detailed
> comments in synchronizer.cpp.
>
> In the meantime there are some palliative approaches under
> consideration to (a) induce STW safepoints to bound the
> number of extant monitors and in turn bound the cost of
> deflate_idle_monitors, and (b) partition the monitors so we
> only need scan those that were possibly inflated in the last
> STW epoch. We also have code percolating through the
> system that might cut down on some of the cases of inflation,
> which is why a test case would be of interest -- to see if
> the specific case you're running into might already be addressed.
>
> I vaguely recall adding a perfdata counter that reflects the
> number of objectmonitors in the system. Were there any
> other rt._sync_* variables visible in perfdata?
>
> Regards
> Dave
>
>
>
> >
> > <writer thread='10'/>
> > Revoking bias with potentially per-thread safepoint:
> > Revoking bias of object 0x6df887d0 , mark 0x08ee3805 , type org.jacorb.poa.RequestQueue , prototype header 0x00000005 , allow rebias 0 , requesting thread 0x09065800
> > Revoked bias of currently-unlocked object
> > <writer thread='149'/>
> > Revoking bias by walking my own stack:
> > Revoking bias of object 0x6df887c8 , mark 0x09065805 , type java.lang.Object , prototype header 0x00000005 , allow rebias 0 , requesting thread 0x09065800
> > Revoked bias of currently-locked object
> > Inflating object 0x6df887c8 , mark 0x0906042a , type java.lang.Object
> >
> > Also, disabling biased locking has no effect on the magnitude of
> > safepointing, so I concluded that safepointing must then be caused by
> > lock inflation/deflation (which seems to be supported by the stack
> > above).
> >
> > Afaik a lock is inflated when the light-weight locking algorithm can't
> > acquire the lock immediately ("contention"). It will then fall back to
> > heavy-weight locking (an OS mutex?) and notify (wake up) all waiting
> > threads (?). We have about 40 lock inflations per contended lock
> > attempt (see the perfdata statistics above); does that mean there were
> > (on average) 40 threads waiting on that contended lock?
> >
> > Since we have the same number of deflations as inflations, I would
> > assume that each heavy-weight lock is reverted to a light-weight lock
> > again shortly after being "contended" (during a safepoint in
> > deflate_idle_monitors() as in the stack above??). But why? Couldn't it
> > stay a heavy-weight lock, instead of being deflated only to be inflated
> > again shortly after?
> >
> > In the example below, lock 0x96568a60 is being inflated, deflated,
> > inflated again and then deflated again:
> >
> > <writer thread='143'/>
> > Inflating object 0x96568a60 , mark 0x08677ffe , type org.jacorb.poa.RequestProcessor
> > <writer thread='10'/>
> > Deflating object 0x96568a60 , mark 0x08677ffe , type org.jacorb.poa.RequestProcessor
> > <writer thread='143'/>
> > Inflating object 0x96568a60 , mark 0x0918f4fe , type org.jacorb.poa.RequestProcessor
> > <writer thread='240'/>
> > Inflating object 0x79bceee8 , mark 0x08c0b662 , type java.lang.Object
> > <writer thread='243'/>
> > Inflating object 0x79bceeb0 , mark 0x0914e9ee , type org.jacorb.orb.ReplyReceiver
> > <writer thread='10'/>
> > Deflating object 0x79bceee8 , mark 0x08c0b662 , type java.lang.Object
> > Deflating object 0x79bceeb0 , mark 0x0914e9ee , type org.jacorb.orb.ReplyReceiver
> > Deflating object 0x96568a60 , mark 0x0918f4fe , type org.jacorb.poa.RequestProcessor
> >
> > I was just curious whether this is the intended behavior in case of
> > lock contention, or whether we're seeing something pathological and
> > strange happening here.
> >
> > Btw, I've also run the workload with HotSpot 19.0-b03 from the latest
> > JDK 7 build, with the same results.
> >
> > Any comments appreciated!
> >
> > Thanks,
> > Nick.
> >
>
>