monitors explosion
Karen Kinnear
karen.kinnear at oracle.com
Wed May 25 16:50:36 PDT 2011
Peter,
Yes, I've been running your test case with some existing flags we have
to help with this kind of problem. Thank you for the test case - that
really
helps.
There is a flag -XX:MonitorBound=<int> which can limit the number of
monitors that a thread can hang on to. For the test case I randomly
tried -XX:MonitorBound=80 and it reduced the final deflating idel
monitors
reported time from
0.624620 secs to
0.0007840 secs
The number can be tuned for your specific application. If you set it
lower, it reduces the number of monitors tied-up, and therefore
reduces the amount of time spent deflating idle monitors. The trade-off
is that it increases the frequency of GC in order to free the monitors.
So you have to tune the number to your specific application needs,
so as to minimize overall pause time. A recommended place to start would
be with # threads * # locks per thread at a steady state, where # locks
is actually number of allocated monitors, which reflects contended locks
and locks for which you use wait/notify.
The flag is available in JDK7. I believe it was also backported to
JDK6u19.
hope this helps,
Karen
On May 25, 2011, at 5:01 PM, Y. Srinivas Ramakrishna wrote:
> Hi Peter --
>
> Which version of the JDK elicits this behaviour? The monitor leakage
> sounds vaguely like a problem that i encountered before and which
> was fixed a while back, but this may well be a new problem,
> and i can't dig up the CR for the problem that I recall in
> this code, or where the fix went to. (Karen might recall
> the details because she pushed the fix/workaround for that one.)
>
> -- ramki
>
> On 5/25/2011 7:21 AM, Peter Levart wrote:
>> Hello hotspot developers,
>>
>> I'm posting here because I think this might interest you.
>>
>> Here's a summary:
>>
>> Our application has a boot-up sequence that spawns about 80 threads
>> at the beginning where each of them executes a long running query
>> into the database that preloads caches. We are using Oracle
>> Coherence for that. At some point in this boot-up sequence where
>> all 80 threads are busy inserting data into cache (each thread
>> inserts into it's own cache instance - basically a
>> ConcurrentHashMap with features that allow replication to oher
>> nodes, etc...), the following happens to "deflate idle monitors"
>> time (-XX:+TraceSafepointCleanupTime):
>>
>> [deflating idle monitors, 0.0004058 secs]
>> [deflating idle monitors, 0.0003921 secs]
>> [deflating idle monitors, 0.0005102 secs]
>> [deflating idle monitors, 0.0011647 secs]
>> [deflating idle monitors, 0.0004439 secs]
>> [deflating idle monitors, 0.0003865 secs]
>> [deflating idle monitors, 0.0003829 secs]
>> [deflating idle monitors, 0.0003838 secs]
>> [deflating idle monitors, 0.0003962 secs]
>> [deflating idle monitors, 0.0003758 secs]
>> [deflating idle monitors, 0.0003691 secs]
>> [deflating idle monitors, 0.0030027 secs]
>> [deflating idle monitors, 0.0032613 secs]
>> [deflating idle monitors, 0.0095863 secs]
>> [deflating idle monitors, 0.0116110 secs]
>> [deflating idle monitors, 0.1404215 secs]
>> [deflating idle monitors, 0.1260821 secs]
>> [deflating idle monitors, 0.1520614 secs]
>> [deflating idle monitors, 0.2608265 secs]
>> [deflating idle monitors, 0.2195979 secs]
>> [deflating idle monitors, 0.4246853 secs]
>> [deflating idle monitors, 0.6519136 secs]
>> [deflating idle monitors, 0.5262590 secs]
>> [deflating idle monitors, 0.4404068 secs]
>> [deflating idle monitors, 0.4404227 secs]
>> [deflating idle monitors, 0.4404056 secs]
>> [deflating idle monitors, 0.4403642 secs]
>> [deflating idle monitors, 0.5276827 secs]
>> [deflating idle monitors, 0.4405319 secs]
>> [deflating idle monitors, 0.4406389 secs]
>> [deflating idle monitors, 0.4403989 secs]
>> [deflating idle monitors, 0.4403940 secs]
>> [deflating idle monitors, 0.4404595 secs]
>> [deflating idle monitors, 0.4403539 secs]
>> ...
>> ... and stabilizes at the end of boot-up sequence to:
>> ...
>> [deflating idle monitors, 0.7225519 secs]
>> [deflating idle monitors, 0.7225144 secs]
>> [deflating idle monitors, 0.7225794 secs]
>> [deflating idle monitors, 0.7225868 secs]
>> [deflating idle monitors, 0.7225747 secs]
>> [deflating idle monitors, 0.7225810 secs]
>> [deflating idle monitors, 0.7227106 secs]
>> [deflating idle monitors, 0.7227789 secs]
>> [deflating idle monitors, 0.7226465 secs]
>> [deflating idle monitors, 0.7226106 secs]
>>
>> At the end of boot-up the system is idle and safepoints are not
>> observed. If I then issue a HTTP request on the embeded Jetty
>> container that executes some platform JNI operations (Sockets/NIO/
>> File IO) it triggers 4 safepoints each taking 0.7 sec and request
>> completes in roughly 3 secs.
>>
>> jstat -J-Djstat.showUnsupported=true -snap 27205 | grep sun.rt._sync_
>>
>> then prints the following:
>>
>> sun.rt._sync_ContendedLockAttempts=40303
>> sun.rt._sync_Deflations=19882562
>> sun.rt._sync_EmptyNotifications=0
>> sun.rt._sync_FailedSpins=0
>> sun.rt._sync_FutileWakeups=1912
>> sun.rt._sync_Inflations=19859801
>> sun.rt._sync_MonExtant=2166144
>> sun.rt._sync_MonInCirculation=0
>> sun.rt._sync_MonScavenged=0
>> sun.rt._sync_Notifications=23310881
>> sun.rt._sync_Parks=23333919
>> sun.rt._sync_PrivateA=0
>> sun.rt._sync_PrivateB=0
>> sun.rt._sync_SlowEnter=0
>> sun.rt._sync_SlowExit=0
>> sun.rt._sync_SlowNotify=0
>> sun.rt._sync_SlowNotifyAll=0
>> sun.rt._sync_SuccessfulSpins=0
>>
>>
>> ...the same boot-up sequence observed via the 2s samples of
>> _sync_MonExtant variable:
>>
>>
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=6144
>> sun.rt._sync_MonExtant=6400
>> sun.rt._sync_MonExtant=19456
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> ...
>> ...
>> sun.rt._sync_MonExtant=2166144
>>
>>
>> It seems that somehow monitors accumulate on some thread's private
>> free lists and those threads don't die so omFlush is never called
>> for them (they live happily for ever after in a thread pool).
>>
>> We spawn those 80 inserting threads via FixedThreadPoolExecutor and
>> after all tasks are finished we shutdown the executor, so those 80
>> threads are gone, but...
>>
>> ...is it possible for a thread to call omRelease for a monitor that
>> was not omAlloc-ated by this same thread? In this case object
>> monitors, allocated by one thread would get accumulated on a free
>> list of another thread and if this thread is a long-running thread
>> those monitors would stay there forever...
>>
>> Here's a test program that on my machine (i3 CPU, Linux)
>> accumulates arround 1.6M monitors:
>>
>>
>> package monitors;
>>
>> import java.util.concurrent.BlockingQueue;
>> import java.util.concurrent.SynchronousQueue;
>>
>> /**
>> * @author Peter Levart
>> * @created 5/25/11 @ 3:35 PM
>> */
>> public class MonitorsExplosion
>> {
>> BlockingQueue<Object> queue = new SynchronousQueue<Object>();
>>
>> class Allocator extends Thread
>> {
>> final int allocations;
>>
>> Allocator(int allocations)
>> {
>> this.allocations = allocations;
>> }
>>
>> @Override
>> public void run()
>> {
>> for (int i = 0; i< allocations; i++)
>> {
>> Object o = new Object();
>> synchronized (o)
>> {
>> try
>> {
>> queue.put(o);
>> Thread.yield();
>> }
>> catch (InterruptedException e)
>> {
>> // ignore
>> }
>> }
>> }
>> }
>> }
>>
>> class Collector extends Thread
>> {
>> @Override
>> public void run()
>> {
>> int hashSum = 0;
>>
>> for (; ; )
>> {
>> try
>> {
>> Object o = queue.take();
>> synchronized (o)
>> {
>> hashSum += o.hashCode();
>> }
>> }
>> catch (InterruptedException e)
>> {
>> // ignore
>> }
>> }
>> }
>> }
>>
>> void detonate() throws InterruptedException
>> {
>> Allocator[] allocators = new Allocator[8];
>> for (int i = 0; i< allocators.length; i++)
>> {
>> Allocator allocator = new Allocator(1000000);
>> allocator.start();
>> allocators[i] = allocator;
>> }
>>
>> System.out.println("Allocators started");
>>
>> Collector collector = new Collector();
>> collector.start();
>>
>> System.out.println("Collector started");
>>
>> for (Allocator allocator : allocators)
>> allocator.join();
>>
>> System.out.println("Allocators finished");
>>
>> collector.join(); // never ending...
>>
>> System.out.println("Collector finished");
>> }
>>
>> public static void main(String[] args) throws
>> InterruptedException
>> {
>> new MonitorsExplosion().detonate();
>> }
>> }
>>
>>
>> ...I don't know if they get returned to global free list or if they
>> just lie there on Collectors private list though.
>>
>>
>> Regards,
>>
>> Peter Levart
>
More information about the hotspot-runtime-dev
mailing list