monitors explosion

David Holmes David.Holmes at oracle.com
Wed May 25 18:18:42 PDT 2011


Is there some specific reason that taking the hashCode of a locked 
object forces inflation here? We may need to look at why inflation is 
happening and whether there is something the application and/or 
libraries can do to mitigate it. (As well as considering what happens in 
the VM of course).

David

Karen Kinnear said the following on 05/26/11 09:50:
> Peter,
> 
> Yes, I've been running your test case with some existing flags we have
> to help with this kind of problem. Thank you for the test case - that 
> really
> helps.
> 
> There is a flag -XX:MonitorBound=<int> which can limit the number of
> monitors that a thread can hang on to. For the test case I randomly
> tried -XX:MonitorBound=80 and it reduced the final deflating idel monitors
> reported time from
> 0.624620 secs to
> 0.0007840 secs
> 
> The number can be tuned for your specific application. If you set it
> lower, it reduces the number of monitors tied-up, and therefore
> reduces the amount of time spent deflating idle monitors. The trade-off
> is that it increases the frequency of GC in order to free the monitors.
> So you have to tune the number to your specific application needs,
> so as to minimize overall pause time. A recommended place to start would
> be with # threads * # locks per thread at a steady state, where # locks
> is actually number of allocated monitors, which reflects contended locks
> and locks for which you use wait/notify.
> 
> The flag is available in JDK7. I believe it was also backported to JDK6u19.
> 
> hope this helps,
> Karen
> 
> On May 25, 2011, at 5:01 PM, Y. Srinivas Ramakrishna wrote:
> 
>> Hi Peter --
>>
>> Which version of the JDK elicits this behaviour? The monitor leakage
>> sounds vaguely like a problem that i encountered before and which
>> was fixed a while back, but this may well be a new problem,
>> and i can't dig up the CR for the problem that I recall in
>> this code, or where the fix went to. (Karen might recall
>> the details because she pushed the fix/workaround for that one.)
>>
>> -- ramki
>>
>> On 5/25/2011 7:21 AM, Peter Levart wrote:
>>> Hello hotspot developers,
>>>
>>> I'm posting here because I think this might interest you.
>>>
>>> Here's a summary:
>>>
>>> Our application has a boot-up sequence that spawns about 80 threads 
>>> at the beginning where each of them executes a long running query 
>>> into the database that preloads caches. We are using Oracle Coherence 
>>> for that. At some point in this boot-up sequence where all 80 threads 
>>> are busy inserting data into cache (each thread inserts into it's own 
>>> cache instance - basically a ConcurrentHashMap with features that 
>>> allow replication to oher nodes, etc...), the following happens to 
>>> "deflate idle monitors" time (-XX:+TraceSafepointCleanupTime):
>>>
>>> [deflating idle monitors, 0.0004058 secs]
>>> [deflating idle monitors, 0.0003921 secs]
>>> [deflating idle monitors, 0.0005102 secs]
>>> [deflating idle monitors, 0.0011647 secs]
>>> [deflating idle monitors, 0.0004439 secs]
>>> [deflating idle monitors, 0.0003865 secs]
>>> [deflating idle monitors, 0.0003829 secs]
>>> [deflating idle monitors, 0.0003838 secs]
>>> [deflating idle monitors, 0.0003962 secs]
>>> [deflating idle monitors, 0.0003758 secs]
>>> [deflating idle monitors, 0.0003691 secs]
>>> [deflating idle monitors, 0.0030027 secs]
>>> [deflating idle monitors, 0.0032613 secs]
>>> [deflating idle monitors, 0.0095863 secs]
>>> [deflating idle monitors, 0.0116110 secs]
>>> [deflating idle monitors, 0.1404215 secs]
>>> [deflating idle monitors, 0.1260821 secs]
>>> [deflating idle monitors, 0.1520614 secs]
>>> [deflating idle monitors, 0.2608265 secs]
>>> [deflating idle monitors, 0.2195979 secs]
>>> [deflating idle monitors, 0.4246853 secs]
>>> [deflating idle monitors, 0.6519136 secs]
>>> [deflating idle monitors, 0.5262590 secs]
>>> [deflating idle monitors, 0.4404068 secs]
>>> [deflating idle monitors, 0.4404227 secs]
>>> [deflating idle monitors, 0.4404056 secs]
>>> [deflating idle monitors, 0.4403642 secs]
>>> [deflating idle monitors, 0.5276827 secs]
>>> [deflating idle monitors, 0.4405319 secs]
>>> [deflating idle monitors, 0.4406389 secs]
>>> [deflating idle monitors, 0.4403989 secs]
>>> [deflating idle monitors, 0.4403940 secs]
>>> [deflating idle monitors, 0.4404595 secs]
>>> [deflating idle monitors, 0.4403539 secs]
>>> ...
>>> ... and stabilizes at the end of boot-up sequence to:
>>> ...
>>> [deflating idle monitors, 0.7225519 secs]
>>> [deflating idle monitors, 0.7225144 secs]
>>> [deflating idle monitors, 0.7225794 secs]
>>> [deflating idle monitors, 0.7225868 secs]
>>> [deflating idle monitors, 0.7225747 secs]
>>> [deflating idle monitors, 0.7225810 secs]
>>> [deflating idle monitors, 0.7227106 secs]
>>> [deflating idle monitors, 0.7227789 secs]
>>> [deflating idle monitors, 0.7226465 secs]
>>> [deflating idle monitors, 0.7226106 secs]
>>>
>>> At the end of boot-up the system is idle and safepoints are not 
>>> observed. If I then issue a HTTP request on the embeded Jetty 
>>> container that executes some platform JNI operations 
>>> (Sockets/NIO/File IO) it triggers 4 safepoints each taking 0.7 sec 
>>> and request completes in roughly 3 secs.
>>>
>>> jstat -J-Djstat.showUnsupported=true -snap 27205 | grep sun.rt._sync_
>>>
>>> then prints the following:
>>>
>>> sun.rt._sync_ContendedLockAttempts=40303
>>> sun.rt._sync_Deflations=19882562
>>> sun.rt._sync_EmptyNotifications=0
>>> sun.rt._sync_FailedSpins=0
>>> sun.rt._sync_FutileWakeups=1912
>>> sun.rt._sync_Inflations=19859801
>>> sun.rt._sync_MonExtant=2166144
>>> sun.rt._sync_MonInCirculation=0
>>> sun.rt._sync_MonScavenged=0
>>> sun.rt._sync_Notifications=23310881
>>> sun.rt._sync_Parks=23333919
>>> sun.rt._sync_PrivateA=0
>>> sun.rt._sync_PrivateB=0
>>> sun.rt._sync_SlowEnter=0
>>> sun.rt._sync_SlowExit=0
>>> sun.rt._sync_SlowNotify=0
>>> sun.rt._sync_SlowNotifyAll=0
>>> sun.rt._sync_SuccessfulSpins=0
>>>
>>>
>>> ...the same boot-up sequence observed via the 2s samples of 
>>> _sync_MonExtant variable:
>>>
>>>
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=640
>>> sun.rt._sync_MonExtant=768
>>> sun.rt._sync_MonExtant=768
>>> sun.rt._sync_MonExtant=768
>>> sun.rt._sync_MonExtant=768
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=3200
>>> sun.rt._sync_MonExtant=6144
>>> sun.rt._sync_MonExtant=6400
>>> sun.rt._sync_MonExtant=19456
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=21632
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=286080
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=308608
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=546432
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=901760
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1188736
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1572480
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> sun.rt._sync_MonExtant=1770112
>>> ...
>>> ...
>>> sun.rt._sync_MonExtant=2166144
>>>
>>>
>>> It seems that somehow monitors accumulate on some thread's private 
>>> free lists and those threads don't die so omFlush is never called for 
>>> them (they live happily for ever after in a thread pool).
>>>
>>> We spawn those 80 inserting threads via FixedThreadPoolExecutor and 
>>> after all tasks are finished we shutdown the executor, so those 80 
>>> threads are gone, but...
>>>
>>> ...is it possible for a thread to call omRelease for a monitor that 
>>> was not omAlloc-ated by this same thread? In this case object 
>>> monitors, allocated by one thread would get accumulated on a free 
>>> list of another thread and if this thread is a long-running thread 
>>> those monitors would stay there forever...
>>>
>>> Here's a test program that on my machine (i3 CPU, Linux) accumulates 
>>> arround 1.6M monitors:
>>>
>>>
>>> package monitors;
>>>
>>> import java.util.concurrent.BlockingQueue;
>>> import java.util.concurrent.SynchronousQueue;
>>>
>>> /**
>>>  * @author Peter Levart
>>>  * @created 5/25/11 @ 3:35 PM
>>>  */
>>> public class MonitorsExplosion
>>> {
>>>     BlockingQueue<Object>  queue = new SynchronousQueue<Object>();
>>>
>>>     class Allocator extends Thread
>>>     {
>>>         final int allocations;
>>>
>>>         Allocator(int allocations)
>>>         {
>>>             this.allocations = allocations;
>>>         }
>>>
>>>         @Override
>>>         public void run()
>>>         {
>>>             for (int i = 0; i<  allocations; i++)
>>>             {
>>>                 Object o = new Object();
>>>                 synchronized (o)
>>>                 {
>>>                     try
>>>                     {
>>>                         queue.put(o);
>>>                         Thread.yield();
>>>                     }
>>>                     catch (InterruptedException e)
>>>                     {
>>>                         // ignore
>>>                     }
>>>                 }
>>>             }
>>>         }
>>>     }
>>>
>>>     class Collector extends Thread
>>>     {
>>>         @Override
>>>         public void run()
>>>         {
>>>             int hashSum = 0;
>>>
>>>             for (; ; )
>>>             {
>>>                 try
>>>                 {
>>>                     Object o = queue.take();
>>>                     synchronized (o)
>>>                     {
>>>                         hashSum += o.hashCode();
>>>                     }
>>>                 }
>>>                 catch (InterruptedException e)
>>>                 {
>>>                     // ignore
>>>                 }
>>>             }
>>>         }
>>>     }
>>>
>>>     void detonate() throws InterruptedException
>>>     {
>>>         Allocator[] allocators = new Allocator[8];
>>>         for (int i = 0; i<  allocators.length; i++)
>>>         {
>>>             Allocator allocator = new Allocator(1000000);
>>>             allocator.start();
>>>             allocators[i] = allocator;
>>>         }
>>>
>>>         System.out.println("Allocators started");
>>>
>>>         Collector collector = new Collector();
>>>         collector.start();
>>>
>>>         System.out.println("Collector started");
>>>
>>>         for (Allocator allocator : allocators)
>>>             allocator.join();
>>>
>>>         System.out.println("Allocators finished");
>>>
>>>         collector.join(); // never ending...
>>>
>>>         System.out.println("Collector finished");
>>>     }
>>>
>>>     public static void main(String[] args) throws InterruptedException
>>>     {
>>>         new MonitorsExplosion().detonate();
>>>     }
>>> }
>>>
>>>
>>> ...I don't know if they get returned to global free list or if they 
>>> just lie there on Collectors private list though.
>>>
>>>
>>> Regards,
>>>
>>> Peter Levart
>>
> 


More information about the hotspot-runtime-dev mailing list