monitors explosion
Y. Srinivas Ramakrishna
y.s.ramakrishna at oracle.com
Wed May 25 15:02:50 PDT 2011
This is the CR that i was thinking of:-
6852873 Increase in delta between application stopped time and ParNew GC time over application lifetime
I can't tell much on how or if the fix/workaround got
into specific releases, but i am guessing it must be in
the latest 6uXX public release out there (6u25?).
Also relevant might be this CR:
6988353 refactor sync subsystem
By the way, that was such a nice bug report you filed
(we just need the JVM version info from you that's all);
i wish all of our customers were like you. Thanks!
-- ramki
On 5/25/2011 2:01 PM, Y. Srinivas Ramakrishna wrote:
> Hi Peter --
>
> Which version of the JDK elicits this behaviour? The monitor leakage
> sounds vaguely like a problem that i encountered before and which
> was fixed a while back, but this may well be a new problem,
> and i can't dig up the CR for the problem that I recall in
> this code, or where the fix went to. (Karen might recall
> the details because she pushed the fix/workaround for that one.)
>
> -- ramki
>
> On 5/25/2011 7:21 AM, Peter Levart wrote:
>> Hello hotspot developers,
>>
>> I'm posting here because I think this might interest you.
>>
>> Here's a summary:
>>
>> Our application has a boot-up sequence that spawns about 80 threads at the beginning where each of
>> them executes a long running query into the database that preloads caches. We are using Oracle
>> Coherence for that. At some point in this boot-up sequence where all 80 threads are busy inserting
>> data into cache (each thread inserts into it's own cache instance - basically a ConcurrentHashMap
>> with features that allow replication to oher nodes, etc...), the following happens to "deflate
>> idle monitors" time (-XX:+TraceSafepointCleanupTime):
>>
>> [deflating idle monitors, 0.0004058 secs]
>> [deflating idle monitors, 0.0003921 secs]
>> [deflating idle monitors, 0.0005102 secs]
>> [deflating idle monitors, 0.0011647 secs]
>> [deflating idle monitors, 0.0004439 secs]
>> [deflating idle monitors, 0.0003865 secs]
>> [deflating idle monitors, 0.0003829 secs]
>> [deflating idle monitors, 0.0003838 secs]
>> [deflating idle monitors, 0.0003962 secs]
>> [deflating idle monitors, 0.0003758 secs]
>> [deflating idle monitors, 0.0003691 secs]
>> [deflating idle monitors, 0.0030027 secs]
>> [deflating idle monitors, 0.0032613 secs]
>> [deflating idle monitors, 0.0095863 secs]
>> [deflating idle monitors, 0.0116110 secs]
>> [deflating idle monitors, 0.1404215 secs]
>> [deflating idle monitors, 0.1260821 secs]
>> [deflating idle monitors, 0.1520614 secs]
>> [deflating idle monitors, 0.2608265 secs]
>> [deflating idle monitors, 0.2195979 secs]
>> [deflating idle monitors, 0.4246853 secs]
>> [deflating idle monitors, 0.6519136 secs]
>> [deflating idle monitors, 0.5262590 secs]
>> [deflating idle monitors, 0.4404068 secs]
>> [deflating idle monitors, 0.4404227 secs]
>> [deflating idle monitors, 0.4404056 secs]
>> [deflating idle monitors, 0.4403642 secs]
>> [deflating idle monitors, 0.5276827 secs]
>> [deflating idle monitors, 0.4405319 secs]
>> [deflating idle monitors, 0.4406389 secs]
>> [deflating idle monitors, 0.4403989 secs]
>> [deflating idle monitors, 0.4403940 secs]
>> [deflating idle monitors, 0.4404595 secs]
>> [deflating idle monitors, 0.4403539 secs]
>> ...
>> ... and stabilizes at the end of boot-up sequence to:
>> ...
>> [deflating idle monitors, 0.7225519 secs]
>> [deflating idle monitors, 0.7225144 secs]
>> [deflating idle monitors, 0.7225794 secs]
>> [deflating idle monitors, 0.7225868 secs]
>> [deflating idle monitors, 0.7225747 secs]
>> [deflating idle monitors, 0.7225810 secs]
>> [deflating idle monitors, 0.7227106 secs]
>> [deflating idle monitors, 0.7227789 secs]
>> [deflating idle monitors, 0.7226465 secs]
>> [deflating idle monitors, 0.7226106 secs]
>>
>> At the end of boot-up the system is idle and safepoints are not observed. If I then issue a HTTP
>> request on the embeded Jetty container that executes some platform JNI operations
>> (Sockets/NIO/File IO) it triggers 4 safepoints each taking 0.7 sec and request completes in
>> roughly 3 secs.
>>
>> jstat -J-Djstat.showUnsupported=true -snap 27205 | grep sun.rt._sync_
>>
>> then prints the following:
>>
>> sun.rt._sync_ContendedLockAttempts=40303
>> sun.rt._sync_Deflations=19882562
>> sun.rt._sync_EmptyNotifications=0
>> sun.rt._sync_FailedSpins=0
>> sun.rt._sync_FutileWakeups=1912
>> sun.rt._sync_Inflations=19859801
>> sun.rt._sync_MonExtant=2166144
>> sun.rt._sync_MonInCirculation=0
>> sun.rt._sync_MonScavenged=0
>> sun.rt._sync_Notifications=23310881
>> sun.rt._sync_Parks=23333919
>> sun.rt._sync_PrivateA=0
>> sun.rt._sync_PrivateB=0
>> sun.rt._sync_SlowEnter=0
>> sun.rt._sync_SlowExit=0
>> sun.rt._sync_SlowNotify=0
>> sun.rt._sync_SlowNotifyAll=0
>> sun.rt._sync_SuccessfulSpins=0
>>
>>
>> ...the same boot-up sequence observed via the 2s samples of _sync_MonExtant variable:
>>
>>
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=640
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=768
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=3200
>> sun.rt._sync_MonExtant=6144
>> sun.rt._sync_MonExtant=6400
>> sun.rt._sync_MonExtant=19456
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=21632
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=286080
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=308608
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=546432
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=901760
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1188736
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1572480
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> sun.rt._sync_MonExtant=1770112
>> ...
>> ...
>> sun.rt._sync_MonExtant=2166144
>>
>>
>> It seems that somehow monitors accumulate on some thread's private free lists and those threads
>> don't die so omFlush is never called for them (they live happily for ever after in a thread pool).
>>
>> We spawn those 80 inserting threads via FixedThreadPoolExecutor and after all tasks are finished
>> we shutdown the executor, so those 80 threads are gone, but...
>>
>> ...is it possible for a thread to call omRelease for a monitor that was not omAlloc-ated by this
>> same thread? In this case object monitors, allocated by one thread would get accumulated on a free
>> list of another thread and if this thread is a long-running thread those monitors would stay there
>> forever...
>>
>> Here's a test program that on my machine (i3 CPU, Linux) accumulates arround 1.6M monitors:
>>
>>
>> package monitors;
>>
>> import java.util.concurrent.BlockingQueue;
>> import java.util.concurrent.SynchronousQueue;
>>
>> /**
>> * @author Peter Levart
>> * @created 5/25/11 @ 3:35 PM
>> */
>> public class MonitorsExplosion
>> {
>> BlockingQueue<Object> queue = new SynchronousQueue<Object>();
>>
>> class Allocator extends Thread
>> {
>> final int allocations;
>>
>> Allocator(int allocations)
>> {
>> this.allocations = allocations;
>> }
>>
>> @Override
>> public void run()
>> {
>> for (int i = 0; i< allocations; i++)
>> {
>> Object o = new Object();
>> synchronized (o)
>> {
>> try
>> {
>> queue.put(o);
>> Thread.yield();
>> }
>> catch (InterruptedException e)
>> {
>> // ignore
>> }
>> }
>> }
>> }
>> }
>>
>> class Collector extends Thread
>> {
>> @Override
>> public void run()
>> {
>> int hashSum = 0;
>>
>> for (; ; )
>> {
>> try
>> {
>> Object o = queue.take();
>> synchronized (o)
>> {
>> hashSum += o.hashCode();
>> }
>> }
>> catch (InterruptedException e)
>> {
>> // ignore
>> }
>> }
>> }
>> }
>>
>> void detonate() throws InterruptedException
>> {
>> Allocator[] allocators = new Allocator[8];
>> for (int i = 0; i< allocators.length; i++)
>> {
>> Allocator allocator = new Allocator(1000000);
>> allocator.start();
>> allocators[i] = allocator;
>> }
>>
>> System.out.println("Allocators started");
>>
>> Collector collector = new Collector();
>> collector.start();
>>
>> System.out.println("Collector started");
>>
>> for (Allocator allocator : allocators)
>> allocator.join();
>>
>> System.out.println("Allocators finished");
>>
>> collector.join(); // never ending...
>>
>> System.out.println("Collector finished");
>> }
>>
>> public static void main(String[] args) throws InterruptedException
>> {
>> new MonitorsExplosion().detonate();
>> }
>> }
>>
>>
>> ...I don't know if they get returned to global free list or if they just lie there on Collectors
>> private list though.
>>
>>
>> Regards,
>>
>> Peter Levart
>
More information about the hotspot-runtime-dev
mailing list