monitors explosion

Y. Srinivas Ramakrishna y.s.ramakrishna at oracle.com
Wed May 25 14:01:24 PDT 2011


Hi Peter --

Which version of the JDK elicits this behaviour? The monitor leakage
sounds vaguely like a problem that i encountered before and which
was fixed a while back, but this may well be a new problem,
and i can't dig up the CR for the problem that I recall in
this code, or where the fix went to. (Karen might recall
the details because she pushed the fix/workaround for that one.)

-- ramki

On 5/25/2011 7:21 AM, Peter Levart wrote:
> Hello hotspot developers,
>
> I'm posting here because I think this might interest you.
>
> Here's a summary:
>
> Our application has a boot-up sequence that spawns about 80 threads at the beginning where each of them executes a long running query into the database that preloads caches. We are using Oracle Coherence for that. At some point in this boot-up sequence where all 80 threads are busy inserting data into cache (each thread inserts into it's own cache instance - basically a ConcurrentHashMap with features that allow replication to oher nodes, etc...), the following happens to "deflate idle monitors" time (-XX:+TraceSafepointCleanupTime):
>
> [deflating idle monitors, 0.0004058 secs]
> [deflating idle monitors, 0.0003921 secs]
> [deflating idle monitors, 0.0005102 secs]
> [deflating idle monitors, 0.0011647 secs]
> [deflating idle monitors, 0.0004439 secs]
> [deflating idle monitors, 0.0003865 secs]
> [deflating idle monitors, 0.0003829 secs]
> [deflating idle monitors, 0.0003838 secs]
> [deflating idle monitors, 0.0003962 secs]
> [deflating idle monitors, 0.0003758 secs]
> [deflating idle monitors, 0.0003691 secs]
> [deflating idle monitors, 0.0030027 secs]
> [deflating idle monitors, 0.0032613 secs]
> [deflating idle monitors, 0.0095863 secs]
> [deflating idle monitors, 0.0116110 secs]
> [deflating idle monitors, 0.1404215 secs]
> [deflating idle monitors, 0.1260821 secs]
> [deflating idle monitors, 0.1520614 secs]
> [deflating idle monitors, 0.2608265 secs]
> [deflating idle monitors, 0.2195979 secs]
> [deflating idle monitors, 0.4246853 secs]
> [deflating idle monitors, 0.6519136 secs]
> [deflating idle monitors, 0.5262590 secs]
> [deflating idle monitors, 0.4404068 secs]
> [deflating idle monitors, 0.4404227 secs]
> [deflating idle monitors, 0.4404056 secs]
> [deflating idle monitors, 0.4403642 secs]
> [deflating idle monitors, 0.5276827 secs]
> [deflating idle monitors, 0.4405319 secs]
> [deflating idle monitors, 0.4406389 secs]
> [deflating idle monitors, 0.4403989 secs]
> [deflating idle monitors, 0.4403940 secs]
> [deflating idle monitors, 0.4404595 secs]
> [deflating idle monitors, 0.4403539 secs]
> ...
> ... and stabilizes at the end of boot-up sequence to:
> ...
> [deflating idle monitors, 0.7225519 secs]
> [deflating idle monitors, 0.7225144 secs]
> [deflating idle monitors, 0.7225794 secs]
> [deflating idle monitors, 0.7225868 secs]
> [deflating idle monitors, 0.7225747 secs]
> [deflating idle monitors, 0.7225810 secs]
> [deflating idle monitors, 0.7227106 secs]
> [deflating idle monitors, 0.7227789 secs]
> [deflating idle monitors, 0.7226465 secs]
> [deflating idle monitors, 0.7226106 secs]
>
> At the end of boot-up the system is idle and safepoints are not observed. If I then issue a HTTP request on the embeded Jetty container that executes some platform JNI operations (Sockets/NIO/File IO) it triggers 4 safepoints each taking 0.7 sec and request completes in roughly 3 secs.
>
> jstat -J-Djstat.showUnsupported=true -snap 27205 | grep sun.rt._sync_
>
> then prints the following:
>
> sun.rt._sync_ContendedLockAttempts=40303
> sun.rt._sync_Deflations=19882562
> sun.rt._sync_EmptyNotifications=0
> sun.rt._sync_FailedSpins=0
> sun.rt._sync_FutileWakeups=1912
> sun.rt._sync_Inflations=19859801
> sun.rt._sync_MonExtant=2166144
> sun.rt._sync_MonInCirculation=0
> sun.rt._sync_MonScavenged=0
> sun.rt._sync_Notifications=23310881
> sun.rt._sync_Parks=23333919
> sun.rt._sync_PrivateA=0
> sun.rt._sync_PrivateB=0
> sun.rt._sync_SlowEnter=0
> sun.rt._sync_SlowExit=0
> sun.rt._sync_SlowNotify=0
> sun.rt._sync_SlowNotifyAll=0
> sun.rt._sync_SuccessfulSpins=0
>
>
> ...the same boot-up sequence observed via the 2s samples of _sync_MonExtant variable:
>
>
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=640
> sun.rt._sync_MonExtant=768
> sun.rt._sync_MonExtant=768
> sun.rt._sync_MonExtant=768
> sun.rt._sync_MonExtant=768
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=3200
> sun.rt._sync_MonExtant=6144
> sun.rt._sync_MonExtant=6400
> sun.rt._sync_MonExtant=19456
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=21632
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=286080
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=308608
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=546432
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=901760
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1188736
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1572480
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> sun.rt._sync_MonExtant=1770112
> ...
> ...
> sun.rt._sync_MonExtant=2166144
>
>
> It seems that somehow monitors accumulate on some thread's private free lists and those threads don't die so omFlush is never called for them (they live happily for ever after in a thread pool).
>
> We spawn those 80 inserting threads via FixedThreadPoolExecutor and after all tasks are finished we shutdown the executor, so those 80 threads are gone, but...
>
> ...is it possible for a thread to call omRelease for a monitor that was not omAlloc-ated by this same thread? In this case object monitors, allocated by one thread would get accumulated on a free list of another thread and if this thread is a long-running thread those monitors would stay there forever...
>
> Here's a test program that on my machine (i3 CPU, Linux) accumulates arround 1.6M monitors:
>
>
> package monitors;
>
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.SynchronousQueue;
>
> /**
>   * @author Peter Levart
>   * @created 5/25/11 @ 3:35 PM
>   */
> public class MonitorsExplosion
> {
>      BlockingQueue<Object>  queue = new SynchronousQueue<Object>();
>
>      class Allocator extends Thread
>      {
>          final int allocations;
>
>          Allocator(int allocations)
>          {
>              this.allocations = allocations;
>          }
>
>          @Override
>          public void run()
>          {
>              for (int i = 0; i<  allocations; i++)
>              {
>                  Object o = new Object();
>                  synchronized (o)
>                  {
>                      try
>                      {
>                          queue.put(o);
>                          Thread.yield();
>                      }
>                      catch (InterruptedException e)
>                      {
>                          // ignore
>                      }
>                  }
>              }
>          }
>      }
>
>      class Collector extends Thread
>      {
>          @Override
>          public void run()
>          {
>              int hashSum = 0;
>
>              for (; ; )
>              {
>                  try
>                  {
>                      Object o = queue.take();
>                      synchronized (o)
>                      {
>                          hashSum += o.hashCode();
>                      }
>                  }
>                  catch (InterruptedException e)
>                  {
>                      // ignore
>                  }
>              }
>          }
>      }
>
>      void detonate() throws InterruptedException
>      {
>          Allocator[] allocators = new Allocator[8];
>          for (int i = 0; i<  allocators.length; i++)
>          {
>              Allocator allocator = new Allocator(1000000);
>              allocator.start();
>              allocators[i] = allocator;
>          }
>
>          System.out.println("Allocators started");
>
>          Collector collector = new Collector();
>          collector.start();
>
>          System.out.println("Collector started");
>
>          for (Allocator allocator : allocators)
>              allocator.join();
>
>          System.out.println("Allocators finished");
>
>          collector.join(); // never ending...
>
>          System.out.println("Collector finished");
>      }
>
>      public static void main(String[] args) throws InterruptedException
>      {
>          new MonitorsExplosion().detonate();
>      }
> }
>
>
> ...I don't know if they get returned to global free list or if they just lie there on Collectors private list though.
>
>
> Regards,
>
> Peter Levart



More information about the hotspot-runtime-dev mailing list