<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-15"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    David, I missed one of the longer pauses that you'd specifically
    drawn attention to:-<br>
    <br>
    <blockquote cite="mid:4E5C88D1.9040401@oracle.com" type="cite"> On
      8/29/2011 2:25 AM, David Tavoularis wrote:
      <blockquote cite="mid:op.v0yxs0vga5r2ku@nb07-spt1.mycom-int.fr"
        type="cite">
        <meta http-equiv="Content-Type" content="text/html;
          charset=ISO-8859-15">
        <style type="text/css">
<!--
body
        {font-family:'Calibri';
        font-size:18px}
--></style>
        <div><strong>1. Here is the first pattern : a <u>61-second
              pause</u>, but I don't see any suspicious message in GC
            logs:</strong><br>
          <font face="'Consolas',monospace">...<small></small><small><br>
            </small><small>2011-08-24T10:25:07.776+0100: 44242.537: [GC
              44301.853: [ParNew</small><small><br>
            </small><small>Desired survivor size 53673984 bytes, new
              threshold 1 (max 4)</small><small><br>
            </small><small>- age 1: 99505080 bytes, 99505080 total</small><small><br>
            </small><small>: 943744K->104832K(943744K), 0.2010508
              secs] 21542906K->20852742K(50226816K), 0.2022636 secs]
              <font color="#ff0000"><strong>[Times: user=5.67 sys=0.02,
                  real=59.52 secs]</strong></font></small><small><br>
            </small></font></div>
      </blockquote>
    </blockquote>
    <br>
    If you look at the timestamps above, the GC event starts off at
    44242.537 seconds, but then the GC itself does not<br>
    commence until 44301.853 seconds, i.e. a full 59.32 seconds later.
    So the pause is associated not with<br>
    GC work itself (which is correctly reported as 202 ms), but rather
    with a preamble to the GC, perhaps<br>
    with bringing threads to a safepoint, I am guessing. Once again
    -XX:+PrintSafepointStatistics (which<br>
    i mentioned in previous email wrt the 20 s pause in the middle of
    noweher) would likely provide<br>
    some clues. I have heard apocryphal stories of -XX:+UseMembar having
    worked to get rid of<br>
    overly long safepointing pauses,. and I have heard
    -XX:-UseBiasedLocking for pauses associated<br>
    with bulk bias revocations. But, without +PrintSafepointStatistics
    data to draw inferences<br>
    from, those incantations would just constitute superstitious
    mumbo-jumbo.<br>
    <br>
    -- ramki<br>
    <br>
    <blockquote cite="mid:4E5C88D1.9040401@oracle.com" type="cite">
      <blockquote cite="mid:op.v0yxs0vga5r2ku@nb07-spt1.mycom-int.fr"
        type="cite">
        <div><font face="'Consolas',monospace"><small> </small><small>Heap
              after GC invocations=1188 (full 12):</small><small><br>
            </small><small>par new generation total 943744K, used
              104832K [0xfffffff353c00000, 0xfffffff393c00000,
              0xfffffff393c00000)</small><small><br>
            </small><small>eden space 838912K, 0% used
              [0xfffffff353c00000, 0xfffffff353c00000,
              0xfffffff386f40000)</small><small><br>
            </small><small>from space 104832K, 100% used
              [0xfffffff38d5a0000, 0xfffffff393c00000,
              0xfffffff393c00000)</small><small><br>
            </small><small>to space 104832K, 0% used
              [0xfffffff386f40000, 0xfffffff386f40000,
              0xfffffff38d5a0000)</small><small><br>
            </small><small>concurrent mark-sweep generation total
              49283072K, used 20747910K [0xfffffff393c00000,
              0xffffffff53c00000, 0xffffffff53c00000)</small><small><br>
            </small><small>concurrent-mark-sweep perm gen total 524288K,
              used 42905K [0xffffffff53c00000, 0xffffffff73c00000,
              0xffffffff73c00000)</small><small><br>
            </small><small>}</small><small><br>
            </small><small><strong><font color="#ff0000">Total time for
                  which application threads were stopped: 61.5576519
                  seconds</font></strong></small><small><br>
            </small><small>Application time: 0.0245838 seconds</small><small><br>
            </small><small>Total time for which application threads were
              stopped: 9.8331189 seconds</small><small><br>
            </small><small>Application time: 0.0012626 seconds</small><small><br>
            </small><small>Total time for which application threads were
              stopped: 0.0090404 seconds</small><small><br>
            </small><small>Application time: 0.0008943 seconds</small><small><br>
            </small><small>Total time for which application threads were
              stopped: 0.0020415 seconds</small><small><br>
            </small><small>Application time: 0.0008181 seconds</small><small><br>
            </small><small>Total time for which application threads were
              stopped: 0.2338605 seconds</small><small><br>
            </small><small>Application time: 0.0018822 seconds</small></font><br>
          <br>
          The only suspicious thing is <font
            face="'Consolas',monospace"><small>"[Times: user=5.67
              sys=0.02, real=59.52 secs]"</small></font>, which means
          that the "real" duration is a lot higher than "user" CPU time.<br>
          Because "sys" duration is low, it also means that the server
          is not swapping.<br>
        </div>
        <div>What could explain this 61 seconds pause ?  <br>
        </div>
      </blockquote>
    </blockquote>
  </body>
</html>