<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    A long shot, but...<br>
    <br>
    If this is a recent version of Linux, perhaps you're running the
    Completely<br>
    Fair Scheduler and this is an artifact.  If so, try reverting to the
    old scheduler.<br>
    <br>
    Paul<br>
    <br>
    On 5/5/11 3:11 PM, Shane Cox wrote:
    <blockquote
      cite="mid:BANLkTikTOhmFNQeViDr0Lr=3wMAhfjCYvA@mail.gmail.com"
      type="cite">Jon,<br>
      <br>
      Thanks for the suggestion.  Interestingly enough, adding
      -XX:+AlwaysPreTouch increased the frequency of the problem by 2x. 
      Although this didn't solve the problem, I think it offers a clue.<br>
      <br>
      Touching all of the memory pages on startup will leave fewer pages
      on the free list, no?  Could fewer pages on the free list trigger
      an OS behavior that interferes with an activity in the GC
      prologue?  All that comes to mind is the linux page scanner
      (recycling pages or pruning the page cache) - not sure how this
      could interfere with the GC prologue.<br>
      <br>
      According to the free command, swap usage remained 0 throughout
      the test ... so I'm not suspecting swap at this time.<br>
      <br>
      FYI, I ran the application/test on a Solaris 10 host and was
      unable to recreate the problem.  So the problem appears to be
      linux-specific (RHEL).<br>
      <br>
      <br>
      Any additional thoughts would be appreciated.<br>
      <br>
      Shane<br>
      <br>
      <br>
      <div class="gmail_quote">On Wed, Apr 27, 2011 at 12:34 AM, Jon
        Masamitsu <span dir="ltr"><<a moz-do-not-send="true"
            href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>></span>
        wrote:<br>
        <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
          0.8ex; border-left: 1px solid rgb(204, 204, 204);
          padding-left: 1ex;">Shane,<br>
          <br>
          Have you tried running with -XX:+AlwaysPreTouch ?  We've
          occasionally<br>
          seen  intermittent<br>
          long pauses as the heap grows into newly committed pages.
           This flag<br>
          causes pages<br>
          to be touched as they are committed.  I don't know how this
          fits into<br>
          Ramki's<br>
          observation but  it might be worth a shot.<br>
          <font color="#888888"><br>
            Jon<br>
          </font>
          <div>
            <div class="h5"><br>
              <br>
              On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote:<br>
              > Well-spotted; it's a version of the same problem as
              near as<br>
              > i can tell. Please make sure to include a sizable GC
              log with<br>
              > your bug report (starting from VM start-up, so we can
              see if<br>
              > there is any clue in when the problem first starts
              during<br>
              > the life of the VM).<br>
              ><br>
              > thanks.<br>
              > -- ramki<br>
              ><br>
              > On 04/26/11 11:29, Shane Cox wrote:<br>
              >> Below is an example from a Remark.  Of the total
              1.3 seconds of elapsed<br>
              >> time, 1.2 seconds is found between the first two
              timestamps.  However,<br>
              >> I'm not savvy enough to know whether this is the
              same problem or simply<br>
              >> the result of a long scavenge that occurs as part
              of the Remark.  Is<br>
              >> there any way to tell?<br>
              >><br>
              >> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG
              occupancy: 712500 K<br>
              >> (943744 K)]9467.353: [Rescan (parallel) ,
              0.0106370 secs]9467.374: [weak<br>
              >> refs processing, 0.0159250 secs]9467.390: [class
              unloading, 0.0180420<br>
              >> secs]9467.408: [scrub symbol&  string tables,
              0.0458500 secs] [1<br>
              >> CMS-remark: 12520949K(24117248K)]
              13233450K(25060992K), 0.1052950 secs]<br>
              >> [Times: user=0.13 sys=0.01, real=1.32 secs]<br>
              >><br>
              >><br>
              >> On Tue, Apr 26, 2011 at 2:17 PM, Y. S.
              Ramakrishna<br>
              >> <<a moz-do-not-send="true"
                href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a><mailto:<a
                moz-do-not-send="true"
                href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>>>
               wrote:<br>
              >><br>
              >>      I had a quick look and all i could find was
              the GC prologue<br>
              >>      code (although i didn't look all that
              carefully).<br>
              >>      Bascially, GC is invoked, it prints this
              timestamp,<br>
              >>      does a bit of global book-keeping and some
              initialization,<br>
              >>      and then goes over each generation in the
              heap and<br>
              >>      says "i am going to do a collection, do
              whatever you need<br>
              >>      to do before i do the collection", and the
              generations each do a bit of<br>
              >>      book-keeping and any relevant
              initialization.<br>
              >><br>
              >>      The only thing i can see in the gc prologues
              other than a bit<br>
              >>      of lightweight book-keeping is some
              reporting code that could<br>
              >>      potentially be heavyweight. But you do not
              have any of those<br>
              >>      enabled in your option set, so there should
              not be anything<br>
              >>      obviously heavyweight going on.<br>
              >><br>
              >>      I'd suggest filing a bug under the category
              of<br>
              >>      jvm/hotspot/garbage_collector<br>
              >>      so someone in support can work with you to
              get this diagnosed...<br>
              >><br>
              >>      Three questions when you file the bug:<br>
              >>      (1) have you seen this start happening
              recently? (version?)<br>
              >>      (2) can you check if the longer pauses are
              "random" or do<br>
              >>         they always happen "during" CMS
              concurrent cycles or<br>
              >>         always outside of such cycles?<br>
              >>      (3) test set-up.<br>
              >><br>
              >>      -- ramki<br>
              >><br>
              >><br>
              >>      On 04/26/11 10:45, Y. S. Ramakrishna wrote:<br>
              >><br>
              >>          The pause is definitely in the
              beginning, before GC collection code<br>
              >>          itself runs; witness the timestamps:-<br>
              >><br>
              >>          2011-04-26T12:50:45.251-0400: 2120.686:
              [GC 2123.075: [ParNew:<br>
              >>          943744K->79296K(943744K), 0.0559560
              secs]<br>
              >>          4989811K->4187520K(25060992K),
              0.0563970 secs] [Times: user=0.31<br>
              >>          sys=0.09, real=2.45 secs]<br>
              >><br>
              >>          The first timestamp is 2120.686 and the
              next one is 2123.075, so<br>
              >>          we have<br>
              >>          about 2.389 s between those two. If you
              add to that the GC time<br>
              >>          of 0.056 s,<br>
              >>          you get 2.445 which is close enough to
              the 2.45 s reported.<br>
              >><br>
              >>          So we need to figure out what happens in
              the JVM between those two<br>
              >>          time-stamps and we can at least bound
              the culprit.<br>
              >><br>
              >>          -- ramki<br>
              >><br>
              >>          On 04/26/11 10:36, Shane Cox wrote:<br>
              >><br>
              >>              Periodically, our Java app on Linux
              experiences a long Minor<br>
              >>              GC pause that cannot be accounted
              for by the GC time in the<br>
              >>              log file.  Instead, the pause is
              captured as "real" (wall<br>
              >>              clock) time and is observable in our
              application logs.  An<br>
              >>              example is below.  The GC completed
              in 56ms, but the<br>
              >>              application was paused for 2.45
              seconds.<br>
              >><br>
              >>              2011-04-26T12:50:41.722-0400:
              2117.157: [GC 2117.157:<br>
              >>              [ParNew:
              943439K->104832K(943744K), 0.0481790 secs]<br>
              >>              4909998K->4086751K(25060992K),
              0.0485110 secs] [Times:<br>
              >>              user=0.34 sys=0.03, real=0.04 secs]<br>
              >>              2011-04-26T12:50:43.882-0400:
              2119.317: [GC 2119.317:<br>
              >>              [ParNew:
              942852K->104832K(943744K), 0.0738000 secs]<br>
              >>              4924772K->4150899K(25060992K),
              0.0740980 secs] [Times:<br>
              >>              user=0.45 sys=0.12, real=0.07 secs]<br>
              >>              2011-04-26T12:50:45.251-0400:
              2120.686: [GC 2123.075:<br>
              >>              [ParNew:
              943744K->79296K(943744K), 0.0559560 secs]<br>
              >>              4989811K->4187520K(25060992K),
              0.0563970 secs] [Times:<br>
              >>              user=0.31 sys=0.09, *real=2.45
              secs]*<br>
              >>              2011-04-26T12:50:48.493-0400:
              2123.928: [GC 2123.928:<br>
              >>              [ParNew:
              918208K->81040K(943744K), 0.0396620 secs]<br>
              >>              5026432K->4189265K(25060992K),
              0.0400030 secs] [Times:<br>
              >>              user=0.32 sys=0.00, real=0.04 secs]<br>
              >>              2011-04-26T12:50:51.010-0400:
              2126.445: [GC 2126.445:<br>
              >>              [ParNew:
              919952K->104832K(943744K), 0.0845070 secs]<br>
              >>              5028177K->4268050K(25060992K),
              0.0848300 secs] [Times:<br>
              >>              user=0.52 sys=0.11, real=0.09 secs]<br>
              >><br>
              >><br>
              >>              Initially I suspected swapping, but
              according to the free<br>
              >>              command, 0 bytes of swap are in use.<br>
              >>               >free -m<br>
              >>                          total       used      
              free     shared<br>
              >>               buffers     cached<br>
              >>              Mem:         32168      28118      
              4050          0<br>
              >>               824      12652<br>
              >>              -/+ buffers/cache:      14641    
               17527<br>
              >>              Swap:         8191          0      
              8191<br>
              >><br>
              >><br>
              >>              Next, I read about a problem
              relating to mprotect() on Linux<br>
              >>              that can be worked around with
              -XX:+UseMember.  I tried<br>
              >>              that, but I still see the same
              unexplainable pauses.<br>
              >><br>
              >><br>
              >>              Any suggestions/ideas?  We've
              upgraded to the latest JDK,<br>
              >>              but no luck.<br>
              >><br>
              >>              Thanks,<br>
              >>              Shane<br>
              >><br>
              >><br>
              >>              java version "1.6.0_25"<br>
              >>              Java(TM) SE Runtime Environment
              (build 1.6.0_25-b06)<br>
              >>              Java HotSpot(TM) 64-Bit Server VM
              (build 20.0-b11, mixed mode)<br>
              >><br>
              >><br>
              >>              Linux 2.6.18-128.el5 #1 SMP Wed Jan
              21 08:45:05 EST 2009<br>
              >>              x86_64 x86_64 x86_64 GNU/Linux<br>
              >><br>
              >><br>
              >>              -verbose:gc  -Xms24g -Xmx24g -Xmn1g
              -Xss256k<br>
              >>              -XX:PermSize=256m
              -XX:MaxPermSize=256m<br>
              >>              -XX:+PrintTenuringDistribution
              -XX:+UseConcMarkSweepGC<br>
              >>              -XX:+CMSParallelRemarkEnabled<br>
              >>            
               -XX:CMSInitiatingOccupancyFraction=70<br>
              >>              -XX:+CMSClassUnloadingEnabled
              -XX:+PrintGCDetails<br>
              >>              -XX:+PrintGCDateStamps
              -XX:+PrintHeapAtGC<br>
              >>              -XX:+HeapDumpOnOutOfMemoryError
              -XX:+UseCompressedStrings<br>
              >>              -XX:+UseMembar<br>
              >><br>
              >><br>
              >>            
 ------------------------------------------------------------------------<br>
              >><br>
              >>            
               _______________________________________________<br>
              >>              hotspot-gc-use mailing list<br>
              >>              <a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
              >>              <mailto:<a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>><br>
              >>              <a moz-do-not-send="true"
                href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
                target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
              >><br>
              >>        
               _______________________________________________<br>
              >>          hotspot-gc-use mailing list<br>
              >>          <a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
              >>          <mailto:<a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>><br>
              >>          <a moz-do-not-send="true"
                href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
                target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
              >><br>
              >><br>
              > _______________________________________________<br>
              > hotspot-gc-use mailing list<br>
              > <a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
              > <a moz-do-not-send="true"
                href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
                target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
              _______________________________________________<br>
              hotspot-gc-use mailing list<br>
              <a moz-do-not-send="true"
                href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
              <a moz-do-not-send="true"
                href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
                target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
            </div>
          </div>
        </blockquote>
      </div>
      <br>
      <pre wrap="">
<fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
hotspot-gc-use mailing list
<a class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>
<a class="moz-txt-link-freetext" href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a>
</pre>
    </blockquote>
  </body>
</html>