<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Thanks, Michal for providing us with the logs. John (Cuthbertson)

    and I had a great discussion yesterday pertaining to your G1 log and

    we think we understand what's going on with your workload and have a

    experiment(s) plan for you. John is going to send in more details

    (with explanation). Your workload's pattern is very interesting to

    me/us and I have started a "work folder" for it, so that I can

    document our experiment plan and the results and keep track of the

    progress.<br>

    <br>

    Thanks for sharing your logs!<br>

    <br>

    -Monica<br>

    <br>

    On 2/1/2013 2:12 AM, Michal Frajt wrote:

    <blockquote

      cite="mid:MHJ85D$88D20B287E57CB3DA1590E5ACC6F1D19@frajt.eu"

      type="cite">

      <div class="xam_msg_class">

        <div>

          <div>Sorry, forgot to attach the archive in the last message. </div>

          <div> </div>

          <div><span>Od</span><span>:

              <a class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-dev-bounces@openjdk.java.net">hotspot-gc-dev-bounces@openjdk.java.net</a></span></div>

          <div><span>Komu</span><span>: <a class="moz-txt-link-abbreviated" href="mailto:monica.beckwith@oracle.com">monica.beckwith@oracle.com</a></span></div>

          <div><span>Kopie</span><span>: <a class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-dev@openjdk.java.net">hotspot-gc-dev@openjdk.java.net</a></span></div>

          <div><span>Datum</span><span>: Fri, 1 Feb 2013 09:05:58 +0100</span></div>

          <div><span>Předmet</span><span>: Re: CMS vs G1 - Scan RS very

              long</span></div>

          <br>

          <div class="xam_msg_class">

            <div class="xam_msg_class">

              <div>

                <div>> Hi Monica / John,</div>

                <div>>  </div>

                <div>> Find the new archive which correctly contains

                  both the CMS and G1 log files. Yesterday I

                  somehow managed to overwrite the CMS log file when

                  creating the tar archive. </div>

                <div>>   </div>

                <div>> Node2 - running CMS</div>

                <div>> mmreutersservernode2_20130131105521_gc.log<br>

                  > mmreutersservernode2_jinfo.txt</div>

                <div>>  </div>

                <div>> Node3 - running G1<br>

                  > mmreutersservernode3_20130131105525_gc.log<br>

                  > mmreutersservernode3_jinfo.txt</div>

                <div>>  </div>

                <div>> Regards,</div>

                <div>> Michal </div>

                <div>>  </div>

                <div><span>> Od</span><span>: "Monica Beckwith"

                    <a class="moz-txt-link-abbreviated" href="mailto:monica.beckwith@oracle.com">monica.beckwith@oracle.com</a></span></div>

                <div><span>> Komu</span><span>: "John Cuthbertson"

                    <a class="moz-txt-link-abbreviated" href="mailto:john.cuthbertson@oracle.com">john.cuthbertson@oracle.com</a></span></div>

                <div><span>> Kopie</span><span>:

                    <a class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-dev@openjdk.java.net">hotspot-gc-dev@openjdk.java.net</a>,"Michal Frajt"

                    <a class="moz-txt-link-abbreviated" href="mailto:michal@frajt.eu">michal@frajt.eu</a></span></div>

                <div><span>> Datum</span><span>: Thu, 31 Jan 2013

                    12:44:46 -0600</span></div>

                <div><span>> Předmet</span><span>: Re: CMS vs G1 -

                    Scan RS very long</span></div>

                <br>

                <div class="xam_msg_class">> > Hi Michal (and

                  John),<br>

                  > > <br>

                  > > I did look at Michal's log files (BTW, both

                  are the same G1 logs). And I can confirm that scan RS

                  is the issue... here's the plot:<br>

                  > > <img alt=""

                    src="cid:part1.05060004.07020808@oracle.com"

                    _djrealurl="" height="360" width="914"><br>

                  > > <br>

                  > > The plot above only shows the max obj copy

                  times (for all pauses), max RS Scan times and Parallel

                  Time. So yes, scan RS is the culprit. <br>

                  > > Also, looking at your logs, it seems that

                  for mixedGCs the reclaimable bytes don't cross 6%. So

                  can you please try increasing the G1HeapWastePercent

                  to 10? (the latest builds will have 10 as the default

                  value).<br>

                  > > <br>

                  > > Please let me know if that improves your RT.<br>

                  > > <br>

                  > > -Monica<br>

                  > > <br>

                  > > <br>

                  > > On 1/31/2013 11:56 AM, John Cuthbertson

                  wrote:

                  <blockquote cite="mid:510AB061.6010403@oracle.com"

                    type="cite">Hi Michal, <br>

                    > > <br>

                    > > I haven't looked at the logs yet but from

                    your description it sounds like a large number of

                    the RSets have been coarsened and/or the fine grain

                    tables have gotten very dense. <br>

                    > > <br>

                    > > The RSet for a heap region tracks incoming

                    references to objects in that region. There are

                    three levels on granularity: sparse, fine, and

                    coarse. The sparse and fine entries are encoded in

                    open hash tables based upon the region containing

                    the references. As the number of references from

                    region A that point into region B increase, the

                    number of cards in the hash table entry for region A

                    increase (it's actually a bitmap with one bit per

                    card) and as the number of regions that contain

                    references that point into B increase, the number of

                    region entries in fine grain table increase. Once we

                    run out of space in the fine grain table (i.e. we

                    can't add another bitmap for another region) we

                    evict one of the densest region bitmaps and say

                    "coarsen" that entry. When we have a coarsened entry

                    we have to search the entire referencing region

                    looking for incoming references compared to

                    searching specific cards with the sparse and fine

                    entries. <br>

                    > > <br>

                    > > I'll take a look at your logs to see if I

                    can confirm. <br>

                    > > <br>

                    > > JohnC <br>

                    > > <br>

                    > > <br>

                    > > On 1/31/2013 6:12 AM, Michal Frajt wrote:

                    <br>

                    > >

                    <blockquote type="cite">Hi all, <br>

                      > > <br>

                      > > After the iCMS got officially deprecated

                      we decided to compare the G1 collector with our

                      best tuned (i)CMS setup. Unfortunately we are not

                      able to make the G1 young collection running any

                      closer to the ParNew. Actually we wanted to

                      compare the G1 concurrent marking STW pauses with

                      the CMS initial-mark and remark STW pauses but

                      already incredibly long running G1 young

                      collections are unacceptable for us. <br>

                      > > <br>

                      > > We were able to recognize that the very

                      long G1 young collections are caused by the

                      scanning remembered sets. There is not much

                      documentation about G1 internals but we were able

                      to understand that the size of the remembered sets

                      is related to the amount of mutating references

                      from old regions (cards) to young regions.

                      Unfortunately all our applications mutate

                      permanently thousands references from old objects

                      to young objects. <br>

                      > > <br>

                      > > We are testing with the latest OpenJDK7u

                      extended by the 7189971 patch and

                      CMSTriggerInterval implementation. The attached GC

                      log files represent two very equal applications

                      processing very similar data sets, one running the

                      G1, second running the CMS collector. The

                      OpenJDK7u has an extra output of _pending_cards

                      (when G1TraceConcRefinement activated) which

                      somehow relates to the remembered sets size. <br>

                      > > <br>

                      > > Young Comparison (both 128m, survivor

                      ratio 5, max tenuring 15) <br>

                      > > CMS - invoked every ~20 sec, avg. stop

                      60ms <br>

                      > > G1 - invoked every ~16 sec, avg. stop

                      410ms !!! <br>

                      > > <br>

                      > > It there anything what could help us to

                      reduce the Scan RS time or the G1 is simply not

                      targeted for applications mutating heavily old

                      region objects? <br>

                      > > <br>

                      > > CMS parameters <br>

                      > > -Xmx8884m -Xms2048m -XX:NewSize=128m

                      -XX:MaxNewSize=128m -XX:PermSize=128m

                      -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15

                      -XX:CMSMarkStackSize=8M

                      -XX:CMSMarkStackSizeMax=32M

                      -XX:+UseConcMarkSweepGC

                      -XX:+CMSClassUnloadingEnabled

                      -XX:CMSWaitDuration=60000

                      -XX:+CMSScavengeBeforeRemark

                      -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC

                      -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2 <br>

                      > > <br>

                      > > G1 parameters (mind MaxNewSize not

                      specified) <br>

                      > > -Xmx8884m -Xms2048m -XX:NewSize=128m

                      -XX:PermSize=128m -XX:SurvivorRatio=5

                      -XX:MaxTenuringThreshold=15 -XX:+UseG1GC

                      -XX:MaxGCPauseMillis=200

                      -XX:G1MixedGCCountTarget=16

                      -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2 <br>

                      > > <br>

                      > > G1 log file GC young pause <br>

                      > > [GC pause (young) [G1Ergonomics (CSet

                      Construction) start choosing CSet, _pending_cards:

                      23697, predicted base time: 32.88 ms, remaining <br>

                      > > time: 167.12 ms, target pause time:

                      200.00 ms] <br>

                      > > [Parallel Time: 389.8 ms, GC Workers: 8]

                      <br>

                      > >      >>>> <br>

                      > >      [Scan RS (ms): Min: 328.8, Avg:

                      330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9] <br>

                      > >      <<<< <br>

                      > > [Eden: 119.0M(119.0M)->0.0B(118.0M)

                      Survivors: 9216.0K->10.0M Heap:

                      1801.6M(2048.0M)->1685.7M(2048.0M)] <br>

                      > > <br>

                      > > Regards, <br>

                      > > Michal <br>

                      > > <br>

                      > > <br>

                      > > </blockquote>

                    <br>

                    > > </blockquote>

                  <br>

                  > >

                  <div class="moz-signature">> > -- <br>

                    > > <a moz-do-not-send="true"

                      href="http://www.oracle.com/" target="_blank"

                      _djrealurl="http://www.oracle.com/"><img

                        alt="Oracle"

                        src="cid:part2.07060601.09070600@oracle.com"

                        _djrealurl="" border="0" height="26" width="114"></a><br>

                    > > <font color="#666666" face="Verdana,

                      Arial, Helvetica, sans-serif" size="2">Monica

                      Beckwith | Java Performance Engineer<br>

                      > > VOIP: <a

                        href="tel:+1%20512%20401%201274"

                        _djrealurl="tel:+1%20512%20401%201274"

                        moz-do-not-send="true">+1 512 401 1274</a> <br>

                      > > Texas </font><br>

                    > > <a moz-do-not-send="true"

                      href="http://www.oracle.com/commitment"

                      target="_blank"

                      _djrealurl="http://www.oracle.com/commitment"><img

                        alt="Green Oracle"

                        src="cid:part5.09000504.01050906@oracle.com"

                        _djrealurl="" align="absMiddle" border="0"

                        height="28" width="44"></a> <font

                      color="#4b7d42" face="Verdana, Arial, Helvetica,

                      sans-serif" size="1">Oracle is committed to

                      developing practices and products that help

                      protect the environment</font> </div>

                </div>

              </div>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br>

    <div class="moz-signature">-- <br>

      <a href="http://www.oracle.com" target="_blank"><img

          src="cid:part7.07060205.07000209@oracle.com" alt="Oracle"

          border="0" height="26" width="114"></a><br>

      <font color="#666666" face="Verdana, Arial, Helvetica, sans-serif"

        size="2">Monica Beckwith | Java Performance Engineer<br>

        VOIP: <a href="tel:+1%20512%20401%201274"

          moz-do-not-send="true">+1 512 401 1274</a>

        <br>

        Texas </font>

      <br>

      <a href="http://www.oracle.com/commitment" target="_blank"><img

          src="cid:part10.01080905.02000304@oracle.com" alt="Green

          Oracle" align="absmiddle" border="0" height="28" width="44"></a>

      <font color="#4B7D42" face="Verdana, Arial, Helvetica, sans-serif"

        size="1">Oracle is committed to developing practices and

        products that help protect the environment</font>

      <!-- This signature was generated by the MyDesktop Oracle Business Signature utility version 4.0 -->

    </div>

  </body>

</html>