<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix"><br>

      Hi Ramki,<br>

      <br>

      On 1/11/13 6:50 PM, Srinivas Ramakrishna wrote:<br>

    </div>

    <blockquote

cite="mid:CABzyjynNXLidiWVRb_030+Wq=LUdnnb3qGvxM5LZL=B6rD=oUA@mail.gmail.com"

      type="cite">Hi Bengt --<br>

      <br>

      Try computing the GC overhead by normalizing wrt the work done

      (for which the net allocation volume might be a good proxy). As

      you state, the performance numbers will then likely make sense. Of

      course, they still won't explain why ParNew does better.  As

      Vitaly conjectures, the difference is likely in better object

      co-location with ParNew's slightly more DFS-like evacuation

      compared with DefNew's considerably more BFS-like evacuation

      because of the latter's use of a pure Cheney scan compared with

      the use of (a) marking stack(s) in the former, as far as i can

      remember the code. One way to tell if that accounts for the

      difference is to measure the cache-miss rates in the two cases

      (and may be use a good tool like Solaris perf analyzer to show you

      where the misses are coming from as well).<br>

    </blockquote>

    <br>

    Thanks for bringing the DFS/BFS difference up. This is exactly the

    kind of difference I was looking for. My guess is that this is what

    causes the difference in JBB score. I'll see if I can investigate

    this further.<br>

    <blockquote

cite="mid:CABzyjynNXLidiWVRb_030+Wq=LUdnnb3qGvxM5LZL=B6rD=oUA@mail.gmail.com"

      type="cite">Also curious if you can share the two sets of GC logs,

      by chance? (specJBB is a for-fee benchmark so is not freely

      available to the individual developer.)<br>

    </blockquote>

    <br>

    I have a fairly large set of logs, but the runs are very stable so

    I'm just attaching logs for one run for each collector. For

    comparison I have also been running ParallelScavenge with one

    thread. I'm using separate gc logs and jbb logs. The log files

    called ".result" are the jbb output. The other logs are the gc logs.<br>

    <br>

    I'm running with a heap size of 1GB to avoid full GCs. All runs have

    the two System.gc() induced full GCs but no other. ParallelScavenge

    is performing even better than ParNew, but I am mostly interested in

    the difference between ParNew and DefNew.<br>

    <br>

    A quick summary of the data in the logs:<br>

    <tt><br>

    </tt><tt>        Score  #GCs  AverageGCTime</tt><tt><br>

    </tt><tt>DefNew: 57903  2083</tt><tt>  0.044053195391262644</tt><tt><br>

    </tt><tt>ParNew: 61363  2213  0.05931835969272489</tt><tt><br>

    </tt><tt>PS:     69697  2213  0.06117092860370538</tt><br>

    <br>

    ParNew has a better score even though it does more GCs and they take

    longer.<br>

    <br>

    If you have any insights from looking at the logs I would be very

    happy to hear about it.<br>

    <br>

    Thanks,<br>

    Bengt<br>

    <blockquote

cite="mid:CABzyjynNXLidiWVRb_030+Wq=LUdnnb3qGvxM5LZL=B6rD=oUA@mail.gmail.com"

      type="cite"><br>

      thanks.<br>

      -- ramki<br>

      <br>

      <div class="gmail_quote">On Fri, Jan 11, 2013 at 4:57 AM, Bengt

        Rutisson <span dir="ltr"><<a moz-do-not-send="true"

            href="mailto:bengt.rutisson@oracle.com" target="_blank">bengt.rutisson@oracle.com</a>></span>

        wrote:<br>

        <blockquote class="gmail_quote" style="margin:0 0 0

          .8ex;border-left:1px #ccc solid;padding-left:1ex">

          <div bgcolor="#FFFFFF" text="#000000">

            <div><br>

              Hi Vitaly,

              <div class="im"><br>

                <br>

                On 1/11/13 1:45 PM, Vitaly Davidovich wrote:<br>

              </div>

            </div>

            <div class="im">

              <blockquote type="cite">

                <p dir="ltr">Hi Bengt,</p>

                <p dir="ltr">Regarding the benchmark score, are you

                  saying ParNew has longer cumulative GC time or just

                  the average is higher? If it's just average, maybe the

                  total # of them (and cumulative time) is less.  I

                  don't know the characteristics of this particular

                  specjbb benchmark, but perhaps having fewer total GCs

                  is better because of the overhead of getting all

                  threads to a safe point, going go the OS to suspend

                  them, and then restarting them.  After they're

                  restarted, the CPU cache may be cold for it because

                  the GC thread polluted it.  Or I'm entirely wrong in

                  my speculation ... :).</p>

              </blockquote>

              <br>

            </div>

            You have a good point about the number of GCs. The problem

            in my runs is that ParNew does more GCs than DefNew. So

            there are both more of them and their average time is

            higher, but the score is still better. That ParNew does more

            GCs is not that strange. It has a higher score, which means

            that it had higher throughput and had time to create more

            objects. So, that is kind of expected. But I don't

            understand how it can have higher throughput when the GCs

            take longer. My current guess is that it does something

            differently with how objects are copied in a way that is

            beneficial for the execution time between GCs.<br>

            <br>

            It also seems like ParNew keeps more objects alive for each

            GC. That is either the reason why it does more and more

            frequent GCs than DefNew, or it is an effect of the fact

            that more objects are created due to the higher throughput.

            This is the reason I started looking at the tenuring

            threshold.<span class="HOEnZb"><font color="#888888"><br>

                <br>

                Bengt</font></span>

            <div>

              <div class="h5"><br>

                <br>

                <blockquote type="cite">

                  <p dir="ltr">Thanks</p>

                  <p dir="ltr">Sent from my phone</p>

                  <div class="gmail_quote">On Jan 11, 2013 6:02 AM,

                    "Bengt Rutisson" <<a moz-do-not-send="true"

                      href="mailto:bengt.rutisson@oracle.com"

                      target="_blank">bengt.rutisson@oracle.com</a>>

                    wrote:<br type="attribution">

                    <blockquote class="gmail_quote" style="margin:0 0 0

                      .8ex;border-left:1px #ccc solid;padding-left:1ex">

                      <div bgcolor="#FFFFFF" text="#000000">

                        <div><br>

                          Hi Ramki,<br>

                          <br>

                          Thanks for looking at this!<br>

                          <br>

                          On 1/10/13 9:28 PM, Srinivas Ramakrishna

                          wrote:<br>

                        </div>

                        <blockquote type="cite">Hi Bengt --<br>

                          <br>

                          The change looks reasonable, but I have a

                          comment and a follow-up question.<br>

                          <br>

                          Not your change, but I'd elide the "half the

                          real survivor size" since it's really a

                          configurable parameter based on

                          TargetSurvivorRatio with default half.<br>

                          I'd leave the comment as "set the new tenuring

                          threshold and desired survivor size".<br>

                        </blockquote>

                        <br>

                        I'm fine with removing this from the comment,

                        but I thought the "half the real survivor size"

                        aimed at the fact that we pass only the "to"

                        capacity and not the "from" capacity in to

                        compute_tenuring_threshold(). With that

                        interpretation I think the comment is correct.<br>

                        <br>

                        Would you like me to remove it anyway? Either

                        way is fine with me.<br>

                        <br>

                        <blockquote type="cite">I'm curious though, as

                          to what performance data prompted this change,</blockquote>

                        Good point. This change was preceded by an

                        internal discussion in the GC team, so I should

                        probably have explained the background more in

                        my review request to the open.<br>

                        <br>

                        I was comparing the ParNew and DefNew

                        implementation since I am seeing some strange

                        differences in some SPECjbb2005 results. I am

                        running ParNew with a single thread and get much

                        better score than with DefNew. But I also get

                        higher average GC times. So, I was trying to

                        figure out what DefNew and ParNew does

                        differently.<br>

                        <br>

                        When I was looking at

                        DefNewGeneration::collect() and

                        ParNewGeneration::collect() I saw that they

                        contain a whole lot of code duplication. It

                        would be tempting to try to extract the common

                        code out into DefNewGeneration since it is the

                        super class. But there are some minor

                        differences. One of them was this issue with how

                        they handle the tenuring threshold.<br>

                        <br>

                        We tried to figure out if there is a reason for

                        ParNew and DefNew to behave different in this

                        regard. We could not come up with any good

                        reason for that. So, we needed to figure out if

                        we should change ParNew or DefNew to make them

                        consistent. The decision to change ParNew was

                        based on two things. First, it seems wrong to

                        use the data from a collection that got

                        promotion failure. This collection will not have

                        allowed the tenuring threshold to fulfill its

                        purpose. Second, ParallelScavenge works the same

                        way as DefNew.<br>

                        <br>

                        BTW, the difference between DefNew and ParNew

                        seems to have been there from the start. So,

                        there is no bug or changeset in mercurial or

                        TeamWare to explain why the difference was

                        introduced. <br>

                        <br>

                        (Just to be clear, this difference was not the

                        cause of my performance issue. I still don't

                        have a good explanation for how ParNew can have

                        longer GC times but better SPECjbb score.)<br>

                        <br>

                        <blockquote type="cite">and whether it might

                          make sense, upon a promotion failure to do

                          something about the tenuring threshold for the

                          next scavenge (i.e. for example make the

                          tenuring threshold half of its current value

                          as a reaction to the fact that promotion

                          failed). Is it currently left at its previous

                          value or is it asjusted back to the default

                          max value (which latter may be the wrong thing

                          to do) or something else?<br>

                        </blockquote>

                        <br>

                        As far as I can tell the tenuring threshold is

                        left untouched if we get a promotion failure. It

                        is probably a good idea to update it in some

                        way. But I would prefer to handle that as a

                        separate bug fix.<br>

                        <br>

                        This change is mostly a small cleanup to make

                        DefNewGeneration::collect() and

                        ParNewGeneration::collect() be more consistent.

                        We've done the thinking so, it's good to make

                        the change in preparation for the next person

                        that comes a long and has a few cycles over and

                        would like to merge the two collect() methods in

                        some way.<br>

                        <br>

                        Thanks again for looking at this!<br>

                        Bengt<br>

                        <br>

                        <blockquote type="cite"> <br>

                          -- ramki<br>

                          <br>

                          <div class="gmail_quote">On Thu, Jan 10, 2013

                            at 1:30 AM, Bengt Rutisson <span dir="ltr"><<a

                                moz-do-not-send="true"

                                href="mailto:bengt.rutisson@oracle.com"

                                target="_blank">bengt.rutisson@oracle.com</a>></span>

                            wrote:<br>

                            <blockquote class="gmail_quote"

                              style="margin:0 0 0 .8ex;border-left:1px

                              #ccc solid;padding-left:1ex"><br>

                              Hi everyone,<br>

                              <br>

                              Could I have a couple of reviews for this

                              small change to make DefNew and ParNew be

                              more consistent in the way they treat the

                              tenuring threshold:<br>

                              <br>

                              <a moz-do-not-send="true"

                                href="http://cr.openjdk.java.net/%7Ebrutisso/8005972/webrev.00/"

                                target="_blank">http://cr.openjdk.java.net/~brutisso/8005972/webrev.00/</a><br>

                              <br>

                              Thanks,<br>

                              Bengt<br>

                            </blockquote>

                          </div>

                          <br>

                        </blockquote>

                        <br>

                      </div>

                    </blockquote>

                  </div>

                </blockquote>

                <br>

              </div>

            </div>

          </div>

        </blockquote>

      </div>

      <br>

    </blockquote>

    <br>

  </body>

</html>