<br><br><div class="gmail_quote">On Wed, Aug 29, 2012 at 10:23 AM, John Cuthbertson <span dir="ltr"><<a href="mailto:john.cuthbertson@oracle.com" target="_blank">john.cuthbertson@oracle.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<u></u>

  <div bgcolor="#ffffff" text="#000000">

    Hi Ramki,<br>

    <br>

    Thanks for the review.<br>

    <br>

    I don't think we need to back port the change in PLABStats.

    CMS/ParNew was only flushing the accumulated values once rather than

    at every allocation buffer retirement and wasn't affected by the

    "double" add that inflated the value. </div></blockquote><div><br>ah, you are right ...<br> <br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<div bgcolor="#ffffff" text="#000000">The real fix for G1 was to do

    the same. The only reason for the change in PLABStats was to make it

    a bit more robust so that we could flush the stats every n-th buffer

    retirement if we wanted.<br></div></blockquote><div><br>Makes sense, and sorry for my confusion.<br> <br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<div bgcolor="#ffffff" text="#000000">

    <br>

    I haven't run performance numbers yet - I've been distracted with

    the assertions and guarantee failures from G1's BOT that I've seen

    with a jprt-built binary. It is a good idea to run refworkload. My

    expectation is that ParNew/CMS will be a wash; with G1 I would

    expect the deviation to narrow.<br></div></blockquote><div><br>I'd agree wrt parnew/cms, now that i understand the code better in light of your correction of my confusion.<br><br>thanks!<br>-- ramki<br> <br></div>

<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div bgcolor="#ffffff" text="#000000">

    <br>

    Thanks,<br>

    <br>

    JohnC<div><div class="h5"><br>

    <br>

    On 8/29/2012 1:43 AM, Srinivas Ramakrishna wrote:

    <blockquote type="cite">Hi John --<br>

      <br>

      Good catch... I suppose we have been getting too large PLABs for

      so many years and perhaps wasting space to fragmentation as a

      result of this so far (hmm.. although perhaps the waste

      calculation in the next cycle allows us to adjust down, but i

      guess we are prone to oscillate because of the sensor value

      corruption...) I can imagine that in the case of ParNew+CMS this

      has been wasting space (because of said oscillations) in the

      survivor spaces.<br>

      <br>

      Thanks for the fix, and it looks good to me too. Any chance that

      the PLAB stats portion of the fix at least might also be

      backported to JDK 7, so the performance benefit of a more correct

      calculation accrues there as well? May be under an MR? (PS: I am

      curious to know if this showed any change in (gc) performance on

      any of the usual benchmarks...)<br>

      <br>

      thanks!<br>

      -- ramki<br>

      <br>

      <div class="gmail_quote">On Tue, Aug 28, 2012 at 1:20 PM, John

        Cuthbertson <span dir="ltr"><<a href="mailto:john.cuthbertson@oracle.com" target="_blank">john.cuthbertson@oracle.com</a>></span>

        wrote:<br>

        <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Jon,<br>

          <br>

          Thanks for the review.<br>

          <br>

          JohnC

          <div>

            <div><br>

              <br>

              On 08/28/12 10:00, Jon Masamitsu wrote:<br>

              <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                Looks good.<br>

                <br>

                On 8/27/2012 5:36 PM, John Cuthbertson wrote:<br>

                <blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                  Hi Everyone,<br>

                  <br>

                  Can I have a couple of volunteers review the changes

                  for this CR? The webrev can be found at: <a href="http://cr.openjdk.java.net/%7Ejohnc/7190666/webrev.0/" target="_blank">http://cr.openjdk.java.net/~johnc/7190666/webrev.0/</a><br>

                  <br>

                  Summary:<br>

                  The value of PLABStats::_allocated was overflowing and

                  the failed assertion detected when it overflowed to 0.

                  When we retired an individual allocation buffer, we

                  were flushing some accumulated values to the

                  associated PLABStats instance. This was artificially

                  inflating the values in the PLABStats instance since

                  we were not reseting the accumulated values in the

                  ParGCAllocBuffer after we flushed. Ordinarily this

                  would not cause an issue (other than the values being

                  too large) but with this particular test case we

                  obtained an evacuation failure. As a result we were

                  executing the GC allocation slow-path, and flushing

                  the accumulated values, for every failed attempted

                  object allocation (even though we were unable to

                  allocate a new buffer), and we overflowed. Reseting

                  the sensor values in the ParGCAllocBuffer instance

                  after flushing prevents the artificial inflation and

                  overflow.<br>

                  <br>

                  Additionally we should not be flushing the values to

                  the PLABStats instance on every buffer retirement

                  (though it is not stated in the code). Flushing the

                  stats values on every retirement is unnecessary and,

                  in the case of an evacuation, adds a fair amount of

                  additional work for each failed object copy. Instead

                  we should only be flushing the accumulated sensor

                  values when we retire the final buffers prior to

                  disposing the G1ParScanThreadState object.<br>

                  <br>

                  Testing:<br>

                  The failing test case; the GC test suite with

                  +PrintPLAB, and jprt.<br>

                  <br>

                  Note while testing this I ran into some assertion and

                  guarantee failures from G1's block offset table. I've

                  only seen and been able (so far) to reproduce these

                  failures on a single machine in the jprt pool. I will

                  be submitting a new CR for these failures. I do not

                  believe that the failures are related to this fix (or

                  the change that enabled resize-able PLABS) as I've

                  been able to reproduce the failures with disabling

                  ResizePLAB and setting OldPLABSize=8k, 16k, and 32k.<br>

                  <br>

                  Thanks,<br>

                  <br>

                  JohnC<br>

                </blockquote>

              </blockquote>

              <br>

            </div>

          </div>

        </blockquote>

      </div>

      <br>

    </blockquote>

  </div></div></div>

</blockquote></div><br>