Hi John --

Good catch... I suppose we have been getting overly large PLABs for years now, and perhaps wasting space to fragmentation as a result (hmm... although the waste calculation in the next cycle should let us adjust back down, I'd guess we are prone to oscillate because of the sensor value corruption...). I can imagine that in the case of ParNew+CMS this has been wasting space in the survivor spaces because of those oscillations.
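
(To make the oscillation concern concrete, here's a toy model of the kind of size feedback I have in mind -- hypothetical names and constants, not the actual HotSpot resizing code:)

    // Toy model (hypothetical names/constants, not the HotSpot sources):
    // the next PLAB size is steered by the fraction of allocated words
    // that were wasted. If the sensor values feeding 'allocated' and
    // 'wasted' are inflated, the computed fraction is wrong, and the size
    // can swing between too large and too small on successive GC cycles.
    #include <cstddef>

    size_t next_plab_size(size_t cur_words, size_t allocated, size_t wasted) {
      if (allocated == 0) return cur_words;          // no samples: keep size
      double wasted_frac = (double)wasted / (double)allocated;
      const double target = 0.10;                    // assumed waste target
      if (wasted_frac > target)     return cur_words / 2;  // shrink
      if (wasted_frac < target / 2) return cur_words * 2;  // grow
      return cur_words;                              // within tolerance
    }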

Thanks for the fix; it looks good to me too. Any chance that at least the PLAB stats portion of the fix might also be backported to JDK 7, so the performance benefit of a more correct calculation accrues there as well? Maybe under an MR? (PS: I am curious whether this showed any change in GC performance on any of the usual benchmarks...)

thanks!
-- ramki

On Tue, Aug 28, 2012 at 1:20 PM, John Cuthbertson <john.cuthbertson@oracle.com> wrote:
> Hi Jon,
>
> Thanks for the review.
>
> JohnC
>
> On 08/28/12 10:00, Jon Masamitsu wrote:
>> Looks good.
>>
>> On 8/27/2012 5:36 PM, John Cuthbertson wrote:
>>> Hi Everyone,
>>>
>>> Can I have a couple of volunteers review the changes for this CR? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7190666/webrev.0/
>>>
>>> Summary:
>>> The value of PLABStats::_allocated was overflowing, and the assertion failure was triggered when the value wrapped around to 0. When we retired an individual allocation buffer, we flushed some accumulated values to the associated PLABStats instance. This artificially inflated the values in the PLABStats instance, since we were not resetting the accumulated values in the ParGCAllocBuffer after the flush. Ordinarily this would not cause an issue (other than the values being too large), but with this particular test case we hit an evacuation failure. As a result we were executing the GC allocation slow path, and flushing the accumulated values, for every failed attempted object allocation (even though we were unable to allocate a new buffer), and the counter overflowed. Resetting the sensor values in the ParGCAllocBuffer instance after flushing prevents the artificial inflation and the overflow.
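
[ramki: paraphrasing the bug as I read the description above -- a minimal sketch with made-up names, not the actual webrev:]

    // Minimal sketch (hypothetical names, not the HotSpot sources):
    // flushing the per-buffer sensor values into the shared stats without
    // clearing them means the same words are added again on every later
    // flush, inflating the totals until the counter overflows.
    #include <cstddef>

    struct PLABStatsSketch {
      size_t _allocated = 0;   // cumulative words, reset once per GC cycle
      size_t _wasted    = 0;
    };

    struct ParGCAllocBufferSketch {
      size_t _allocated = 0;   // per-buffer running totals ("sensor values")
      size_t _wasted    = 0;

      // Buggy: accumulate into stats but keep the local totals.
      void flush_stats_buggy(PLABStatsSketch* stats) {
        stats->_allocated += _allocated;   // double-counts on the next flush
        stats->_wasted    += _wasted;
      }

      // Fixed: reset the sensors after flushing, so each word counts once.
      void flush_stats(PLABStatsSketch* stats) {
        stats->_allocated += _allocated;
        stats->_wasted    += _wasted;
        _allocated = 0;
        _wasted    = 0;
      }
    };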
>>>
>>> Additionally, we should not be flushing the values to the PLABStats instance on every buffer retirement (though the code does not make this explicit). Flushing the stats values on every retirement is unnecessary and, in the case of an evacuation failure, adds a fair amount of extra work for each failed object copy. Instead we should flush the accumulated sensor values only when we retire the final buffers, just before disposing of the G1ParScanThreadState object.
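
[ramki: and, continuing the sketch above, the flush discipline being described -- again hypothetical names, not the actual change:]

    // Builds on the sketch above: mid-GC retirement keeps the running
    // totals local; the single flush to the shared PLABStats happens when
    // the per-thread scan state is disposed at the end of the evacuation.
    struct ScanThreadStateSketch {
      ParGCAllocBufferSketch _buf;
      PLABStatsSketch*       _stats = nullptr;

      void retire_buffer_mid_gc() {
        // reuse or replace the buffer; deliberately no flush here
      }

      ~ScanThreadStateSketch() {
        if (_stats != nullptr) {
          _buf.flush_stats(_stats);   // final retirement: flush exactly once
        }
      }
    };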
>>>
>>> Testing:
>>> The failing test case; the GC test suite with +PrintPLAB; and jprt.
>>>
>>> Note: while testing this I ran into some assertion and guarantee failures from G1's block offset table. So far I have only seen, and been able to reproduce, these failures on a single machine in the jprt pool. I will be submitting a new CR for them. I do not believe the failures are related to this fix (or to the change that enabled resizable PLABs), as I can reproduce them with ResizePLAB disabled and OldPLABSize set to 8k, 16k, and 32k.
>>>
>>> Thanks,
>>>
>>> JohnC