RFR(S): 7190666: G1: assert(_unused == 0) failed: Inconsistency in PLAB stats

Tue Aug 28 17:00:21 UTC 2012

Looks good.

On 8/27/2012 5:36 PM, John Cuthbertson wrote:
> Hi Everyone,
>
> Can I have a couple of volunteers review the changes for this CR? The 
> webrev can be found at: 
> http://cr.openjdk.java.net/~johnc/7190666/webrev.0/
>
> Summary:
> The value of PLABStats::_allocated was overflowing, and the failing 
> assertion fired when it wrapped around to 0. When we retired an 
> individual allocation buffer, we were flushing some accumulated values 
> to the associated PLABStats instance. This was artificially inflating 
> the values in the PLABStats instance since we were not resetting the 
> accumulated values in the ParGCAllocBuffer after we flushed. 
> Ordinarily this would not cause an issue (other than the values being 
> too large) but with this particular test case we obtained an 
> evacuation failure. As a result we were executing the GC allocation 
> slow-path, and flushing the accumulated values, for every failed 
> attempted object allocation (even though we were unable to allocate a 
> new buffer), and we overflowed. Resetting the sensor values in the 
> ParGCAllocBuffer instance after flushing prevents the artificial 
> inflation and overflow.
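
To illustrate the inflation described above, here is a minimal sketch in C++. It is not the HotSpot code: ToyPlabStats, ToyAllocBuffer, flush_without_reset, flush_and_reset, and the 1024-word figure are all hypothetical stand-ins chosen for the example. It only shows why flushing repeatedly without clearing the buffer's accumulated values counts the same words again on every flush.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for PLABStats.
struct ToyPlabStats {
  size_t allocated = 0;
  size_t wasted = 0;
};

// Hypothetical, simplified stand-in for ParGCAllocBuffer.
struct ToyAllocBuffer {
  size_t allocated = 0;  // words accumulated by this buffer so far
  size_t wasted = 0;     // words lost when the buffer was retired

  // Behaviour described as the bug: add to the stats, keep the local totals.
  void flush_without_reset(ToyPlabStats& stats) const {
    stats.allocated += allocated;
    stats.wasted += wasted;
  }

  // Behaviour described as the fix: add to the stats, then zero the totals.
  void flush_and_reset(ToyPlabStats& stats) {
    stats.allocated += allocated;
    stats.wasted += wasted;
    allocated = 0;
    wasted = 0;
  }
};

// Flush one buffer `retirements` times and return the stats total.
inline size_t allocated_after(size_t retirements, bool reset) {
  ToyPlabStats stats;
  ToyAllocBuffer buf;
  buf.allocated = 1024;  // assumed value, for illustration only
  for (size_t i = 0; i < retirements; ++i) {
    if (reset) {
      buf.flush_and_reset(stats);
    } else {
      buf.flush_without_reset(stats);
    }
  }
  return stats.allocated;
}

// Usage: three flushes without a reset count the same words three times.
inline void demo_checks() {
  assert(allocated_after(3, false) == 3 * 1024);
  assert(allocated_after(3, true) == 1024);
}
```

With one flush per failed allocation attempt, the unreset variant grows linearly with the number of failures, so an unsigned counter can eventually wrap.
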
>
> Additionally we should not be flushing the values to the PLABStats 
> instance on every buffer retirement (though it is not stated in the 
> code). Flushing the stats values on every retirement is unnecessary 
> and, in the case of an evacuation failure, adds a fair amount of additional 
> work for each failed object copy. Instead we should only be flushing 
> the accumulated sensor values when we retire the final buffers prior 
> to disposing of the G1ParScanThreadState object.
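
A rough sketch of the "flush only at final retirement" idea, again in C++ and again not the actual change in the webrev: SketchStats, SketchThreadState, retire_buffer, and retire_final_buffers are hypothetical names, and the 256-word buffer size is an assumption. Intermediate retirements only accumulate locally, and the shared stats are touched once.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical shared statistics object.
struct SketchStats {
  size_t allocated = 0;
  size_t flushes = 0;  // how many times the shared stats were updated
};

// Hypothetical per-thread state, loosely modelled on the role that
// G1ParScanThreadState plays in the description above.
class SketchThreadState {
 public:
  explicit SketchThreadState(SketchStats& stats) : _stats(stats) {}

  // Intermediate retirement: accumulate locally, do not touch shared stats.
  void retire_buffer(size_t words_allocated) {
    _pending_allocated += words_allocated;
  }

  // Final retirement before disposal: flush once, then reset.
  void retire_final_buffers(size_t words_allocated) {
    assert(!_flushed && "final retirement is expected to happen once");
    retire_buffer(words_allocated);
    _stats.allocated += _pending_allocated;
    _stats.flushes += 1;
    _pending_allocated = 0;
    _flushed = true;
  }

 private:
  SketchStats& _stats;
  size_t _pending_allocated = 0;
  bool _flushed = false;
};

// Retire `intermediate` buffers of 256 words, then one final buffer.
inline SketchStats run_sketch(size_t intermediate) {
  SketchStats stats;
  SketchThreadState state(stats);
  for (size_t i = 0; i < intermediate; ++i) {
    state.retire_buffer(256);
  }
  state.retire_final_buffers(256);
  return stats;
}
```

In this model the slow path taken on each failed object copy does no work on the shared stats at all, which is the saving the paragraph above describes.
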
>
> Testing:
> The failing test case, the GC test suite with +PrintPLAB, and jprt.
>
> Note while testing this I ran into some assertion and guarantee 
> failures from G1's block offset table. I've only seen and been able 
> (so far) to reproduce these failures on a single machine in the jprt 
> pool. I will be submitting a new CR for these failures. I do not 
> believe that the failures are related to this fix (or the change that 
> enabled resizable PLABs) as I've been able to reproduce the failures 
> with ResizePLAB disabled and OldPLABSize set to 8k, 16k, and 32k.
>
> Thanks,
>
> JohnC