RFR(S): 7190666: G1: assert(_unused == 0) failed: Inconsistency in PLAB stats
John Cuthbertson
john.cuthbertson at oracle.com
Tue Aug 28 20:20:09 UTC 2012
Hi Jon,
Thanks for the review.
JohnC
On 08/28/12 10:00, Jon Masamitsu wrote:
> Looks good.
>
> On 8/27/2012 5:36 PM, John Cuthbertson wrote:
>> Hi Everyone,
>>
>> Can I have a couple of volunteers review the changes for this CR? The
>> webrev can be found at:
>> http://cr.openjdk.java.net/~johnc/7190666/webrev.0/
>>
>> Summary:
>> The value of PLABStats::_allocated was overflowing, and the assertion
>> failed when the value wrapped around to 0. When we retired an
>> individual allocation buffer, we were flushing some accumulated
>> values to the associated PLABStats instance. This was artificially
>> inflating the values in the PLABStats instance since we were not
>> resetting the accumulated values in the ParGCAllocBuffer after we
>> flushed. Ordinarily this would not cause an issue (other than the
>> values being too large) but with this particular test case we
>> obtained an evacuation failure. As a result we were executing the GC
>> allocation slow-path, and flushing the accumulated values, for every
>> failed object allocation attempt (even though we were unable to
>> allocate a new buffer), and we overflowed. Resetting the sensor values
>> in the ParGCAllocBuffer instance after flushing prevents the
>> artificial inflation and overflow.
>>
>> Additionally we should not be flushing the values to the PLABStats
>> instance on every buffer retirement (though it is not stated in the
>> code). Flushing the stats values on every retirement is unnecessary
>> and, in the case of an evacuation failure, adds a fair amount of
>> additional work for each failed object copy. Instead we should only
>> be flushing the accumulated sensor values when we retire the final
>> buffers prior to disposing of the G1ParScanThreadState object.
>>
>> Testing:
>> The failing test case, the GC test suite with +PrintPLAB, and jprt.
>>
>> Note: while testing this I ran into some assertion and guarantee
>> failures from G1's block offset table. I've only seen and been able
>> (so far) to reproduce these failures on a single machine in the jprt
>> pool. I will be submitting a new CR for these failures. I do not
>> believe that the failures are related to this fix (or the change that
>> enabled resizable PLABs) as I've been able to reproduce the
>> failures with ResizePLAB disabled and OldPLABSize set to 8k, 16k,
>> and 32k.
>>
>> Thanks,
>>
>> JohnC
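
For readers following the summary above, the inflation it describes can be sketched in a few lines of C++. This is only an illustration, not the HotSpot code or the patch in the webrev: StatsSketch, BufferSketch, Mode, and simulate are hypothetical names standing in for PLABStats, ParGCAllocBuffer, and the retirement path, and the counters are plain size_t values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for PLABStats: totals shared across buffers.
struct StatsSketch {
  std::size_t allocated = 0;
  std::size_t wasted = 0;
};

// Hypothetical stand-in for ParGCAllocBuffer: accumulates values locally
// and flushes them to the shared stats.
class BufferSketch {
  std::size_t _allocated = 0;
  std::size_t _wasted = 0;

public:
  void note_allocated(std::size_t words) { _allocated += words; }
  void note_wasted(std::size_t words) { _wasted += words; }

  // Behaviour described as the bug: the running totals are added to the
  // stats but kept, so the next flush adds the same words again.
  void flush_without_reset(StatsSketch& s) const {
    s.allocated += _allocated;
    s.wasted += _wasted;
  }

  // Behaviour described as the fix: flush, then zero the local totals.
  void flush_and_reset(StatsSketch& s) {
    s.allocated += _allocated;
    s.wasted += _wasted;
    _allocated = 0;
    _wasted = 0;
  }
};

enum class Mode { FlushNoReset, FlushAndReset, FlushAtEndOnly };

// Allocate one buffer, then retire it `retirements` times, as happens when
// every failed allocation attempt goes through the slow path without
// obtaining a new buffer. Returns the total recorded in the shared stats.
std::size_t simulate(Mode mode, int retirements, std::size_t buffer_words) {
  assert(retirements >= 1);
  StatsSketch stats;
  BufferSketch buf;
  buf.note_allocated(buffer_words);
  for (int i = 0; i < retirements; ++i) {
    const bool last = (i == retirements - 1);
    switch (mode) {
      case Mode::FlushNoReset:
        buf.flush_without_reset(stats);
        break;
      case Mode::FlushAndReset:
        buf.flush_and_reset(stats);
        break;
      case Mode::FlushAtEndOnly:
        // Second part of the proposal: touch the stats only once, at the
        // final retirement.
        if (last) buf.flush_and_reset(stats);
        break;
    }
  }
  return stats.allocated;
}
```

With a single 1024-word buffer retired 1000 times, simulate(Mode::FlushNoReset, 1000, 1024) records 1024000 words, while both Mode::FlushAndReset and Mode::FlushAtEndOnly record 1024. Given enough repeated retirements, the first variant's total keeps growing until the unsigned counter wraps, which matches the overflow described in the summary.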