RFR: 8034852: Shrinking of Metaspace high-water-mark causes incorrect OutOfMemoryErrors or back-to-back GCs

Mon May 5 14:06:52 UTC 2014

Hi Jon,

thanks for having a look at the patch!

On 2014-05-01 19:34, Jon Masamitsu wrote:
> Erik,
>
> Change looks good.

Thanks!

On 2014-05-01 19:34, Jon Masamitsu wrote:
> Please add a comment explaining why we're using committed now and why
> capacity did not work.  Basically an abbreviated version of your
> explanation
> below.

The old code in compute_new_size was the last code in metaspace making 
use of "capacity" as a concept. My idea was to remove "capacity" 
completely from the metaspace code after this change.

Do you think that we still should add a comment discussing a concept
(capacity) that the code no longer will use? Is there a possibility
that people that are new to the code will become confused?

Thanks,
Erik

On 2014-05-01 19:34, Jon Masamitsu wrote:
> Thanks.
>
> Jon
>
> On 4/30/2014 4:18 AM, Erik Helin wrote:
>> Hi all,
>>
>> this patch solves a rather tricky problem with the sizing of Metaspace.
>>
>> The issue happens when the GC threshold for Metaspace (called
>> "capacity_until_GC" in the code) becomes less than the committed
>> memory for Metaspace. Any calls to Metaspace::allocate that requires
>> committing more memory will then fail in
>> MetaspaceGC::allowed_expansion, because capacity_until_GC() <
>> MetaspaceAux::committed_memory(). The effect will be a
>> full GC and after the GC we try to expand and allocate. After the
>> expansion and before the allocation, one of two things can happen:
>>  1. capacity_until_GC is larger than the committed memory after the
>>     expansion. The allocation will now succeed, but the next allocation
>>     requiring a new chunk will *again* trigger a full GC. This pattern
>>     will repeat itself for each new allocation request requiring a new
>>     chunk.
>>  2. capacity_until_GC is still less than the committed memory even
>>     after the expansion. We throw a Java OOME (incorrectly).
>>
>> How can the GC threshold for Metaspace be less than the committed
>> memory? The problem is that MetaspaceGC::compute_new_size uses the field
>> _allocated_capacity for describing the amount of memory in Metaspace
>> that is "in use". _allocated_capacity does not consider the memory in
>> the chunk free lists to be "in use", since memory in the chunk free
>> lists are supposed to be available for new allocations. The problem is
>> that the chunk free lists can become fragmented, and then the memory
>> is not available for all kinds of allocations.
>>
>> This patch change MetaspaceGC::compute_new_size to use
>> MetaspaceAux::committed_memory for describing how much memory that is
>> "in use". The effect will be that memory in the chunk free lists will no
>> longer be considered "in use" (but will of course be used for future
>> allocations where possible). This will prevent capacity_until_GC from
>> shrinking below the committed memory "by definiton", since
>> capacity_until_GC can't be lower than the memory that is "in use".
>>
>> Based on the results from the perf testing (see below), this change
>> has no performance impact.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~ehelin/8034852/webrev.00/
>>
>> Testing:
>> - JPRT
>> - Ad-hoc testing:
>>   - Kitchensink
>>   - Dacapo
>>   - Medrec
>>   - runThese
>>   - Parallel Class Loading testlist
>>   - Metaspace testlist
>>   - GC nightly testlist
>> - Perf testing:
>>   - SPECjbb2005
>>   - SPECjbb2013
>>   - Derby
>> - Derby regression tests
>>
>> Thanks,
>> Erik
>