RFR: 8034852: Shrinking of Metaspace high-water-mark causes incorrect OutOfMemoryErrors or back-to-back GCs

Erik Helin erik.helin at oracle.com
Thu May 8 08:09:58 UTC 2014


On Monday 05 May 2014 20:31:32 PM Jon Masamitsu wrote:
> On 5/5/2014 7:06 AM, Erik Helin wrote:
> > Hi Jon,
> > 
> > thanks for having a look at the patch!
> > 
> > On 2014-05-01 19:34, Jon Masamitsu wrote:
> >> Erik,
> >> 
> >> Change looks good.
> > 
> > Thanks!
> > 
> > On 2014-05-01 19:34, Jon Masamitsu wrote:
> >> Please add a comment explaining why we're using committed now and why
> >> capacity did not work.  Basically an abbreviated version of your
> >> explanation
> >> below.
> > 
> > The old code in compute_new_size was the last code in metaspace making
> > use of "capacity" as a concept. My idea was to remove "capacity"
> > completely from the metaspace code after this change.
> > 
> > Do you think that we still should add a comment discussing a concept
> > (capacity) that the code no longer will use? Is there a possibility
> > that people that are new to the code will become confused?
> 
> Capacity is a concept that is present in the GC code.
> I really, really don't want to see capacity come back
> into the metaspace code and adding a comment is
> the only way I can think of to prevent it.

I've uploaded a new patch with a comment:
http://cr.openjdk.java.net/~ehelin/8034852/webrev.01/

I've also added the description of the problem in this reveiew request as a
comment in the bug:
https://bugs.openjdk.java.net/browse/JDK-8034852

Hopefully, if someone wants to change used_after_gc, they will read the
comment, maybe do a "hg annotate" to see the change that caused the comment,
look up the bug and see the full description of the problem
(since the comment talks about serious bugs in the past).

What do you think?

Thanks,
Erik

> Jon
> 
> > Thanks,
> > Erik
> > 
> > On 2014-05-01 19:34, Jon Masamitsu wrote:
> >> Thanks.
> >> 
> >> Jon
> >> 
> >> On 4/30/2014 4:18 AM, Erik Helin wrote:
> >>> Hi all,
> >>> 
> >>> this patch solves a rather tricky problem with the sizing of Metaspace.
> >>> 
> >>> The issue happens when the GC threshold for Metaspace (called
> >>> "capacity_until_GC" in the code) becomes less than the committed
> >>> memory for Metaspace. Any calls to Metaspace::allocate that requires
> >>> committing more memory will then fail in
> >>> MetaspaceGC::allowed_expansion, because capacity_until_GC() <
> >>> MetaspaceAux::committed_memory(). The effect will be a
> >>> full GC and after the GC we try to expand and allocate. After the
> >>> 
> >>> expansion and before the allocation, one of two things can happen:
> >>>  1. capacity_until_GC is larger than the committed memory after the
> >>>  
> >>>     expansion. The allocation will now succeed, but the next allocation
> >>>     requiring a new chunk will *again* trigger a full GC. This pattern
> >>>     will repeat itself for each new allocation request requiring a new
> >>>     chunk.
> >>>  
> >>>  2. capacity_until_GC is still less than the committed memory even
> >>>  
> >>>     after the expansion. We throw a Java OOME (incorrectly).
> >>> 
> >>> How can the GC threshold for Metaspace be less than the committed
> >>> memory? The problem is that MetaspaceGC::compute_new_size uses the
> >>> field
> >>> _allocated_capacity for describing the amount of memory in Metaspace
> >>> that is "in use". _allocated_capacity does not consider the memory in
> >>> the chunk free lists to be "in use", since memory in the chunk free
> >>> lists are supposed to be available for new allocations. The problem is
> >>> that the chunk free lists can become fragmented, and then the memory
> >>> is not available for all kinds of allocations.
> >>> 
> >>> This patch change MetaspaceGC::compute_new_size to use
> >>> MetaspaceAux::committed_memory for describing how much memory that is
> >>> "in use". The effect will be that memory in the chunk free lists
> >>> will no
> >>> longer be considered "in use" (but will of course be used for future
> >>> allocations where possible). This will prevent capacity_until_GC from
> >>> shrinking below the committed memory "by definiton", since
> >>> capacity_until_GC can't be lower than the memory that is "in use".
> >>> 
> >>> Based on the results from the perf testing (see below), this change
> >>> has no performance impact.
> >>> 
> >>> Webrev:
> >>> http://cr.openjdk.java.net/~ehelin/8034852/webrev.00/
> >>> 
> >>> Testing:
> >>> - JPRT
> >>> 
> >>> - Ad-hoc testing:
> >>>   - Kitchensink
> >>>   - Dacapo
> >>>   - Medrec
> >>>   - runThese
> >>>   - Parallel Class Loading testlist
> >>>   - Metaspace testlist
> >>>   - GC nightly testlist
> >>> 
> >>> - Perf testing:
> >>>   - SPECjbb2005
> >>>   - SPECjbb2013
> >>>   - Derby
> >>> 
> >>> - Derby regression tests
> >>> 
> >>> Thanks,
> >>> Erik



More information about the hotspot-dev mailing list