RFR: 8034852: Shrinking of Metaspace high-water-mark causes incorrect OutOfMemoryErrors or back-to-back GCs

Erik Helin erik.helin at oracle.com
Wed Apr 30 11:18:09 UTC 2014


Hi all,

this patch solves a rather tricky problem with the sizing of Metaspace.

The issue happens when the GC threshold for Metaspace (called
"capacity_until_GC" in the code) becomes less than the committed memory 
for Metaspace. Any call to Metaspace::allocate that requires committing 
more memory will then fail in MetaspaceGC::allowed_expansion, because 
capacity_until_GC() < MetaspaceAux::committed_memory(). The effect will be a
full GC and after the GC we try to expand and allocate. After the 
expansion and before the allocation, one of two things can happen:
  1. capacity_until_GC is larger than the committed memory after the
     expansion. The allocation will now succeed, but the next allocation
     requiring a new chunk will *again* trigger a full GC. This pattern
     will repeat itself for each new allocation request requiring a new
     chunk.
  2. capacity_until_GC is still less than the committed memory even
     after the expansion. We throw a Java OOME (incorrectly).
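
To make the failing check concrete, here is a minimal sketch of the logic
described above. It is a toy model and not the HotSpot source: ToyMetaspace
and its fields are hypothetical names, sizes are plain byte counts, and the
real MetaspaceGC::allowed_expansion may differ in its details.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model of the state discussed in this mail. Hypothetical names.
struct ToyMetaspace {
  std::size_t committed;          // stands in for MetaspaceAux::committed_memory()
  std::size_t capacity_until_gc;  // stands in for capacity_until_GC
  std::size_t max_size;           // assumed hard upper limit on committed memory
};

// How many more bytes may be committed without triggering a GC.
// Returns 0 as soon as the GC threshold is at or below the committed memory.
std::size_t allowed_expansion(const ToyMetaspace& m) {
  assert(m.committed <= m.max_size);
  if (m.capacity_until_gc <= m.committed) {
    return 0;  // every allocation that needs to commit memory fails here
  }
  std::size_t left_until_gc = m.capacity_until_gc - m.committed;
  std::size_t left_until_max = m.max_size - m.committed;
  return std::min(left_until_gc, left_until_max);
}
```

In this model, a threshold of 80 with 100 committed gives an allowed
expansion of 0, which is the state that leads to the two cases above.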

How can the GC threshold for Metaspace be less than the committed 
memory? The problem is that MetaspaceGC::compute_new_size uses the field
_allocated_capacity for describing the amount of memory in Metaspace 
that is "in use". _allocated_capacity does not consider the memory in 
the chunk free lists to be "in use", since memory in the chunk free 
lists is supposed to be available for new allocations. The problem is 
that the chunk free lists can become fragmented, and then the memory is 
not available for all kinds of allocations.

This patch changes MetaspaceGC::compute_new_size to use
MetaspaceAux::committed_memory for describing how much memory is 
"in use". The effect will be that memory in the chunk free lists will now
be considered "in use" (but will of course still be used for future 
allocations where possible). This will prevent capacity_until_GC from 
shrinking below the committed memory "by definition", since 
capacity_until_GC can't be lower than the memory that is "in use".
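
As a rough illustration of why the choice of "in use" measure matters,
consider the sketch below. It is a simplified model under stated
assumptions, not the actual compute_new_size: ToyUsage, committed() and
shrink_target() are hypothetical names, committed memory is assumed to be
exactly allocated capacity plus free list memory, and the shrink target is
assumed to be derived from a single "maximum free" percentage.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not HotSpot code. All names here are hypothetical.
struct ToyUsage {
  std::size_t allocated_capacity;  // memory in chunks that are handed out
  std::size_t free_list_bytes;     // committed memory in the chunk free lists
};

// Assumption: committed memory is just the sum of the two parts above.
std::size_t committed(const ToyUsage& u) {
  return u.allocated_capacity + u.free_list_bytes;
}

// Largest threshold to keep if at most max_free_percent of it may be free:
// in_use / (1 - max_free_percent / 100), in integer arithmetic.
std::size_t shrink_target(std::size_t in_use, unsigned max_free_percent) {
  assert(max_free_percent < 100);
  return in_use * 100 / (100 - max_free_percent);
}
```

With 40 units allocated, 60 units in the free lists and a 50 percent
ratio, the target computed from the allocated capacity is 80, which is
below the 100 units committed. Computed from the committed memory it is
200, and in this model it can never be lower than the committed memory.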

Based on the results from the perf testing (see below), this change has 
no performance impact.

Webrev:
http://cr.openjdk.java.net/~ehelin/8034852/webrev.00/

Testing:
- JPRT
- Ad-hoc testing:
   - Kitchensink
   - Dacapo
   - Medrec
   - runThese
   - Parallel Class Loading testlist
   - Metaspace testlist
   - GC nightly testlist
- Perf testing:
   - SPECjbb2005
   - SPECjbb2013
   - Derby
- Derby regression tests

Thanks,
Erik

