RFR: 8049599: MetaspaceGC::_capacity_until_GC can overflow

Stefan Karlsson stefan.karlsson at oracle.com
Tue Sep 30 13:11:21 UTC 2014


On 2014-09-30 14:43, Erik Helin wrote:
> All,
>
> got some great feedback from StefanK:
> - Use cmpxchg_ptr instead of cmpxchg
> - Add a comment describing the while loop in expand_and_allocate
> - Add a comment in the test describing the overflow attempt
> - Change the loop in expand_and_allocate to do/while
> - Shorten the names of the local variables in expand_and_allocate
>
> The result can be seen in the following webrevs:
> - full: http://cr.openjdk.java.net/~ehelin/8049599/webrev.02/
> - inc: http://cr.openjdk.java.net/~ehelin/8049599/webrev.01-02/

Looks good.

thanks,
StefanK

>
> Thanks,
> Erik
>
> On 2014-09-24 18:32, Erik Helin wrote:
>> All,
>>
>> I've reworked the patch quite a bit based on (great!) internal feedback
>> from StefanK and Mikael Gerdin. The patch still uses an overflow check
>> and a CAS to update the high-water mark (HWM), but the new behavior
>> should be the same as the old one (which used Atomic::add_ptr).
>>
>> With the current code, each thread always increments the HWM, but there
>> is a race in that another thread can allocate metadata (due to the
>> increased HWM) before the thread that increased the HWM gets around to
>> allocate. With the new code, each thread will increase the pointer at
>> most once using a CAS. Even if increasing the HWM fails, the allocation
>> attempt might still succeed (for the reason described above).
>>
>> There is a theoretical problem of starvation in the new code, a thread
>> might forever fail to increase the HWM and forever fail to allocate due
>> to contention, but in practice this should not be a problem. In the
>> current code, Atomic::add_ptr is implemented as a CAS in a
>> (theoretically) never ending loop on non-x86 CPUs, so the same
>> theoretical starvation problem is present in the current code as well
>> (on non-x86 CPUs that is).
>>
>> Webrevs:
>> - full:
>>    http://cr.openjdk.java.net/~ehelin/8049599/webrev.01/
>> - incremental:
>>    http://cr.openjdk.java.net/~ehelin/8049599/webrev.00-01/
>>
>> Testing:
>> - JPRT
>> - Aurora:
>>    - Kitchensink
>>    - Weblogic+Medrec
>>    - runThese
>>    - vm.quick, regression, gc, compiler, runtime, parallel class 
>> loading,
>>      metaspace, oom
>>    - JTReg tests
>> - Running newly added JTREG test
>>
>> Thanks,
>> Erik
>>
>> On 2014-08-20 12:27, Erik Helin wrote:
>>> Hi all,
>>>
>>> this patch fixes a problem where Metaspace::_capacityUntilGC can
>>> overflow ("wrap around"). Since _capacityUntilGC is treated as a size_t
>>> everywhere it used, we won't calculate with negative numbers, but an
>>> eventual wrap around will still cause unnecessary GCs.
>>>
>>> The problem is solved by detecting an eventual wrap around in
>>> Metaspace::incCapacityUntilGC. The overflow check means that
>>> _capacityUntilGC now must be updated with a CAS. If the CAS fails more
>>> than five times due to contention, no update will be done, because this
>>> means that other threads must have incremented _capacityUntilGC (it is
>>> decremented only during a safepoint). This also means that a thread
>>> calling incCapacityUntilGC might have "its" requested memory 
>>> "stolen" by
>>> another thread, but incCapacityUntilGC has never given any fairness
>>> guarantees.
>>>
>>> The patch also adds two functions to the WhiteBox API to be able to
>>> update and read Metaspace::_capacityUntilGC from a JTREG test.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8049599
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~ehelin/8049599/webrev.00/
>>>
>>> Testing:
>>> - JPRT
>>> - Aurora ad-hoc testing (on all platforms, both 32-bit and 64-bit):
>>>    - Kitchensink, runThese and Dacapo
>>>    - JTREG tests
>>>    - Parallel Class Loading testlist
>>>    - GC, runtime and compiler testlists
>>>    - OOM and stress testlists
>>> - Running newly added JTREG test
>>>
>>> Thanks,
>>> Erik



More information about the hotspot-dev mailing list