PING: RFR: 8217432: MetaspaceGC::_capacity_until_GC exceeds MaxMetaspaceSize

Yasumasa Suenaga yasuenag at gmail.com
Wed Jan 30 12:49:25 UTC 2019


Hi Thomas,

>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8217432/webrev.01/
>>
>> I write reply based on it as below.
> 
>    internal testing shows no issues with the patch.

Thanks!


>> As I said before, I added assert() to inc_capacity_until_GC().
>> It will fail without other change in this webrev. So I think we can
>> evaluate it via
>> vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java .
> 
> Okay, I guess that's enough. Please add a @bugid with this CR number to
> the test then.

I will add it.


>> I added the code to check total amount of _capacity_until_GC
>> before inc_capacity_until_GC() call.
>> But I concern the check and increment are not atomically. Is it okay?
> 
> If the check were located after the Atomic::cmpxchg() and after the
> check whether it had been this thread to change the value it would be
> fine :)

Do you suggest as following code?

------------------
@@ -1478,6 +1482,12 @@
    // have incremented the HWM and therefore the allocation might still succeed.
    do {
      incremented = MetaspaceGC::inc_capacity_until_GC(delta_bytes, &after, &before);
+    // Check after the increment that we did not go over the maximum.
+    // We can not do this earlier due to potential races.
+    if (!incremented && (after > MaxMetaspaceSize)) {
+      MetaspaceGC::dec_capacity_until_GC(delta_bytes);
+      return false;
+    }
      res = allocate(word_size, mdtype);
    } while (!incremented && res == NULL);

------------------

If so, assert() in inc_capacity_until_GC() might be failed
because _capacity_until_GC + delta_bytes might exceed MaxMetaspaceSize.

If we can change arguments of inc_capacity_until_GC(), we can pass
expected value to it (_capacity_until_GC + delta_bytes).
If so, we can compliant MaxMetaspaceSize via cmpxchg.


Thanks,

Yasumasa


On 2019/01/30 20:04, Thomas Schatzl wrote:
> Hi Yasumasa,
> 
> On Tue, 2019-01-29 at 21:39 +0900, Yasumasa Suenaga wrote:
>> Thank you for your comment.
>> I uploaded new webrev (it passed all tests on submit repo):
>>
>>     http://cr.openjdk.java.net/~ysuenaga/JDK-8217432/webrev.01/
>>
>> I write reply based on it as below.
> 
>    internal testing shows no issues with the patch.
> 
>> On 2019/01/28 22:55, Thomas Schatzl wrote:
>>> Hi Yasumasa,
>>>
>>> On Fri, 2019-01-25 at 11:35 +0900, Yasumasa Suenaga wrote:
>>>> PING: Could you review this?
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/01/21 17:52, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> I filed this issue to JBS:
>>>>>      https://bugs.openjdk.java.net/browse/JDK-8217432
>>>>>
>>>>> I investigated it, and I found that `minimum_desired_capacity`
>>>>> and `maximum_desired_capacity` in
>>>>> `MetaspaceGC::compute_new_size()`
>>>>> might exceed MaxMetaspaceSize.
>>>>> They shouldn't exceed MaxMetaspaceSize.
>>>>>
>>>>> I uploaded webrev for this bug. Could you review it?
>>>>>      http://cr.openjdk.java.net/~ysuenaga/JDK-8217432/webrev.00/
>>>>>
>>>>> This change has passed all tests on submit repo,
>>>>> :vmTestbase_vm_metaspace,
>>>>> and :vmTestbase_nsk_sysdict jtreg tests.
>>>
>>> The suggested change does not work - you can't return false from
>>> MetaspaceGC::inc_capacity_until_gc() because the return value
>>> indicates whether it was possible to change the value, not whether
>>> the value passed sense.
>>>
>>> The code before that must make sure that the resulting value is
>>> within limits (that could be a good assert in
>>> inc_capacity_until_GC()) from my understanding, without being an
>>> expert here.
>>>
>>> So with the current change, the assert in metaspace.cpp:260 would
>>> simply fire in this case.
>>
>> I added assert() to inc_capacity_until_GC() because
>> _capacity_until_GC shouldn't exceed MaxMetaspaceSize.
>> This function will not return false when _capacity_until_GC attempt
>> to be exceeded MaxMetaspaceSize.
>>
>>
>>> Making sure that the HWM does not go over MaxMetaspaceSize could be
>>> achieved by limiting expand_bytes passed to the method; I *think*,
>>> without following where minimum_desired_capacity is also used in
>>> that method, clamping it via
>>>
>>> 241:   minimum_desired_capacity =
>>> MIN2(MAX2(minimum_desired_capacity,
>>> MetaspaceSize), MaxMetaspaceSize);
>>>
>>> would achieve the desired effect.
>>
>> Your suggestion seems to be equivalent to my change.
> 
> Okay :)
> 
>> -------------
>>      const double min_tmp = used_after_gc / maximum_used_percentage;
>>      size_t minimum_desired_capacity =
>> -    (size_t)MIN2(min_tmp, double(max_uintx));
>> +    (size_t)MIN2(min_tmp, double(MaxMetaspaceSize));
>>      // Don't shrink less than the initial generation size
>>      minimum_desired_capacity = MAX2(minimum_desired_capacity,
>>                                      MetaspaceSize);
>> -------------
>>
>> I don't understand why current implementation use `max_uintx` at
>> this.
>> I guess it means "max value of Metaspace", so I think it should use
>> MaxMetaspaceSize rather than max_uintx.
> 
> I agree.
> 
>>> Could you also provide a test case for this change (i.e. failing
>>> before, good after) as Aleksey suggested?
>>
>> It is hard to reproduce this problem because this issue is caused by
>> C++ member variable (MetaspaceGC::_capacity_until_GC).
>>
>> It can be tested with WhiteBox test (incMetaspaceCapacityUntilGC()),
>> however it tests inc_capacity_until_GC() call only.
>>
>> OTOH we might test as Aleksey said, but we can't estimate precise
>> commit memory size of Metaspace, and evaluate why OOME is occurred.
>> (memory consumption or invalid _capacity_until_GC)
>>
>> As I said before, I added assert() to inc_capacity_until_GC().
>> It will fail without other change in this webrev. So I think we can
>> evaluate it via
>> vmTestbase/metaspace/shrink_grow/ShrinkGrowTest/ShrinkGrowTest.java .
> 
> Okay, I guess that's enough. Please add a @bugid with this CR number to
> the test then.
> 
>>> Note that my suggestion only fixes the call to
>>> inc_capacity_until_gc() in MetaspaceGC::compute_new_size(). I very
>>> much believe it is required to add another guard in
>>> ClassLoaderMetaspace::expand_and_allocate().
>>>
>>> (Please add the assert in inc_capacity_until_gc() I mentioned
>>> above!).
>>
>> I added the code to check total amount of _capacity_until_GC
>> before inc_capacity_until_GC() call.
>> But I concern the check and increment are not atomically. Is it okay?
> 
> If the check were located after the Atomic::cmpxchg() and after the
> check whether it had been this thread to change the value it would be
> fine :)
> 
> Please add a comment like "Check after the increment that we did not go
> over the maximum. We can not do this earlier due to potential races"
> before it to explain its location.
> 
> Thanks,
>    Thomas
> 
> 



More information about the hotspot-gc-dev mailing list