RFR: 8226236: [TESTBUG] win32: gc/metaspace/TestCapacityUntilGCWrapAround.java fails
Thomas Stuefe
stuefe at openjdk.java.net
Wed Oct 14 10:46:11 UTC 2020
On Wed, 14 Oct 2020 09:31:13 GMT, Yasumasa Suenaga <ysuenaga at openjdk.org> wrote:
>> A minimal fix - which preserves the current behavior as much as possible - could be:
>>
>> --- a/src/hotspot/share/memory/metaspace.cpp
>> +++ b/src/hotspot/share/memory/metaspace.cpp
>> @@ -152,7 +152,7 @@ bool MetaspaceGC::inc_capacity_until_GC(size_t v, size_t* new_cap_until_GC, size
>>
>> if (new_value < old_capacity_until_GC) {
>> // The addition wrapped around, set new_value to aligned max value.
>> - new_value = align_down(max_uintx, Metaspace::commit_alignment());
>> + new_value = align_down(max_uintx, Metaspace::reserve_alignment());
>> }
>>
>> if (new_value > MaxMetaspaceSize) {
>>
>> This preserves the current behavior:
>> - increasing the gc threshold beyond MaxMetaspaceSize should be an error
>> - on 32bit, increasing the gc threshold beyond the scope of the 32bit counter should be tolerated and result in a value
>> capped at the end of 32bit; unless MaxMetaspaceSize is set and lower than max, in which case it would be an error.
>>
>> Note that I have no possibility to test this since the 32bit build on Windows still seems broken.
>
> Thanks @tstuefe for the fix! I will merge it to this PR.
>
>> (This must have been always a problem on Windows, and only recently on Linux, right?)
>
> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It
> is caused by new_value exceeds MaxMetaspaceSize.
> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in
> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that
> the memory was still available. And also @shipilev seems to had seen [the problems with Metaspace on
> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW
> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether
> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms.
Hi Yasumasa,
> Thanks @tstuefe for the fix! I will merge it to this PR.
>
No I think this is the fix. I still think the test is correct and does what it is supposed to do, see below.
> > (This must have been always a problem on Windows, and only recently on Linux, right?)
>
> I don't know how long this problem has been available, but I saw this on Linux x64 when I removed `vm.bits == 32`. It
> is caused by new_value exceeds MaxMetaspaceSize.
You see this on 64bit because the test, as it is coded now, is not meant to run on 64bit.
When running on 64bit first you will trigger this exception:
jlong max_size_t = (jlong) ((size_t) -1);
if (inc > max_size_t) {
THROW_MSG_0(vmSymbols::java_lang_IllegalArgumentException(),
err_msg("WB_IncMetaspaceCapacityUntilGC: inc does not fit in size_t: " JLONG_FORMAT, inc));
}
because it was written with 32bit in mind. Its purpose is to check if the inc value - itself a jlong - exceeds the
range for a 32bit unsigned. It uses a jlong - 64bit signed - as comparison variable and sets it to (size_t)-1, clearly
to get a 32bit SIZE_MAX. On 64bit it is oc a real -1, so the comparison will always be true.
Once you fix that or comment it out, on 64bit you will trigger this test assertion:
Asserts.assertLTE(after, MAX_UINT,
"Increasing with MAX_UINT should not cause value larger than MAX_UINT:" + after);
because we just increased the gc threshold beyond the 32bit MAX_UINT value coded in the test.
Granted, one could make this test cleaner and make it work on 64bit. But what for? This overflow is only an issue on
32bit. It theoretically could be an issue for 64bit too, if you call the increase function just very very often enough,
but I think that is a bit far fetched.
If you enable the test for 64 bit you have to make sure that you call incMetaspaceCapacityUntilGC() with either a high
enough increase or often enough to ensure that you hit the limit of a 64bit gc threshold counter. And then fix the
overflow handling. But I seriously think this is not a real world issue.
Alternatively one could #ifdef the whole whitebox function out for 64bit, just compile it on 32bit.
>
> I think we should be current behavior should be preserved. I added the logic which relates to `_capacity_until_gc` in
> [JDK-8217432](https://bugs.openjdk.java.net/browse/JDK-8217432). Before this fix, I saw OutOfMemoryError despite that
> the memory was still available. And also @shipilev seems to had seen [the problems with Metaspace on
> Epsilon](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-January/024595.html). BTW
> TestCapacityUntilGCWrapAround.java still should be run on 32bit platforms only? I think we should check whether
> `_capacity_until_gc` exceeds MaxMetaspaceSize on 64bit platforms.
Look at it this way:
The tests are called with MaxMetaspaceSize unset, hence "infinite". Therefore the VM should *never* run into a
situation where it thinks that the GC threshold is larger than MaxMetaspaceSize. Regardless what input values we feed
into the increase function. The test feeds it very large values on purpose. Reducing the inc values defeats the purpose
of this test.
The error is caused because, technically, MaxMetaspaceSize="infinite" is in reality implemented by
MaxMetaspaceSize="very large value but a bit smaller than SIZE_MAX because of alignment". Which gives a small window,
in this case of 16K, for values to be larger than MaxMetaspaceSize.
And then the overflow handling is broken and causes, as part of its value correction, the gc threshold to fall into
this 16K window and be larger than MaxMetaspaceSize. And that exactly is the error.
The test really could be clearer and have comments and such, and the GC metaspace coding too. But the test is not wrong.
-------------
PR: https://git.openjdk.java.net/jdk/pull/628
More information about the hotspot-gc-dev
mailing list