RFR(S) 8204557: NMT: Linux os::committed_in_range() does not break out outer loop when contiguous region is found

Thomas Stüfe thomas.stuefe at gmail.com
Thu Jun 7 20:05:39 UTC 2018


On Thu, Jun 7, 2018 at 9:53 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> Hi Thomas,
>
> Thanks for the review.
>
> On 06/07/2018 02:51 PM, Thomas Stüfe wrote:
>>
>> Hi Zhengyu,
>>
>> thank you for addressing this so quick!
>>
>> The fix for (2) is nice and minimal.
>>
>> ---
>>
>> But about the second issue:
>>
>> 3131     // During shutdown, some memory goes away without properly
>> notifying NMT,
>> 3132     // E.g. ConcurrentGCThread/WatcherThread can exit without
>> deleting thread object.
>> 3133     // Bailout and return as not committed for now.
>> 3134     if (mincore_return_value == -1 && errno == ENOMEM) {
>> 3135       return false;
>> 3136     }
>>
>> I still do not understand that part:
>>
>> if testing one stripe of pages (1024 pages amount to 4MB) turns out to
>> be completely uncommitted, mincore will return ENOMEM.
>> See man mincore(2): "ENOMEM addr to addr + length contained unmapped
>> memory."
>> But why return false in that case, why not simply continue with the next
>> stripe?
>>
>
> I believe *unmapped* here, means the address space is invalid, not just
> without backing pages.
>
> The memory tracked by NMT, should always be mapped/valid. If it encounters
> an unmapped/invalid address, that means it lost track somewhere (e.g.
> mentioned in comments)
>

Oh, you are right! Sorry for the noise, I misunderstood the man page completely.

>> ---
>>
>> May I suggest a further improvement, since you are touching the method
>> anyway?
>>
>> I am always a bit queasy about mincore and its implicit output array
>> size. Very unlikely, but if our notion of page size should differ by
>> some weird coincidence from the notion mincore has (which is
>> sysconf(_SC_PAGESIZE)), we may get stack overwrites. I once had such a
>> case on AIX.
>>
>> So why not put a canary at the end:
>>
>> unsigned char vec[stripe + 1];
>> vec[stripe] = 'X';
>>
>> ..
>> mincore
>> ..
>> assert(vec[stripe] == 'X', "canary died");
>
>
> Hmm, likely it will crash:-( I will add the assertion.
>
> Updated webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.01/
>
> Thanks,
>
> -Zhengyu
>

This looks fine now. Thank you for fixing.

..Thomas

>
>
>>
>> ----
>>
>> Best Regards, Thomas
>>
>>
>>
>> On Thu, Jun 7, 2018 at 7:37 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>>>
>>> Thomas found this bug while in discussion of JDK-8202772 [1]
>>>
>>> When contiguous committed region is found, it only breaks out inner loop.
>>> Instead of returning the found range, it continues scanning the rest to
>>> the
>>> end.
>>>
>>> As the result, it squeezes all committed regions into one, even there are
>>> holes.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8204557
>>> Webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.00/
>>>
>>> Test:
>>>
>>>    hotspot_nmt and gtest on Linux x64 (fastdebug and release)
>>>    Submit test also came back clean.
>>>
>>> [1]
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-June/028419.html
>>>
>>>
>>> Thanks,
>>>
>>> -Zhengyu


More information about the hotspot-runtime-dev mailing list