RFR(S) 8204557: NMT: Linux os::committed_in_range() does not break out outer loop when contiguous region is found
Zhengyu Gu
zgu at redhat.com
Thu Jun 7 19:53:39 UTC 2018
Hi Thomas,
Thanks for the review.
On 06/07/2018 02:51 PM, Thomas Stüfe wrote:
> Hi Zhengyu,
>
> thank you for addressing this so quick!
>
> The fix for (2) is nice and minimal.
>
> ---
>
> But about the second issue:
>
> 3131 // During shutdown, some memory goes away without properly
> notifying NMT,
> 3132 // E.g. ConcurrentGCThread/WatcherThread can exit without
> deleting thread object.
> 3133 // Bailout and return as not committed for now.
> 3134 if (mincore_return_value == -1 && errno == ENOMEM) {
> 3135 return false;
> 3136 }
>
> I still do not understand that part:
>
> if testing one stripe of pages (1024 pages amount to 4MB) turns out to
> be completely uncommitted, mincore will return ENOMEM.
> See man mincore(2): "ENOMEM addr to addr + length contained unmapped memory."
> But why return false in that case, why not simply continue with the next stripe?
>
I believe *unmapped* here, means the address space is invalid, not just
without backing pages.
The memory tracked by NMT, should always be mapped/valid. If it
encounters an unmapped/invalid address, that means it lost track
somewhere (e.g. mentioned in comments)
> ---
>
> May I suggest a further improvement, since you are touching the method anyway?
>
> I am always a bit queasy about mincore and its implicit output array
> size. Very unlikely, but if our notion of page size should differ by
> some weird coincidence from the notion mincore has (which is
> sysconf(_SC_PAGESIZE)), we may get stack overwrites. I once had such a
> case on AIX.
>
> So why not put a canary at the end:
>
> unsigned char vec[stripe + 1];
> vec[stripe] = 'X';
>
> ..
> mincore
> ..
> assert(vec[stripe] == 'X', "canary died");
Hmm, likely it will crash:-( I will add the assertion.
Updated webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.01/
Thanks,
-Zhengyu
>
> ----
>
> Best Regards, Thomas
>
>
>
> On Thu, Jun 7, 2018 at 7:37 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>> Thomas found this bug while in discussion of JDK-8202772 [1]
>>
>> When contiguous committed region is found, it only breaks out inner loop.
>> Instead of returning the found range, it continues scanning the rest to the
>> end.
>>
>> As the result, it squeezes all committed regions into one, even there are
>> holes.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8204557
>> Webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.00/
>>
>> Test:
>>
>> hotspot_nmt and gtest on Linux x64 (fastdebug and release)
>> Submit test also came back clean.
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-June/028419.html
>>
>>
>> Thanks,
>>
>> -Zhengyu
More information about the hotspot-runtime-dev
mailing list