RFR(S) 8204557: NMT: Linux os::committed_in_range() does not break out outer loop when contiguous region is found

Zhengyu Gu zgu at redhat.com
Thu Jun 7 19:53:39 UTC 2018


Hi Thomas,

Thanks for the review.

On 06/07/2018 02:51 PM, Thomas Stüfe wrote:
> Hi Zhengyu,
> 
> thank you for addressing this so quick!
> 
> The fix for (2) is nice and minimal.
> 
> ---
> 
> But about the second issue:
> 
> 3131     // During shutdown, some memory goes away without properly
> notifying NMT,
> 3132     // E.g. ConcurrentGCThread/WatcherThread can exit without
> deleting thread object.
> 3133     // Bailout and return as not committed for now.
> 3134     if (mincore_return_value == -1 && errno == ENOMEM) {
> 3135       return false;
> 3136     }
> 
> I still do not understand that part:
> 
> if testing one stripe of pages (1024 pages amount to 4MB) turns out to
> be completely uncommitted, mincore will return ENOMEM.
> See man mincore(2): "ENOMEM addr to addr + length contained unmapped memory."
> But why return false in that case, why not simply continue with the next stripe?
> 

I believe *unmapped* here, means the address space is invalid, not just 
without backing pages.

The memory tracked by NMT, should always be mapped/valid. If it 
encounters an unmapped/invalid address, that means it lost track 
somewhere (e.g. mentioned in comments)

> ---
> 
> May I suggest a further improvement, since you are touching the method anyway?
> 
> I am always a bit queasy about mincore and its implicit output array
> size. Very unlikely, but if our notion of page size should differ by
> some weird coincidence from the notion mincore has (which is
> sysconf(_SC_PAGESIZE)), we may get stack overwrites. I once had such a
> case on AIX.
> 
> So why not put a canary at the end:
> 
> unsigned char vec[stripe + 1];
> vec[stripe] = 'X';
> 
> ..
> mincore
> ..
> assert(vec[stripe] == 'X', "canary died");

Hmm, likely it will crash:-( I will add the assertion.

Updated webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.01/

Thanks,

-Zhengyu


> 
> ----
> 
> Best Regards, Thomas
> 
> 
> 
> On Thu, Jun 7, 2018 at 7:37 PM, Zhengyu Gu <zgu at redhat.com> wrote:
>> Thomas found this bug while in discussion of JDK-8202772 [1]
>>
>> When contiguous committed region is found, it only breaks out inner loop.
>> Instead of returning the found range, it continues scanning the rest to the
>> end.
>>
>> As the result, it squeezes all committed regions into one, even there are
>> holes.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8204557
>> Webrev: http://cr.openjdk.java.net/~zgu/8204557/webrev.00/
>>
>> Test:
>>
>>    hotspot_nmt and gtest on Linux x64 (fastdebug and release)
>>    Submit test also came back clean.
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2018-June/028419.html
>>
>>
>> Thanks,
>>
>> -Zhengyu


More information about the hotspot-runtime-dev mailing list