RFR(XS) 8191369: NMT: Enhance thread stack tracking

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Mon Feb 5 18:03:36 UTC 2018


Hi Zhengyu,

http://cr.openjdk.java.net/~zgu/8191369/webrev.02/src/hotspot/share/services/virtualMemoryTracker.cpp.udiff.html

      if (rgn->overlap_region(addr, size)) {
- // Clear a space for this region in the case it overlaps with any regions.
- remove_uncommitted_region(addr, size);
- break; // commit below
+ if (addr < rgn->base()) {
+ rgn->expand_region(addr, (rgn->base() - addr));
+ }
+
+ if (addr + size > rgn->base() + rgn->size()) {
+ rgn->expand_region(rgn->base() + rgn->size(),
+ (addr + size) - (rgn->base() + rgn->size()));
      }
+ return true;
+ }
+


The code to add the overlapping region does not look correct.  It 
doesn't add subtract the overlapping part and add in the part that's 
extended into the count of how big the section is.   Stefan has a 
pre-review for another fix that I'm looking at now to merge regions.  I 
think his design is simpler.   It adds the section, by deleting any 
overlapping memory, then has a loop to potentially merge sections.  The 
linked list of committed regions isn't generally very long.

Instead of adding a committed region for thread reserved region, since 
you call

+ size_t committed_size = os::committed_stack_size(stack_bottom, 
stack_size);


Each time, why not remove the committed region rather than trying to 
merge any additional committed area to it, ie:

remove_committed_region, call os::commited_stack_size(stack base, 
stack_size), add the committed region returned for printing.

You expect the committed region to be contiguous, don't you?

I can't comment on the os parts but they look legit.  If the os_linux. 
pd_committed_stack_size calls code that is available on the other posix 
platforms, can you move it there?  Can you call it 
committed_stack_size() and not have the pd_* layer as well?

Thanks,
Coleen

On 1/11/18 4:05 PM, Zhengyu Gu wrote:
> Hi,
>
> Can I get second (R)eviewer for this? and I also need a sponsor.
>
> This is JDK11 bug.
>
> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
> Bug: Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>
> Thanks,
>
> -Zhengyu
>
>
>
>
> On 11/20/2017 01:46 PM, Zhengyu Gu wrote:
>> Hi Andrew,
>>
>> Thanks for the review.
>>
>>>> I'm not sure about the logic of the test when you hit the the
>>>> ENOMEM/uncommited case.
>>>>
>>>> +    mincore_return_value = mincore(test_addr, page_sz, vec);
>>>> +
>>>> +    if (mincore_return_value == -1 || (committed_only && (vec[0] &
>>>> 0x01) == 0)) {
>>>> +      // Page is not mapped/committed go up
>>>> +      // to find first mapped/committed page
>>>> +      if (errno != EAGAIN || (committed_only && (vec[0] & 0x01) == 
>>>> 0)) {
>>>> +        assert(mincore_return_value != -1 || errno == ENOMEM,
>>>> "Unexpected mincore errno");
>>>>
>>>> Should that not be
>>>>
>>>>    +      if ((mincore_return_value != -1 && errno != EAGAIN) ||
>>>> (committed_only && (vec[0] & 0x01) == 0)) {
>>>
>>> Sorry, that correction should have been:
>>>
>>>    +      if ((mincore_return_value == -1 && errno != EAGAIN) ||
>>>    (committed_only && (vec[0] & 0x01) == 0)) {
>>
>> Thanks for pointing out. Updated accordingly:
>>
>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
>>
>> -Zhengyu
>>
>>>
>>>> A successful call to mincore is not guaranteed to reset errno and it
>>>> might by chance already have value EAGAIN before the call to 
>>>> mincore. If
>>>> we happened to hit a mapped, committed page first time then the vec 
>>>> bit
>>>> will be 1 and the code will loop and retest the same address. Unlikely
>>>> but possible?
>>>>
>>>>
>>>> The adjustment to the edge case logic now looks correct.
>>>>
>>>>> The test case for verifying fixes for above two issues can be found
>>>>> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
>>>>
>>>> Yes, that verifies the edge case handling e.g. I ran it with 256 pages
>>>> and with 127/8/9 mapped and it gave the right results.
>>>>
>>>>> As mentioned in early email, new patch also contains 
>>>>> implementation for
>>>>> Windows.
>>>>>
>>>>> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>>>>>
>>>>> Test:
>>>>>
>>>>>    Reran hotspot_tier1 tests on Windows x64 and Linux x64 
>>>>> (fastdebug and
>>>>> release)
>>>> I have no idea about the validity of the Windows fix but the Linux one
>>>> looks good modulo that if condition.
>>>>
>>>> regards,
>>>>
>>>>
>>>> Andrew Dinn
>>>> -----------
>>>> Senior Principal Software Engineer
>>>> Red Hat UK Ltd
>>>> Registered in England and Wales under Company Registration No. 
>>>> 03798903
>>>> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>>>>
>>>



More information about the hotspot-runtime-dev mailing list