RFR(XS) 8191369: NMT: Enhance thread stack tracking

Wed Feb 7 13:08:19 UTC 2018

Hi Zhengyu,

(All, sorry for hi-jacking this thread)

I haven't followed this review thread, but I see that Coleen is 
referring to my patch, so for completeness here it is (I consider it a 
prototype and it's not ready to be committed):

http://cr.openjdk.java.net/~stefank/8196217/redo-prototype.01/

The patch tries to fix multiple bugs in the code above. It removes parts 
of previously registered regions that overlap with the current region to 
be committed. It then tries to expand and merge with surrounding 
regions, or add a completely new region if there are no adjacent regions.

I temporarily changed the following part to exercise the code that deals 
with overlapping commits:

      if (node != NULL) {
-      node->data()->set_all_committed(all_committed);
+      // TEMPORARY: Turn off all_committed feature to provoke more 
splitting and merging.
+      node->data()->set_all_committed(false /*all_committed*/);

I'm not sure it's a good idea to have two ways to register committed 
regions. I think it would be better to have only one way, and that we 
make sure that that code is stable. As the current code is written it is 
possible to fully commit a region, than commit an overlapping region 
with another stack trace, without that being registered. See:

   85   // TODO: What if stack is different from the reserved regions stack?
   86   // TODO: Maybe assert that we don't call add_committed_region on 
all_committed regions.
   87   if (all_committed()) {
   88     return true;
   89   }

It think this is a pre-existing bug:
-          CommittedMemoryRegion high_rgn(high_base, high_size, 
NativeCallStack::EMPTY_STACK);
+          CommittedMemoryRegion high_rgn(high_base, high_size, 
*call_stack());
            return add_committed_region(high_rgn);

I've temporarily added some guarantees, because I think you can end up 
with empty regions, and when/if that happens using "end - 1" is broken:

+        guarantee(sz > 0, "Sanity check that end - 1 isn't out-of-bounds");

There's one part of the code that I'd like to ask you:

  bool ReservedMemoryRegion::remove_uncommitted_region(address addr, 
size_t sz) {
-  // uncommit stack guard pages
-  if (flag() == mtThreadStack && !same_region(addr, sz)) {
-    return true;
-  }
-

Why do we have this special handling of the mtThreadStack region? I 
think it would be cleaner if ReservedMemoryRegion didn't have 
specialized code depending on the mtFlag. If we remove the section 
above, I can make add_committed_region work with the patch above, but 
with the code in place we would/could end up with double accounting and 
bugs.

I've also started to write a C++ gtest to exercise this code path, 
because the current WhiteBox testing API doesn't have support to change 
the call stack of committed regions.

A side note, there are a few bugs in code using NMT that I need to write 
bugs for:

1) os::attempt_reserve_memory_at incorrectly calls 
MemTracker::record_virtual_memory_reserve_and_commit instead of 
MemTracker::record_virtual_memory_reserve when file_desc == -1.

2) os::attempt_reserve_memory_at calls pd_attempt_reserve_memory_at and 
then MemTracker::record_virtual_memory_reserve_and_commit. But, on 
windows pd_attempt_reserve_memory calls os::reserve_memory, which calls 
    MemTracker::record_virtual_memory_reserve. This has the effect that 
the all_committed() property is set to false on Windows, but true on Linux.

Thanks,
StefanK

On 2018-02-06 19:17, Zhengyu Gu wrote:
> Hi Coleen,
> 
> On 02/05/2018 01:03 PM, coleen.phillimore at oracle.com wrote:
>>
>> Hi Zhengyu,
>>
>> http://cr.openjdk.java.net/~zgu/8191369/webrev.02/src/hotspot/share/services/virtualMemoryTracker.cpp.udiff.html 
>>
>>
>>       if (rgn->overlap_region(addr, size)) {
>> - // Clear a space for this region in the case it overlaps with any 
>> regions.
>> - remove_uncommitted_region(addr, size);
>> - break; // commit below
>> + if (addr < rgn->base()) {
>> + rgn->expand_region(addr, (rgn->base() - addr));
>> + }
>> +
>> + if (addr + size > rgn->base() + rgn->size()) {
>> + rgn->expand_region(rgn->base() + rgn->size(),
>> + (addr + size) - (rgn->base() + rgn->size()));
>>       }
>> + return true;
>> + }
>> +
>>
>>
>> The code to add the overlapping region does not look correct.  It 
>> doesn't add subtract the overlapping part and add in the part that's 
>> extended into the count of how big the section is.   Stefan has a 
>> pre-review for another fix that I'm looking at now to merge regions.  
>> I think his design is simpler.   It adds the section, by deleting any 
>> overlapping memory, then has a loop to potentially merge sections.  
>> The linked list of committed regions isn't generally very long.
>>
> Yes, I failed to adjust summary amount here, thanks for catching this!
> 
> 
>> Instead of adding a committed region for thread reserved region, since 
>> you call
>>
>> + size_t committed_size = os::committed_stack_size(stack_bottom, 
>> stack_size);
>>
>>
>> Each time, why not remove the committed region rather than trying to 
>> merge any additional committed area to it, ie:
>>
>> remove_committed_region, call os::commited_stack_size(stack base, 
>> stack_size), add the committed region returned for printing.
> No, I can not, cause I don't know previous stack size to uncommit it.
> 
> Stefan's patch touched similar area, but not the same. I think we have 
> some corner cases that are not handled properly. ex. once region is 
> expanded, it can become adjacent region to next one, etc.
> 
>>
>> You expect the committed region to be contiguous, don't you?
> 
> Yes, for this case and most of cases at the time this code was written. 
> But not anymore, Shenandoah can uncommit arbitrary regions and recommit 
> them in arbitrary orders.
> 
> I can hold off this patch till Stefan's patch get in, if it will happen 
> soon.
> 
>>
>> I can't comment on the os parts but they look legit.  If the os_linux. 
>> pd_committed_stack_size calls code that is available on the other 
>> posix platforms, can you move it there?  Can you call it 
>> committed_stack_size() and not have the pd_* layer as well?
> 
> Linux pd_committed_stack_size() uses mincore(), and is Linux only API. 
> There are only Linux and Windows implementation with this patch.
> 
> so, you want something like this?
> 
> #if !defined(LINUX) && !defined(_WINDOWS)
> size_t os::committed_stack_size(address bottom, size_t size) {
>    return size;
> }
> #endif
> 
> and leave Linux/Windows' implementation under os directory?
> 
> Thanks,
> 
> -Zhengyu
> 
> 
> 
> 
>>
>> Thanks,
>> Coleen
>>
>> On 1/11/18 4:05 PM, Zhengyu Gu wrote:
>>> Hi,
>>>
>>> Can I get second (R)eviewer for this? and I also need a sponsor.
>>>
>>> This is JDK11 bug.
>>>
>>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
>>> Bug: Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>>
>>>
>>>
>>> On 11/20/2017 01:46 PM, Zhengyu Gu wrote:
>>>> Hi Andrew,
>>>>
>>>> Thanks for the review.
>>>>
>>>>>> I'm not sure about the logic of the test when you hit the the
>>>>>> ENOMEM/uncommited case.
>>>>>>
>>>>>> +    mincore_return_value = mincore(test_addr, page_sz, vec);
>>>>>> +
>>>>>> +    if (mincore_return_value == -1 || (committed_only && (vec[0] &
>>>>>> 0x01) == 0)) {
>>>>>> +      // Page is not mapped/committed go up
>>>>>> +      // to find first mapped/committed page
>>>>>> +      if (errno != EAGAIN || (committed_only && (vec[0] & 0x01) 
>>>>>> == 0)) {
>>>>>> +        assert(mincore_return_value != -1 || errno == ENOMEM,
>>>>>> "Unexpected mincore errno");
>>>>>>
>>>>>> Should that not be
>>>>>>
>>>>>>    +      if ((mincore_return_value != -1 && errno != EAGAIN) ||
>>>>>> (committed_only && (vec[0] & 0x01) == 0)) {
>>>>>
>>>>> Sorry, that correction should have been:
>>>>>
>>>>>    +      if ((mincore_return_value == -1 && errno != EAGAIN) ||
>>>>>    (committed_only && (vec[0] & 0x01) == 0)) {
>>>>
>>>> Thanks for pointing out. Updated accordingly:
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
>>>>
>>>> -Zhengyu
>>>>
>>>>>
>>>>>> A successful call to mincore is not guaranteed to reset errno and it
>>>>>> might by chance already have value EAGAIN before the call to 
>>>>>> mincore. If
>>>>>> we happened to hit a mapped, committed page first time then the 
>>>>>> vec bit
>>>>>> will be 1 and the code will loop and retest the same address. 
>>>>>> Unlikely
>>>>>> but possible?
>>>>>>
>>>>>>
>>>>>> The adjustment to the edge case logic now looks correct.
>>>>>>
>>>>>>> The test case for verifying fixes for above two issues can be found
>>>>>>> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
>>>>>>
>>>>>> Yes, that verifies the edge case handling e.g. I ran it with 256 
>>>>>> pages
>>>>>> and with 127/8/9 mapped and it gave the right results.
>>>>>>
>>>>>>> As mentioned in early email, new patch also contains 
>>>>>>> implementation for
>>>>>>> Windows.
>>>>>>>
>>>>>>> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>>>>>>>
>>>>>>> Test:
>>>>>>>
>>>>>>>    Reran hotspot_tier1 tests on Windows x64 and Linux x64 
>>>>>>> (fastdebug and
>>>>>>> release)
>>>>>> I have no idea about the validity of the Windows fix but the Linux 
>>>>>> one
>>>>>> looks good modulo that if condition.
>>>>>>
>>>>>> regards,
>>>>>>
>>>>>>
>>>>>> Andrew Dinn
>>>>>> -----------
>>>>>> Senior Principal Software Engineer
>>>>>> Red Hat UK Ltd
>>>>>> Registered in England and Wales under Company Registration No. 
>>>>>> 03798903
>>>>>> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
>>>>>>
>>>>>
>>