RFR(XS) 8191369: NMT: Enhance thread stack tracking

Thomas Stüfe thomas.stuefe at gmail.com
Mon Mar 5 16:21:30 UTC 2018


Hi Zhengyu,

about the os_windows.cpp part:

Are you sure this works as intended? I was not able to test it yet. It may
work but makes some assumptions: that the stack top you are calculating is
actually the end of a heap region; that you can just pass any address as
the first parameter to VirtualQuery (the documentation is a bit ambiguous
here); even that all committed pages must be consecutive, which does not
have to be the case (we had cases on HyperV where there were holes in the
committed thread stack area).

I probably would have reused the coding in  os::current_stack_base(),
because we know that works:

- call VirtualQuery with an address on the stack to get
MEMORY_BASIC_INFORMATION::AllocationBase. We could just use stack bottom
here as input.
- then walk all regions upward and sum up the sizes of the committed ones,
until we hit stack top - which supposedly should coincide with the end of a
heap region but who knows.

Best Regards, Thomas




On Fri, Feb 23, 2018 at 5:52 PM, <coleen.phillimore at oracle.com> wrote:

>
> This looks good but I hope someone else can review the os_linux and
> os_windows parts.
>
> I will sponsor this once the reviews are complete.
>
> thanks,
> Coleen
>
>
> On 2/21/18 10:49 AM, Zhengyu Gu wrote:
>
>> Hi,
>>
>> I rebased the patch after JDK-8196405[1]. Stefan's patch fixed virtual
>> memory tracking issues I tried to address in early patch.
>>
>>
>> In additions, I removed os::pd_committed_stack_size() methods as Coleen
>> suggested. Also added new gtest.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.03/
>>
>> Test:
>>   hotspot_runtime, test-hotspot-gtest on Linux 64 with fastdebug and
>> release builds.
>>
>>
>> [1] http://cr.openjdk.java.net/~stefank/8196405/webrev.02/
>>
>> Thanks,
>>
>> -Zhengyu
>>
>>
>> On 02/12/2018 09:33 AM, Zhengyu Gu wrote:
>>
>>>
>>>
>>>> No, what you have is better.   I dislike the delegation to the
>>>> duplicate function signature in the pd_* layer.   Can you make the
>>>> os::committed_stack_size() functions in the platforms where this isn't
>>>> supported just be empy functions?
>>>>
>>> Okay.
>>>
>>>
>>>> If there are platforms that don't have the ability to track committed
>>>> regions of the mtThread stack, don't you still need the concept of
>>>> all_committed?
>>>>
>>> No, it exists for convenience. However, stack tracking code does need
>>> adjustment.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>>
>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>>
>>>>> Thanks,
>>>>>
>>>>> -Zhengyu
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 1/11/18 4:05 PM, Zhengyu Gu wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Can I get second (R)eviewer for this? and I also need a sponsor.
>>>>>>>
>>>>>>> This is JDK11 bug.
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
>>>>>>> Bug: Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -Zhengyu
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 11/20/2017 01:46 PM, Zhengyu Gu wrote:
>>>>>>>
>>>>>>>> Hi Andrew,
>>>>>>>>
>>>>>>>> Thanks for the review.
>>>>>>>>
>>>>>>>> I'm not sure about the logic of the test when you hit the the
>>>>>>>>>> ENOMEM/uncommited case.
>>>>>>>>>>
>>>>>>>>>> +    mincore_return_value = mincore(test_addr, page_sz, vec);
>>>>>>>>>> +
>>>>>>>>>> +    if (mincore_return_value == -1 || (committed_only && (vec[0]
>>>>>>>>>> &
>>>>>>>>>> 0x01) == 0)) {
>>>>>>>>>> +      // Page is not mapped/committed go up
>>>>>>>>>> +      // to find first mapped/committed page
>>>>>>>>>> +      if (errno != EAGAIN || (committed_only && (vec[0] & 0x01)
>>>>>>>>>> == 0)) {
>>>>>>>>>> +        assert(mincore_return_value != -1 || errno == ENOMEM,
>>>>>>>>>> "Unexpected mincore errno");
>>>>>>>>>>
>>>>>>>>>> Should that not be
>>>>>>>>>>
>>>>>>>>>>    +      if ((mincore_return_value != -1 && errno != EAGAIN) ||
>>>>>>>>>> (committed_only && (vec[0] & 0x01) == 0)) {
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sorry, that correction should have been:
>>>>>>>>>
>>>>>>>>>    +      if ((mincore_return_value == -1 && errno != EAGAIN) ||
>>>>>>>>>    (committed_only && (vec[0] & 0x01) == 0)) {
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for pointing out. Updated accordingly:
>>>>>>>>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
>>>>>>>>
>>>>>>>> -Zhengyu
>>>>>>>>
>>>>>>>>
>>>>>>>>> A successful call to mincore is not guaranteed to reset errno and
>>>>>>>>>> it
>>>>>>>>>> might by chance already have value EAGAIN before the call to
>>>>>>>>>> mincore. If
>>>>>>>>>> we happened to hit a mapped, committed page first time then the
>>>>>>>>>> vec bit
>>>>>>>>>> will be 1 and the code will loop and retest the same address.
>>>>>>>>>> Unlikely
>>>>>>>>>> but possible?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The adjustment to the edge case logic now looks correct.
>>>>>>>>>>
>>>>>>>>>> The test case for verifying fixes for above two issues can be
>>>>>>>>>>> found
>>>>>>>>>>> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, that verifies the edge case handling e.g. I ran it with 256
>>>>>>>>>> pages
>>>>>>>>>> and with 127/8/9 mapped and it gave the right results.
>>>>>>>>>>
>>>>>>>>>> As mentioned in early email, new patch also contains
>>>>>>>>>>> implementation for
>>>>>>>>>>> Windows.
>>>>>>>>>>>
>>>>>>>>>>> Updated Webrev: http://cr.openjdk.java.net/~zg
>>>>>>>>>>> u/8191369/webrev.01/
>>>>>>>>>>>
>>>>>>>>>>> Test:
>>>>>>>>>>>
>>>>>>>>>>>    Reran hotspot_tier1 tests on Windows x64 and Linux x64
>>>>>>>>>>> (fastdebug and
>>>>>>>>>>> release)
>>>>>>>>>>>
>>>>>>>>>> I have no idea about the validity of the Windows fix but the
>>>>>>>>>> Linux one
>>>>>>>>>> looks good modulo that if condition.
>>>>>>>>>>
>>>>>>>>>> regards,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Andrew Dinn
>>>>>>>>>> -----------
>>>>>>>>>> Senior Principal Software Engineer
>>>>>>>>>> Red Hat UK Ltd
>>>>>>>>>> Registered in England and Wales under Company Registration No.
>>>>>>>>>> 03798903
>>>>>>>>>> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric
>>>>>>>>>> Shander
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>
>


More information about the hotspot-runtime-dev mailing list