RFR(XS) 8191369: NMT: Enhance thread stack tracking

Andrew Dinn adinn at redhat.com
Mon Nov 20 17:08:38 UTC 2017


Oops, correcting my correction:

On 20/11/17 16:58, Andrew Dinn wrote:
> On 17/11/17 20:43, Zhengyu Gu wrote:
>> There were two bugs that resulted NMT report incorrect thread stacks.
>>
>> 1) NMT did not handle overlapped regions correctly.
>>
>> 2) There are two issues with Linux: get_stack_commited_bottom()
>>
>>  * The method name is misleading, it does not return *committed* bottom,
>> but *mapped* bottom, where address space is valid, not necessary committed.
>>    I renamed this function to get_stack_mapped_bottom() with option to
>> return committed bottom with proper parameter.
>>
>>  * The binary search algorithm mishandled edge cases.
> 
> Hmm, yes looking through the mincore man page again you are right that
> you need to check the returned bit vector to see if a vmem page is
> committed to a physical page.
> 
> I'm not sure about the logic of the test when you hit the the
> ENOMEM/uncommited case.
> 
> +    mincore_return_value = mincore(test_addr, page_sz, vec);
> +
> +    if (mincore_return_value == -1 || (committed_only && (vec[0] &
> 0x01) == 0)) {
> +      // Page is not mapped/committed go up
> +      // to find first mapped/committed page
> +      if (errno != EAGAIN || (committed_only && (vec[0] & 0x01) == 0)) {
> +        assert(mincore_return_value != -1 || errno == ENOMEM,
> "Unexpected mincore errno");
> 
> Should that not be
> 
>   +      if ((mincore_return_value != -1 && errno != EAGAIN) ||
> (committed_only && (vec[0] & 0x01) == 0)) {

Sorry, that correction should have been:

  +      if ((mincore_return_value == -1 && errno != EAGAIN) ||
  (committed_only && (vec[0] & 0x01) == 0)) {

> A successful call to mincore is not guaranteed to reset errno and it
> might by chance already have value EAGAIN before the call to mincore. If
> we happened to hit a mapped, committed page first time then the vec bit
> will be 1 and the code will loop and retest the same address. Unlikely
> but possible?
> 
> 
> The adjustment to the edge case logic now looks correct.
> 
>> The test case for verifying fixes for above two issues can be found
>> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
> 
> Yes, that verifies the edge case handling e.g. I ran it with 256 pages
> and with 127/8/9 mapped and it gave the right results.
> 
>> As mentioned in early email, new patch also contains implementation for
>> Windows.
>>
>> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>>
>> Test:
>>
>>   Reran hotspot_tier1 tests on Windows x64 and Linux x64 (fastdebug and
>> release)
> I have no idea about the validity of the Windows fix but the Linux one
> looks good modulo that if condition.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales under Company Registration No. 03798903
> Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
> 

-- 
regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander


More information about the hotspot-runtime-dev mailing list