RFR(XS) 8191369: NMT: Enhance thread stack tracking
Andrew Dinn
adinn at redhat.com
Mon Nov 20 16:58:28 UTC 2017
Hi Zhengyu,
On 17/11/17 20:43, Zhengyu Gu wrote:
> There were two bugs that resulted NMT report incorrect thread stacks.
>
> 1) NMT did not handle overlapped regions correctly.
>
> 2) There are two issues with Linux: get_stack_commited_bottom()
>
> * The method name is misleading, it does not return *committed* bottom,
> but *mapped* bottom, where address space is valid, not necessary committed.
> I renamed this function to get_stack_mapped_bottom() with option to
> return committed bottom with proper parameter.
>
> * The binary search algorithm mishandled edge cases.
Hmm, yes looking through the mincore man page again you are right that
you need to check the returned bit vector to see if a vmem page is
committed to a physical page.
I'm not sure about the logic of the test when you hit the the
ENOMEM/uncommited case.
+ mincore_return_value = mincore(test_addr, page_sz, vec);
+
+ if (mincore_return_value == -1 || (committed_only && (vec[0] &
0x01) == 0)) {
+ // Page is not mapped/committed go up
+ // to find first mapped/committed page
+ if (errno != EAGAIN || (committed_only && (vec[0] & 0x01) == 0)) {
+ assert(mincore_return_value != -1 || errno == ENOMEM,
"Unexpected mincore errno");
Should that not be
+ if ((mincore_return_value != -1 && errno != EAGAIN) ||
(committed_only && (vec[0] & 0x01) == 0)) {
A successful call to mincore is not guaranteed to reset errno and it
might by chance already have value EAGAIN before the call to mincore. If
we happened to hit a mapped, committed page first time then the vec bit
will be 1 and the code will loop and retest the same address. Unlikely
but possible?
The adjustment to the edge case logic now looks correct.
> The test case for verifying fixes for above two issues can be found
> here: http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
Yes, that verifies the edge case handling e.g. I ran it with 256 pages
and with 127/8/9 mapped and it gave the right results.
> As mentioned in early email, new patch also contains implementation for
> Windows.
>
> Updated Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
>
> Test:
>
> Reran hotspot_tier1 tests on Windows x64 and Linux x64 (fastdebug and
> release)
I have no idea about the validity of the Windows fix but the Linux one
looks good modulo that if condition.
regards,
Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill, Eric Shander
More information about the hotspot-runtime-dev
mailing list