RFR(XS) 8191369: NMT: Enhance thread stack tracking
Zhengyu Gu
zgu at redhat.com
Mon Mar 5 17:13:49 UTC 2018
Hi Thomas,
On 03/05/2018 11:21 AM, Thomas Stüfe wrote:
> Hi Zhengyu,
>
> about the os_windows.cpp part:
>
> Are you sure this works as intended? I was not able to test it yet. It
> may work but makes some assumptions: that the stack top you are
> calculating is actually the end of a heap region; that you can just pass
> any address as the first parameter to VirtualQuery (the documentation is
> a bit ambiguous here); even that all committed pages must be
> consecutive, which does not have to be the case (we had cases on HyperV
> where there were holes in the committed thread stack area).
>
> I probably would have reused the coding in os::current_stack_base(),
> because we know that works:
>
> - call VirtualQuery with an address on the stack to get
> MEMORY_BASIC_INFORMATION::AllocationBase. We could just use stack bottom
> here as input.
> - then walk all regions upward and sum up the sizes of the committed
> ones, until we hit stack top - which supposedly should coincide with the
> end of a heap region but who knows.
The stack range we have here, is obtained from os::current_stack_base()
and os::current_stack_size() during thread initialization, exactly the
way as you mentioned above, I don't see why we can not trust it here.
I don't have any experiment on HyperV. If I recall correctly, on
Windows, the stack has to be touched page by page, if you skip a page in
the middle, you get SEGV. This may be different under Hypervisor? so we
should walk whole range, instead of stopping at the first uncommitted
region?
Thanks,
-Zhengyu
>
> Best Regards, Thomas
>
>
>
>
> On Fri, Feb 23, 2018 at 5:52 PM, <coleen.phillimore at oracle.com
> <mailto:coleen.phillimore at oracle.com>> wrote:
>
>
> This looks good but I hope someone else can review the os_linux and
> os_windows parts.
>
> I will sponsor this once the reviews are complete.
>
> thanks,
> Coleen
>
>
> On 2/21/18 10:49 AM, Zhengyu Gu wrote:
>
> Hi,
>
> I rebased the patch after JDK-8196405[1]. Stefan's patch fixed
> virtual memory tracking issues I tried to address in early patch.
>
>
> In additions, I removed os::pd_committed_stack_size() methods as
> Coleen suggested. Also added new gtest.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8191369
> <https://bugs.openjdk.java.net/browse/JDK-8191369>
> Webrev: http://cr.openjdk.java.net/~zgu/8191369/webrev.03/
> <http://cr.openjdk.java.net/~zgu/8191369/webrev.03/>
>
> Test:
> hotspot_runtime, test-hotspot-gtest on Linux 64 with
> fastdebug and release builds.
>
>
> [1] http://cr.openjdk.java.net/~stefank/8196405/webrev.02/
> <http://cr.openjdk.java.net/~stefank/8196405/webrev.02/>
>
> Thanks,
>
> -Zhengyu
>
>
> On 02/12/2018 09:33 AM, Zhengyu Gu wrote:
>
>
>
> No, what you have is better. I dislike the delegation
> to the duplicate function signature in the pd_* layer.
> Can you make the os::committed_stack_size() functions in
> the platforms where this isn't supported just be empy
> functions?
>
> Okay.
>
>
> If there are platforms that don't have the ability to
> track committed regions of the mtThread stack, don't you
> still need the concept of all_committed?
>
> No, it exists for convenience. However, stack tracking code
> does need adjustment.
>
> Thanks,
>
> -Zhengyu
>
>
>
> thanks,
> Coleen
>
>
> Thanks,
>
> -Zhengyu
>
>
>
>
>
> Thanks,
> Coleen
>
> On 1/11/18 4:05 PM, Zhengyu Gu wrote:
>
> Hi,
>
> Can I get second (R)eviewer for this? and I
> also need a sponsor.
>
> This is JDK11 bug.
>
> Webrev:
> http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
> <http://cr.openjdk.java.net/~zgu/8191369/webrev.02/>
> Bug: Bug:
> https://bugs.openjdk.java.net/browse/JDK-8191369
> <https://bugs.openjdk.java.net/browse/JDK-8191369>
>
> Thanks,
>
> -Zhengyu
>
>
>
>
> On 11/20/2017 01:46 PM, Zhengyu Gu wrote:
>
> Hi Andrew,
>
> Thanks for the review.
>
> I'm not sure about the logic of
> the test when you hit the the
> ENOMEM/uncommited case.
>
> + mincore_return_value =
> mincore(test_addr, page_sz, vec);
> +
> + if (mincore_return_value ==
> -1 || (committed_only && (vec[0] &
> 0x01) == 0)) {
> + // Page is not
> mapped/committed go up
> + // to find first
> mapped/committed page
> + if (errno != EAGAIN ||
> (committed_only && (vec[0] &
> 0x01) == 0)) {
> +
> assert(mincore_return_value !=
> -1 || errno == ENOMEM,
> "Unexpected mincore errno");
>
> Should that not be
>
> + if
> ((mincore_return_value != -1 &&
> errno != EAGAIN) ||
> (committed_only && (vec[0] &
> 0x01) == 0)) {
>
>
> Sorry, that correction should have been:
>
> + if ((mincore_return_value
> == -1 && errno != EAGAIN) ||
> (committed_only && (vec[0] &
> 0x01) == 0)) {
>
>
> Thanks for pointing out. Updated
> accordingly:
>
> Webrev:
> http://cr.openjdk.java.net/~zgu/8191369/webrev.02/
> <http://cr.openjdk.java.net/~zgu/8191369/webrev.02/>
>
> -Zhengyu
>
>
> A successful call to mincore is
> not guaranteed to reset errno and it
> might by chance already have
> value EAGAIN before the call to
> mincore. If
> we happened to hit a mapped,
> committed page first time then
> the vec bit
> will be 1 and the code will loop
> and retest the same address.
> Unlikely
> but possible?
>
>
> The adjustment to the edge case
> logic now looks correct.
>
> The test case for verifying
> fixes for above two issues
> can be found
> here:
> http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c
> <http://cr.openjdk.java.net/~zgu/8191369/test_mmap.c>
>
>
> Yes, that verifies the edge case
> handling e.g. I ran it with 256
> pages
> and with 127/8/9 mapped and it
> gave the right results.
>
> As mentioned in early email,
> new patch also contains
> implementation for
> Windows.
>
> Updated Webrev:
> http://cr.openjdk.java.net/~zgu/8191369/webrev.01/
> <http://cr.openjdk.java.net/~zgu/8191369/webrev.01/>
>
> Test:
>
> Reran hotspot_tier1
> tests on Windows x64 and
> Linux x64 (fastdebug and
> release)
>
> I have no idea about the
> validity of the Windows fix but
> the Linux one
> looks good modulo that if condition.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in England and Wales
> under Company Registration No.
> 03798903
> Directors: Michael Cunningham,
> Michael ("Mike") O'Neill, Eric
> Shander
>
>
>
>
>
>
More information about the hotspot-runtime-dev
mailing list