RFR (preliminary): 8202772: "NMT erroneously assumes thread stack boundaries to be page aligned"

Thomas Stüfe thomas.stuefe at gmail.com
Thu Jun 7 14:16:22 UTC 2018

On Thu, Jun 7, 2018 at 2:42 PM, Zhengyu Gu <zgu at redhat.com> wrote:
> Hi Thomas,
> On 06/07/2018 06:49 AM, Thomas Stüfe wrote:
>> Hi all,
>> I could use some input/advice for:
>> https://bugs.openjdk.java.net/browse/JDK-8202772
>> This is my - yet incomplete - fix attempt:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8202772-NMT-erroneously-assumes-thread-stack-boundaries-to-be-page-aligned/current_work/webrev/
>> ---------------
>> Problem:
>> NMT assumes thread stack boundaries to be page aligned. This is the
>> case on most OSes, but does not necessarily have to be. POSIX
>> certainly does not guarantee any alignment for pthread stack
>> boundaries. Implementors of pthread libraries are free to provide
>> stack memory as they see fit. Since some form of commit management of
>> thread stacks makes sense, and that has to be page size based, usually
>> thread stack boundaries happen to be on page borders, but this is not
>> a requirement.
>> On AIX, stack boundaries (which we get reported by the pthread
>> library) are not aligned to page size. For the stack end, this does
>> not matter: Thread->current_stack_{base|size} in the VM is, after all,
>> only our own notion of the real thread stack size. We can move up that
>> imaginary border in our head to the next larger page boundary with
>> impunity - since this only affects the part of the thread stack not
>> yet used. We just loose a bit of thread stack range.
>> In fact, on AIX, we do just that - align up the end of the stack to
>> the next page boundary, to be able place VM guard pages.
> NMT virtual memory tracking is under assumption that the memory address and
> range are page aligned. I guess we made wrong assumption about thread stack
> from very beginning, which does not seem to belong to this category now.
>> However, wrt the thread stack base the matter is different. This part
>> of the stack is already in use by the time we initialize the VM. So,
>> we cannot just move our notion of the stack base up or down as we
>> please (well, maybe we could, but we do not want to). That means that
>> on AIX, thread stack base can be located in the middle of a page.
>> Now, NMT assumes stack base to be page aligned. If not, it will assert
>> or crash when printing the NMT report.
> Will this work for you? if we round down base address and size to page
> aligned in MemTracker::record/release_thread_stack() if they are not page
> aligned. Of course, it will lost some accuracy.

So I tried that. Seems to work fine but I ran into some follow up issues:

On Linux I played around with NMT and was surprised to find that the
stack region in use - above the current stack pointer and below stack
base - does not necessarily have to be committed: it seems the second
page in a thread stack is reported by mincore() as being not
residential. This is on linux x64 (ubuntu 16.4, virtualized in
vmware). Not really sure why without looking at the pthread lib source
code - maybe that belongs not really to the thread stack but some
other thread local space misreported by the pthread library to be part
of the thread stack?

And then, I think I spot the following issues in os::committed_in_range:


3129     // During shutdown, some memory goes away without properly
notifying NMT,
3130     // E.g. ConcurrentGCThread/WatcherThread can exit without
deleting thread object.
3131     // Bailout and return as not committed for now.
3132     if (mincore_return_value == -1 && errno == ENOMEM) {
3133       return false;
3134     }

here, if the reserved region spans multiple stripes, we would cancel
the mincore loop the moment we encounter a completely empty
(non-residental) page stripe? Is this intented? I think that would
lead to partly committed regions to be reported errornously as not
having any committed parts.

3137     // Process this stripe
3138     for (int vecIdx = 0; vecIdx < pages_to_query; vecIdx ++) {
3139       if ((vec[vecIdx] & 0x01) == 0) { // not committed
3140         // End of current contiguous region
3141         if (committed_start != NULL) {
3142           break;
3143         }
3144       } else { // committed

Here, at line 3142, to my understanding we have found a complete
contiguous range of residential pages. That we should report, right?
So we should return.

However we dont, since the break at 3142 will just break out of the
inner for loop at 3138, not the outer mincore-stripe-loop at 3121.
Which means for multiple stripes, we would now continue with the next
page stripe and loose the information about the contiguous memory
range encountered.

I might be wrong... what do you think?

> Thanks,
> -Zhengyu
>> My first attempt at fixing this (see above webrev) was to feed NMT a
>> corrected version of the thread stack size - just the page-aligned
>> inner portion of the stack - that way we loose a bit fidelity in NMT
>> thread stack accounting, but at least we do not crash. That makes
>> runtime errors go away, but there is a gtest which stubbornly refuses
>> to heal.
>> See CommittedVirtualMemoryTest
>> (test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp): I admit
>> I do not fully understand this test. It seems to record the current
>> threads stack base and size - ok - and then query the virtual regions
>> as perceived by NMT, expecting that the stack top is at the end of a
>> committed region. But even without the matter of unaligned stack base,
>> could it not be that virtual regions in NMT are fused, e.g. if
>> multiple thread stacks are placed next to each other? So, I am not
>> sure the test if correct.
>> Would be nice if someone with more NMT knowledge could comment.
>> --
>> Please note: Since I do most of my development on Linux, I modified
>> the stack base in the preliminary patch a bit to emulate the same
>> error on Linux I get on AIX. Because AIX is a terrible platform to
>> debug on :)
>> Note that the VM usually is fine with unaligned stack bases - NMT is
>> the only part I know of which has problems with that.
>> --
>> Thanks a lot,
>> Best Regards, Thomas

More information about the hotspot-runtime-dev mailing list