RFR (preliminary): 8202772: "NMT erroneously assumes thread stack boundaries to be page aligned"

Zhengyu Gu zgu at redhat.com
Thu Jun 7 12:42:23 UTC 2018


Hi Thomas,

On 06/07/2018 06:49 AM, Thomas Stüfe wrote:
> Hi all,
> 
> I could use some input/advice for:
> 
> https://bugs.openjdk.java.net/browse/JDK-8202772
> 
> This is my - yet incomplete - fix attempt:
> 
> http://cr.openjdk.java.net/~stuefe/webrevs/8202772-NMT-erroneously-assumes-thread-stack-boundaries-to-be-page-aligned/current_work/webrev/
> 
> ---------------
> 
> Problem:
> 
> NMT assumes thread stack boundaries to be page aligned. This is the
> case on most OSes, but does not necessarily have to be. POSIX
> certainly does not guarantee any alignment for pthread stack
> boundaries. Implementors of pthread libraries are free to provide
> stack memory as they see fit. Since some form of commit management of
> thread stacks makes sense, and that has to be page size based, usually
> thread stack boundaries happen to be on page borders, but this is not
> a requirement.
> 
> On AIX, stack boundaries (which we get reported by the pthread
> library) are not aligned to page size. For the stack end, this does
> not matter: Thread->current_stack_{base|size} in the VM is, after all,
> only our own notion of the real thread stack size. We can move up that
> imaginary border in our head to the next larger page boundary with
> impunity - since this only affects the part of the thread stack not
> yet used. We just lose a bit of thread stack range.
> 
> In fact, on AIX, we do just that - align up the end of the stack to
> the next page boundary, to be able to place VM guard pages.

NMT virtual memory tracking assumes that memory addresses and ranges are
page aligned. I guess we made the wrong assumption about thread stacks
from the very beginning; they do not seem to belong in this category.

> 
> However, wrt the thread stack base the matter is different. This part
> of the stack is already in use by the time we initialize the VM. So,
> we cannot just move our notion of the stack base up or down as we
> please (well, maybe we could, but we do not want to). That means that
> on AIX, thread stack base can be located in the middle of a page.
> 
> Now, NMT assumes stack base to be page aligned. If not, it will assert
> or crash when printing the NMT report.

Would this work for you: round the base address and size down to page
alignment in MemTracker::record/release_thread_stack() whenever they are
not page aligned? Of course, this will lose some accuracy.
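
A minimal sketch of that kind of rounding, assuming the stack is described
as a range [low, low + size) and that NMT should only see the page-aligned
inner portion of it. This is one possible reading of the suggestion, not the
actual HotSpot change: AlignedRange, align_up, align_down and
page_aligned_inner are hypothetical names, and the page size is passed in
rather than queried from the OS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type (not part of HotSpot).
struct AlignedRange {
  std::uintptr_t start;  // page-aligned low address
  std::size_t    size;   // multiple of page_size; 0 if no full page fits
};

// Round p down to a multiple of page_size (page_size must be a power of two).
inline std::uintptr_t align_down(std::uintptr_t p, std::size_t page_size) {
  return p & ~(static_cast<std::uintptr_t>(page_size) - 1);
}

// Round p up to a multiple of page_size (page_size must be a power of two).
inline std::uintptr_t align_up(std::uintptr_t p, std::size_t page_size) {
  return align_down(p + page_size - 1, page_size);
}

// Shrink [low, low + size) to the largest page-aligned range inside it.
inline AlignedRange page_aligned_inner(std::uintptr_t low, std::size_t size,
                                       std::size_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  const std::uintptr_t lo = align_up(low, page_size);
  const std::uintptr_t hi = align_down(low + size, page_size);
  if (hi <= lo) {
    return {lo, 0};  // the range does not cover a single whole page
  }
  return {lo, static_cast<std::size_t>(hi - lo)};
}
```

The accuracy loss is bounded: at most one partial page is dropped at each
end of the stack, so the range reported to NMT is smaller than the real
stack by less than two pages.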


Thanks,

-Zhengyu

> 
> My first attempt at fixing this (see above webrev) was to feed NMT a
> corrected version of the thread stack size - just the page-aligned
> inner portion of the stack - that way we lose a bit of fidelity in NMT
> thread stack accounting, but at least we do not crash. That makes
> runtime errors go away, but there is a gtest which stubbornly refuses
> to heal.
> 
> See CommittedVirtualMemoryTest
> (test/hotspot/gtest/runtime/test_committed_virtualmemory.cpp): I admit
> I do not fully understand this test. It seems to record the current
> thread's stack base and size - ok - and then query the virtual regions
> as perceived by NMT, expecting that the stack top is at the end of a
> committed region. But even without the matter of unaligned stack base,
> could it not be that virtual regions in NMT are fused, e.g. if
> multiple thread stacks are placed next to each other? So, I am not
> sure the test is correct.
> 
> Would be nice if someone with more NMT knowledge could comment.
> 
> --
> 
> Please note: Since I do most of my development on Linux, I modified
> the stack base in the preliminary patch a bit to emulate the same
> error on Linux I get on AIX. Because AIX is a terrible platform to
> debug on :)
> 
> Note that the VM usually is fine with unaligned stack bases - NMT is
> the only part I know of which has problems with that.
> 
> --
> 
> Thanks a lot,
> 
> Best Regards, Thomas
> 

