Review request (hs24): 8007074: SIGSEGV at ParMarkBitMap::verify_clear()
Volker Simonis
volker.simonis at gmail.com
Tue Jul 16 10:21:58 PDT 2013
Hi Stefan,
this is a very interesting change! Although I haven’t had a chance to
dig through the whole change and test it, I want to make a few comments
beforehand:
I think this change will help to make 'UseHugeTLBFS' work on
Linux/PPC64. The problem with the current strategy is that on PPC64
we have different memory slices (256M slices below 4G, then 4G slices
below 1TB, and finally 1TB slices above that) and each slice supports
only one page size (I got this information from Tiago Stürmer
Daitx from the IBM Linux Technology Center:
http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/2013-April/000445.html).
So with the old strategy, the first mmap(MAP_NORESERVE) ends up in a
memory slice with small pages and the subsequent
mmap(MAP_FIXED,MAP_HUGETLB) will fail because the reserved memory is
already located in a slice with small pages only. So I think your
change, where the large pages are committed upfront, will solve this
problem (though I haven't tried it yet).
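
Just to make sure I understand the failing sequence correctly, here is
a minimal sketch of it in plain C (my own simplified stand-in, not the
actual HotSpot code; the sizes are arbitrary):

  #include <stdio.h>
  #include <sys/mman.h>

  int main(void) {
    size_t reserved  = 512 * 1024 * 1024;  /* maximum heap size */
    size_t committed =  64 * 1024 * 1024;  /* initially used part */

    /* Step 1: reserve the whole heap range, no backing store yet. */
    char* base = mmap(NULL, reserved, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED) { perror("reserve"); return 1; }

    /* Step 2: commit the first part with huge pages at the same address.
       On PPC64 I'd expect this to fail whenever the reservation from
       step 1 landed in a slice that only supports small pages. */
    if (mmap(base, committed, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_HUGETLB,
             -1, 0) == MAP_FAILED) {
      perror("commit with MAP_HUGETLB");
      return 1;
    }
    return 0;
  }
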
Currently, Linux/PPC64 doesn't support transparent huge pages, but
there seems to be a new implementation in the upcoming Linux Kernel
3.11 (see: http://lkml.indiana.edu/hypermail/linux/kernel/1307.0/01916.html).
I'm wondering how this can work with the 'memory slicing' mentioned
before, but that's another question which I'll try to clarify with the
IBM colleagues.
I've also collected all kinds of information regarding
LargePages/mmap/etc on Linux which I'd like to put in the HotSpot
Wiki. I'd also like to add the explanations from your mail. Would it
be OK with you if I created a new page (i.e. under
HotSpot->Runtime->LargePageSupport) where we could collect this
information?
How did you test transparent huge page support and did you compare it
with the old UseHugeTLBFS? I wonder how transparent huge page support
works in the real world, because madvise is, after all, only a hint.
The result depends on the kernel settings
(/sys/kernel/mm/transparent_hugepage/) as well as on how well
'khugepaged' works. If I read the transparent huge page
documentation (http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/transhuge.txt)
correctly, it seems to me that there are still a lot of tuning
parameters. I agree that the general usage is easier compared to
'UseHugeTLBFS', but on the other hand, once UseHugeTLBFS succeeds we
have the huge pages forever.
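
For reference, my mental model of the madvise-based approach is roughly
the following (a simplified sketch of my own, not your patch; whether
the range actually gets huge pages is then entirely up to the kernel):

  #include <stdio.h>
  #include <sys/mman.h>

  int main(void) {
    size_t size = 256 * 1024 * 1024;  /* arbitrary example size */

    /* Reserve/commit with ordinary small pages first. */
    char* addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    /* Only a hint: the kernel may back the range with huge pages now,
       later (via khugepaged), or never, depending on the settings under
       /sys/kernel/mm/transparent_hugepage/. */
    if (madvise(addr, size, MADV_HUGEPAGE) != 0)
      perror("madvise(MADV_HUGEPAGE)");
    return 0;
  }

One can at least watch the AnonHugePages counter in /proc/meminfo (or in
the process's smaps) to see whether the advice was actually honored.
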
And finally, are these changes really intended for hs24 as indicated in
the subject? If so, I don't understand your comment that "Unfortunately,
it's not likely that we'll get this into the hs24 release."
Thank you and best regards,
Volker
PS: By the way, where did you get the information that a failing
"mmap(addr, size, ... MAP_FIXED|MAP_HUGETLB ...)" will remove the
previous mapping? I couldn't find that anywhere. (I think this may
also be one of the reasons why we sometimes lose the guard page
protection for a thread. We thought we had fixed that problem ("7107135 :
Stack guard pages are no more protected after loading a shared library
with executable stack",
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7107135), but maybe
we only fixed one cause? What if a thread places its stack into the
area of the removed mapping? Afterwards, when the memory is mmapped
with small pages, the thread's guard pages would end up with read/write
permissions.)
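
If it helps, the behaviour should be easy to check with a small
experiment along these lines (an untested sketch of mine; it assumes
the huge page pool is empty, i.e. vm.nr_hugepages = 0, so that the
MAP_HUGETLB call is guaranteed to fail):

  #include <errno.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/mman.h>

  int main(void) {
    size_t size = 16 * 1024 * 1024;

    /* A normal small-page mapping, standing in for e.g. a thread stack. */
    char* addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) { perror("mmap"); return 1; }

    /* Try to replace it with huge pages; with an empty huge page pool
       this call is expected to fail. */
    if (mmap(addr, size, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_HUGETLB,
             -1, 0) == MAP_FAILED)
      perror("mmap(MAP_FIXED|MAP_HUGETLB)");

    /* mincore() fails with ENOMEM if any part of the range is unmapped,
       i.e. if the failed call above already destroyed the old mapping. */
    long page = sysconf(_SC_PAGESIZE);
    unsigned char* vec = malloc((size + page - 1) / page);
    if (mincore(addr, size, vec) != 0 && errno == ENOMEM)
      printf("original mapping is gone\n");
    else
      printf("original mapping still present\n");
    free(vec);
    return 0;
  }
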
On Tue, Jul 2, 2013 at 6:57 PM, Stefan Karlsson
<stefan.karlsson at oracle.com> wrote:
> http://cr.openjdk.java.net/~stefank/8007074/webrev.00/
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007074
>
> The default way of using Large Pages in HotSpot on Linux (UseHugeTLBFS) is
> broken. This is causing a number of crashes in different subsystems of the
> JVM.
>
>
> Bug Description
> ===============
>
> The main reason for this bug is that mmap(addr, size, ...
> MAP_FIXED|MAP_HUGETLB ...) will remove the previous mapping at [addr,
> addr+size) when we run out of large pages on Linux.
>
> This affects different parts of the JVM, but the most obvious is the
> allocation of the Java heap:
>
> When the JVM starts it reserves a memory area for the entire Java heap. We
> use mmap(...MAP_NORESERVE...) to reserve a contiguous chunk of memory that
> no other
> subsystem of the JVM, or Java program, will be allowed to mmap into.
>
> The reservation of the memory only reflects the maximum possible heap size,
> but often a smaller heap size is used if the memory pressure is low. The
> part of
> the heap that is actually used is committed with mmap(...MAP_FIXED...). When
> the heap is growing we commit a consecutive chunk of memory after the
> previously committed memory. We rely on the fact that no other thread will
> mmap into the reserved memory area for the Java heap.
>
> The actual committing of the memory is done by first trying to allocate
> large pages with mmap(...MAP_FIXED|MAP_HUGETLB...), and if that fails we
> call mmap with the same parameters but without the large pages flag
> (MAP_HUGETLB).
>
> Just after we have failed to mmap large pages and before the small pages
> have been mmapped, there's an unmapped memory region in the middle of the
> Java heap, which other threads might mmap into. When that happens, we get
> memory trashing and crashes.
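
(For reference, my own simplified C sketch of the sequence described
above; this is not the actual HotSpot code:)

  #include <stdio.h>
  #include <sys/mman.h>

  /* Try huge pages first, fall back to small pages. The window between
     the failed first call and the second call is where the heap range is
     temporarily unmapped and other threads can mmap into it. */
  static void* commit(char* addr, size_t size) {
    void* res = mmap(addr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_HUGETLB,
                     -1, 0);
    if (res != MAP_FAILED) return res;

    /* The failed MAP_FIXED|MAP_HUGETLB call has already removed the
       reservation at [addr, addr+size); this is the dangerous window. */
    return mmap(addr, size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  }

  int main(void) {
    size_t size = 64 * 1024 * 1024;
    char* heap = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (heap == MAP_FAILED) { perror("reserve"); return 1; }
    if (commit(heap, size) == MAP_FAILED) { perror("commit"); return 1; }
    return 0;
  }
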
>
>
> Large Pages in HotSpot - on Linux
> =================================
>
> Currently, before the bug fix, HotSpot supports three ways of allocating
> large pages on Linux.
> 1) -XX:+UseSHM - Commits the large pages upfront when the memory is
> reserved.
>
> 2) -XX:+UseHugeTLBFS - This is the broken implementation. It's also the
> default way large pages are allocated. If the OS is correctly configured, we
> get this kind of large pages for three different reasons:
> 2.1) The user has not specified any large pages flags
> 2.2) The user has specified -XX:+UseLargePages
> 2.3) The user has specified -XX:+UseHugeTLBFS
>
> 3) Transparent Huge Pages - is supported on recent Linux Kernels. The user
> can choose to configure the OS to:
> 3.1) completely handle the allocation of large pages, or
> 3.2) let the JVM advise where it would be good to allocate large pages.
> There exists code for this today; it is guarded by the (2)
> -XX:+UseHugeTLBFS flag.
>
>
> The Proposed Patch
> ==================
>
> 4) Create a new flag -XX:+UseTransparentHugePages, and move the transparent
> huge pages advise in (3.2) out from the (2) -XX:+UseHugeTLBFS code.
>
> 5) Make -XX:+UseTransparentHugePages the default way to allocate large pages
> if the OS supports them. It will be the only kind of large pages we'll use
> if the user has not specified any large pages flags.
>
> 6) Change the order of how we choose the kind of large pages when
> -XX:+UseLargePages has been specified. It used to be UseHugeTLBFS then
> UseSHM, now it's UseTransparentHugePages, then UseHugeTLBFS, then UseSHM.
>
> 7) Implement a workaround fix for the (2) -XX:+UseHugeTLBFS implementation.
> With the fix the large pages are committed upfront when they are reserved.
> It's mostly the same way we do it for the older (1) -XX:+UseSHM large pages.
> This change will fix the bug, but has a couple of drawbacks:
> 7.1) We have to allocate the entire large pages memory area when it is
> reserved instead of when parts of it are committed.
> 7.2) We can't dynamically shrink or grow the used memory in the large pages
> areas.
> If these restrictions are not suitable for the user, then (3)
> -XX:+UseTransparentHugePages could be used instead.
>
> 8) Ignore -XX:LargePageSizeInBytes on Linux since the OS doesn't support
> multiple large page sizes, and both the old code and the new code are broken
> if the user is allowed to set it to some other value than the OS-chosen value.
> Warn if the user specifies a value different than the OS default value.
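
(If I read (7) correctly, the workaround essentially boils down to
something like the following; again only my simplified sketch, not the
actual change:)

  #include <stdio.h>
  #include <sys/mman.h>

  int main(void) {
    size_t size = 256 * 1024 * 1024;  /* arbitrary example size */

    /* Commit the whole large-page area already at reservation time, so
       there is never an unmapped hole another thread could fall into.
       The price: the full area is allocated upfront (7.1) and cannot be
       grown or shrunk dynamically afterwards (7.2). */
    char* heap = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (heap == MAP_FAILED) { perror("mmap(MAP_HUGETLB)"); return 1; }
    return 0;
  }
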
>
>
> Testing
> =======
>
> New unit tests have been added. These can be run in a non-product build
> with:
> java -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests <large pages
> flags> -version
>
> unit tests: with and without large pages on Linux, Windows, Solaris, x86,
> x64, sparcv9.
> jprt: default
> jprt: -XX:+UseLargePages
> jprt: -XX:+UseLargePages -XX:-UseCompressedOops
> vm.quick.testlist, vm.pcl.testlist, vm.gc.testlist: multiple platforms, with
> large pages on all major GCs with and without compressed oops.
> SPECjbb2005 performance runs: on Linux x64 with -XX:+UseHugeTLBFS before and
> after the patch.
> Kitchensink: 3 days on Linux x64
>
>
> thanks,
> StefanK