Review request (hs24): 8007074: SIGSEGV at ParMarkBitMap::verify_clear()
Stefan Karlsson
stefan.karlsson at oracle.com
Tue Jul 2 09:57:00 PDT 2013
http://cr.openjdk.java.net/~stefank/8007074/webrev.00/
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8007074
The default way of using Large Pages in HotSpot on Linux (UseHugeTLBFS)
is broken. This is causing a number of crashes in different subsystems
of the JVM.
Bug Description
===============
The main reason for this bug is that mmap(addr, size, ...
MAP_FIXED|MAP_HUGETLB ...) will remove the previous mapping at [addr,
addr+size) when we run out of large pages on Linux.
This affects different parts of the JVM, but the most obvious is the
allocation of the Java heap:
When the JVM starts it reserves a memory area for the entire Java heap.
We use mmap(...MAP_NORESERVE...) to reserve a contiguous chunk of memory
that no other subsystem of the JVM, or Java program, will be allowed to
mmap into.
The reservation of the memory only reflects the maximum possible heap
size, but often a smaller heap size is used if the memory pressure is
low. The part of the heap that is actually used is committed with
mmap(...MAP_FIXED...).
When the heap is growing we commit a consecutive chunk of memory after the
previously committed memory. We rely on the fact that no other thread
will mmap into the reserved memory area for the Java heap.
The actual committing of the memory is done by first trying to allocate
large pages with mmap(...MAP_FIXED|MAP_HUGETLB...), and if that fails we
call mmap with the same parameters but without the large pages flag
(MAP_HUGETLB).
Just after we have failed to mmap large pages and before the small pages
have been mmapped, there's an unmapped memory region in the middle of
the Java heap that other threads might mmap into. When that happens
we get memory trashing and crashes.
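The commit sequence described above can be sketched roughly as follows.
This is a minimal illustration in C, not the HotSpot code: the function
names reserve_heap and commit_heap are made up for this example, and it
assumes Linux with anonymous private mappings.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space for the maximum heap size. Nothing is
 * accessible yet; the range is only kept away from other mmap callers.
 * (Hypothetical helper name, not taken from HotSpot.) */
static void *reserve_heap(size_t max_size) {
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Commit [addr, addr+size) inside the reservation, large pages first.
 * Returns 1 if large pages were used, 0 for small pages, -1 on failure.
 * (Hypothetical helper name, not taken from HotSpot.) */
static int commit_heap(void *addr, size_t size) {
    const int prot  = PROT_READ | PROT_WRITE;
    const int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED;

    if (mmap(addr, size, prot, flags | MAP_HUGETLB, -1, 0) != MAP_FAILED) {
        return 1;
    }

    /* Race window: as described above, when the kernel runs out of
     * large pages the failed call has already removed the old mapping.
     * Until the next mmap returns, [addr, addr+size) is unmapped and
     * another thread's mmap(NULL, ...) may be placed inside it. */
    if (mmap(addr, size, prot, flags, -1, 0) != MAP_FAILED) {
        return 0;
    }
    return -1;
}
```

Note that the fallback call uses MAP_FIXED as well, so if another thread
did get a mapping inside the window, the fallback silently replaces it.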
Large Pages in HotSpot - on Linux
=================================
Currently, before the bug fix, HotSpot supports three ways of allocating
large pages on Linux.
1) -XX:+UseSHM - Commits the large pages upfront when the memory is
reserved.
2) -XX:+UseHugeTLBFS - This is the broken implementation. It's also the
default way large pages are allocated. If the OS is correctly
configured, we get this kind of large pages for three different reasons:
2.1) The user has not specified any large pages flags
2.2) The user has specified -XX:+UseLargePages
2.3) The user has specified -XX:+UseHugeTLBFS
3) Transparent Huge Pages - is supported on recent Linux Kernels. The
user can choose to configure the OS to:
3.1) completely handle the allocation of large pages, or
3.2) let the JVM advise where it would be good to allocate large pages.
Code for this exists today, but it is guarded by the (2)
-XX:+UseHugeTLBFS flag.
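For readers unfamiliar with (3.2), the advice is given with
madvise(..., MADV_HUGEPAGE) on a normal small-page mapping. Below is a
minimal sketch in C, assuming Linux; the function name
map_with_thp_advice is made up for this example and is not a HotSpot
function.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map 'size' bytes of ordinary anonymous memory and advise the kernel
 * that the range is a good candidate for transparent huge pages.
 * Sets *advised to 1 if the kernel accepted the advice, else 0.
 * Returns the mapping, or NULL if mmap itself failed.
 * (Hypothetical helper name, not taken from HotSpot.) */
static void *map_with_thp_advice(size_t size, int *advised) {
    *advised = 0;
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    /* Only a hint: if the kernel lacks transparent huge page support
     * the call fails, but the mapping stays valid on small pages. */
    if (madvise(p, size, MADV_HUGEPAGE) == 0) {
        *advised = 1;
    }
#endif
    return p;
}
```

Since the advice never replaces the mapping, this path has no window in
which the range is unmapped.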
The Proposed Patch
==================
4) Create a new flag -XX:+UseTransparentHugePages, and move the
transparent huge pages advice in (3.2) out of the (2)
-XX:+UseHugeTLBFS code.
5) Make -XX:+UseTransparentHugePages the default way to allocate large
pages if the OS supports them. It will be the only kind of large pages
we'll use if the user has not specified any large pages flags.
6) Change the order of how we choose the kind of large pages when
-XX:+UseLargePages has been specified. It used to be UseHugeTLBFS then
UseSHM; now it's UseTransparentHugePages, then UseHugeTLBFS, then UseSHM.
7) Implement a workaround fix for the (2) -XX:+UseHugeTLBFS
implementation. With the fix the large pages are committed upfront when
they are reserved. It's mostly the same way we do it for the older (1)
-XX:+UseSHM large pages. This change will fix the bug, but has a couple
of drawbacks:
7.1) We have to allocate the entire large pages memory area when it is
reserved instead of when parts of it are committed.
7.2) We can't dynamically shrink or grow the used memory in the large
pages areas.
If these restrictions are not suitable for the user, then (3)
-XX:+UseTransparentHugePages could be used instead.
8) Ignore -XX:LargePageSizeInBytes on Linux since the OS doesn't support
multiple large page sizes and both the old code and the new code are
broken if the user is allowed to set it to some value other than the
OS-chosen value. Warn if the user specifies a value different from the
OS default value.
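The selection order in (5) and (6) can be sketched as below. This is an
illustration only: the types and the function select_large_page_kind are
made up for this example, the OS support bits stand in for whatever
probing the VM does, and the handling of an explicitly requested but
unsupported kind (fall back to no large pages) is an assumption that the
description above does not spell out.

```c
#include <stdbool.h>

typedef enum { LP_NONE, LP_TRANSPARENT, LP_HUGETLBFS, LP_SHM } lp_kind;

/* Flags the user put on the command line (hypothetical struct). */
typedef struct {
    bool use_large_pages;  /* -XX:+UseLargePages */
    bool use_thp;          /* -XX:+UseTransparentHugePages */
    bool use_hugetlbfs;    /* -XX:+UseHugeTLBFS */
    bool use_shm;          /* -XX:+UseSHM */
} lp_flags;

/* What the OS is able to provide (hypothetical struct). */
typedef struct {
    bool thp;
    bool hugetlbfs;
    bool shm;
} lp_support;

static lp_kind select_large_page_kind(lp_flags f, lp_support os) {
    bool any_specific = f.use_thp || f.use_hugetlbfs || f.use_shm;

    /* (5) No large pages flags at all: transparent huge pages only. */
    if (!f.use_large_pages && !any_specific) {
        return os.thp ? LP_TRANSPARENT : LP_NONE;
    }

    /* (6) Only -XX:+UseLargePages: try every kind in the new order.
     * With a specific flag, only the requested kinds are considered
     * (assumption, see the text above). */
    bool try_all = !any_specific;
    if ((try_all || f.use_thp) && os.thp)             return LP_TRANSPARENT;
    if ((try_all || f.use_hugetlbfs) && os.hugetlbfs) return LP_HUGETLBFS;
    if ((try_all || f.use_shm) && os.shm)             return LP_SHM;
    return LP_NONE;
}
```

The sketch does not model explicitly disabled flags such as
-XX:-UseLargePages.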
Testing
=======
New unit tests have been added. These can be run in a non-product build
with:
java -XX:+ExecuteInternalVMTests -XX:+VerboseInternalVMTests <large pages flags> -version
unit tests: with and without large pages on Linux, Windows, Solaris,
x86, x64, sparcv9.
jprt: default
jprt: -XX:+UseLargePages
jprt: -XX:+UseLargePages -XX:-UseCompressedOops
vm.quick.testlist, vm.pcl.testlist, vm.gc.testlist: multiple platforms,
with large pages on all major GCs with and without compressed oops.
SPECjbb2005 performance runs: on Linux x64 with -XX:+UseHugeTLBFS before
and after the patch.
Kitchensink: 3 days on Linux x64
thanks,
StefanK