Large page use crashes the JVM on some Linux systems
B. Blaser
bsrbnd at gmail.com
Wed Apr 25 12:00:41 UTC 2018
[further private conversation summary]
On 24 April 2018 at 21:15, B. Blaser <bsrbnd at gmail.com> wrote:
> On 24 April 2018 at 11:47, Claes Redestad <claes.redestad at oracle.com> wrote:
>> Hi Bernard,
>>
>> On 2018-04-24 11:27, B. Blaser wrote:
>>> Hi Claes,
>>>
>>> Thanks for your feedback, I'll try to improve the fix as suggested.
>>
>> someone pointed out we already do a sanity check similar to the one you're
>> proposing..
>>
>> src/hotspot/os/linux/os_linux.cpp:
>>
>> bool os::Linux::hugetlbfs_sanity_check(bool warn, size_t page_size) {
>> [...]
>> }
>>
>> It seems it'll warn only if you explicitly use -XX:+UseHugeTLBFS.
>> -XX:+UseLargePages on Linux first attempts to use UseHugeTLBFS, then
>> falls back to -XX:+UseSHM.
>>
>> ... what errors do you see on your system when you run -version with
>> -XX:+UseLargePages, -XX:+UseHugeTLBFS and -XX:+UseSHM respectively?
>> Most systems aren't configured to use HugeTLBFS, so my guess is your
>> system actually has an issue with UseSHM...
>
> I'm aware of this sanity check. The problem is that on my system
> 'mmap()' always fails and then the JVM attempts to use SHM instead.
> I'll check my configuration more closely and reread the kernel VM doc:
>
> https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
>
> but, in short, both 'mmap()' and SHM can access large pages (2 MB
> on my computer), but they have to be enabled (for SHM too), which
> doesn't seem to be the case by default.
>
> So, to answer your questions:
> 1) -XX:+UseLargePages and -XX:+UseHugeTLBFS have the same effect as
> -XX:+UseSHM because 'mmap' nicely complains when trying to use huge TLB
> and then SHM is used instead.
> 2) unfortunately, SHM doesn't complain (no problem when calling
> 'shmget' or 'shmat') but the allocated memory isn't aligned with the
> large page size (2 MB), which crashes the JVM (SHM probably allocates
> memory using the default page size even when 2 MB pages are requested,
> which I have to verify).
>
> In conclusion, the current JVM behavior of trying to use SHM if
> 'mmap()' fails seems to be brittle.
>
> I think we have to check whether large pages are supported/enabled
> when starting the JVM.
> Probably checking '/proc/meminfo' - '/proc/filesystems' -
> '/proc/sys/vm/nr_hugepages' would be faster than calling 'mmap()'.
>
> I'll read the kernel doc again, but I think calling 'mmap()' is a
> robust "slow" way to see if large pages can be used, though I agree
> that it doesn't tell whether they are not *enabled* or not *supported*.
>
> What do you think we should do?
>
> Bernard
>
>> /Claes
>>
>>> Thanks,
>>> Bernard
---------------------------------------------------------
On 24 April 2018 at 21:39, Claes Redestad <claes.redestad at oracle.com> wrote:
> The root issue here could very well be that the SHM sanity test is
> insufficient. Adding the same test as we already do for TLBFS seems like the
> wrong approach.
>
> I'm not the most knowledgeable about SHM, though, in fact not knowledgeable
> at all, so let's try and get you subscribed to hotspot-dev and spark a
> discussion on the list.
>
> /Claes
In concrete terms (on my system):
$ grep "hugetlbfs" /proc/filesystems
nodev hugetlbfs
$ grep -e "HugePages_" -e "Hugepagesize" /proc/meminfo
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Which means that huge pages are supported but not configured.
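
The check above can be sketched as a small parser. This is only an illustration, not HotSpot code: the names meminfo_value, meminfo_value_from_text and huge_pages_configured are hypothetical, and the sketch deliberately ignores /proc/sys/vm/nr_overcommit_hugepages, which the kernel documentation describes as allowing surplus pages beyond the persistent pool.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of HotSpot): returns the numeric value
// of `key` (e.g. "HugePages_Total") from text in /proc/meminfo format,
// or -1 if the key is absent or its value is not a number.
long meminfo_value(std::istream& in, const std::string& key) {
  const std::string prefix = key + ":";
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      std::istringstream fields(line.substr(prefix.size()));
      long value = -1;
      if (fields >> value) return value;
      return -1;
    }
  }
  return -1;
}

// Convenience wrapper so the parser can be exercised on a plain string.
long meminfo_value_from_text(const std::string& text, const std::string& key) {
  std::istringstream in(text);
  return meminfo_value(in, key);
}

// True only when /proc/meminfo reports a non-empty huge page pool.
bool huge_pages_configured() {
  std::ifstream in("/proc/meminfo");
  return in && meminfo_value(in, "HugePages_Total") > 0;
}
```

With the values shown above, meminfo_value would return 0 for "HugePages_Total" and 2048 for "Hugepagesize", so huge_pages_configured() would report false on this system.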
$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseLargePages -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (g1PageBasedVirtualSpace.cpp:49), pid=2914, tid=2915
# guarantee(is_aligned(rs.base(), page_size)) failed: Reserved space base 0x00007f5c20b10000 is not aligned to requested page size 2097152
#
# JRE version: (11.0) (build )
# Java VM: OpenJDK 64-Bit Server VM (11-internal+0-adhoc.devel.jdk, mixed mode, aot, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: core.2914 (may not exist)
#
# An error report file with more information is saved as:
# /home/****/jdk/hs_err_pid2914.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseHugeTLBFS -version
OpenJDK 64-Bit Server VM warning: HugeTLBFS is not supported by the operating system.
openjdk version "11-internal" 2018-09-25
OpenJDK Runtime Environment (build 11-internal+0-adhoc.devel.jdk)
OpenJDK 64-Bit Server VM (build 11-internal+0-adhoc.devel.jdk, mixed mode)
$ ./build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseSHM -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (g1PageBasedVirtualSpace.cpp:49), pid=2974, tid=2975
# guarantee(is_aligned(rs.base(), page_size)) failed: Reserved space base 0x00007f8a06890000 is not aligned to requested page size 2097152
#
# JRE version: (11.0) (build )
# Java VM: OpenJDK 64-Bit Server VM (11-internal+0-adhoc.devel.jdk, mixed mode, aot, tiered, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: core.2974 (may not exist)
#
# An error report file with more information is saved as:
# /home/****/jdk/hs_err_pid2974.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Aborted (core dumped)
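
To make the failed guarantee concrete, the arithmetic can be replayed on the two base addresses from the crash reports above. This is a minimal sketch: is_aligned_to is a stand-in written for illustration (not HotSpot's is_aligned), and the 4096-byte default page size is an assumption about this x86_64 system.

```cpp
#include <cstdint>

// Illustrative stand-in for the alignment test in the guarantee:
// an address is aligned when it is an exact multiple of page_size.
// page_size must be a power of two for the mask trick to be valid.
constexpr bool is_aligned_to(std::uint64_t addr, std::uint64_t page_size) {
  return (addr & (page_size - 1)) == 0;
}

constexpr std::uint64_t kLargePage = 2097152;  // 2 MB, from the error message
constexpr std::uint64_t kSmallPage = 4096;     // assumed default page size
constexpr std::uint64_t kBase1 = 0x00007f5c20b10000ULL;  // -XX:+UseLargePages run
constexpr std::uint64_t kBase2 = 0x00007f8a06890000ULL;  // -XX:+UseSHM run

// Neither base is a multiple of 2 MB...
static_assert(!is_aligned_to(kBase1, kLargePage), "base 1 is misaligned");
static_assert(!is_aligned_to(kBase2, kLargePage), "base 2 is misaligned");
static_assert(kBase1 % kLargePage == 0x110000, "offset into the 2 MB page");
static_assert(kBase2 % kLargePage == 0x90000, "offset into the 2 MB page");

// ...but both are multiples of the assumed 4 KB default page size.
static_assert(is_aligned_to(kBase1, kSmallPage), "base 1 is 4 KB aligned");
static_assert(is_aligned_to(kBase2, kSmallPage), "base 2 is 4 KB aligned");
```

Both addresses are small-page aligned but not 2 MB aligned, which is consistent with (though not proof of) the guess that the memory was handed out at the default page granularity.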
So, I guess the least the JVM should do is unconditionally disable
large page use at startup if '/proc/meminfo' reports
'HugePages_Total: 0'.
But I'll also investigate what can be done to improve the SHM sanity check.
Or maybe someone on hotspot-dev has another idea?
Bernard
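
One possible shape for a stronger SHM check, offered only as a sketch for discussion: attach a one-page SHM_HUGETLB segment and verify the alignment of the returned address before trusting it. The names ShmProbe and probe_shm_huge are hypothetical, this is not the existing HotSpot code, and whether such a probe reproduces the misalignment seen above has not been verified.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000  // Linux value, in case the libc headers hide it
#endif

// Hypothetical result type for this sketch.
enum class ShmProbe { kAligned, kMisaligned, kShmgetFailed, kShmatFailed };

// Requests one huge page via SysV shared memory and reports whether the
// attached address is aligned to page_size (which must be a power of two).
ShmProbe probe_shm_huge(std::size_t page_size) {
  int id = shmget(IPC_PRIVATE, page_size, SHM_HUGETLB | IPC_CREAT | 0600);
  if (id == -1) return ShmProbe::kShmgetFailed;

  void* p = shmat(id, nullptr, 0);
  // Mark the segment for removal now; it is destroyed after the last detach
  // (or immediately, if the attach failed).
  shmctl(id, IPC_RMID, nullptr);
  if (p == reinterpret_cast<void*>(-1)) return ShmProbe::kShmatFailed;

  bool aligned =
      (reinterpret_cast<std::uintptr_t>(p) & (page_size - 1)) == 0;
  shmdt(p);
  return aligned ? ShmProbe::kAligned : ShmProbe::kMisaligned;
}
```

A caller could treat anything other than kAligned as "do not use SHM large pages" and fall back to the default page size instead of hitting the guarantee later.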
---------------------------------------------------------
On 23 April 2018 at 11:18, Claes Redestad <claes.redestad at oracle.com> wrote:
> [ /bcc amber-dev, /cc hotspot-dev ]
>
> Hi,
>
> unconditionally mapping and unmapping a large page on startup seems
> sub-optimal to me - could this be checked directly after
> -XX:+UseLargePages flag has been parsed?
>
> I'd also note that explicitly configured large pages are typically a limited
> resource: does this test distinguish between a failure due to the system not
> supporting the feature and a failure due to not having any free pages left?
> Printing a "UseLargePages is unsupported" message in the latter case
> would be misleading.
>
> I wonder if checking something like /proc/meminfo for HugePages_* is a
> more robust way to probe capabilities, and also whether this is more
> suited as a test harness feature, i.e., enhance jtreg and tag these tests
> so that they're ignored on systems that don't have any/enough huge
> pages.
>
> Thanks!
>
> /Claes
>
>
> On 2018-04-22 23:18, B. Blaser wrote:
>>
>> [ I'm having trouble subscribing to hotspot-dev, please forward if necessary. ]
>>
>> Hi,
>>
>> After a clean build, some hotspot tests related to large page use are
>> failing on my 64-bit Linux system, for example:
>>
>> gc/g1/TestLargePageUseForAuxMemory.java
>> [...]
>>
>> Or simply:
>>
>> $ ./build/linux-x86_64-normal-server-release/images/jdk/bin/java
>> -XX:+UseLargePages -version
>>
>> is crashing the JVM because the latter assumes that large pages are
>> always supported on Linux, which appears to be wrong.
>>
>> I suggest making sure that large pages are supported when parsing the
>> arguments, as below.
>>
>> Does this look reasonable (tier1 looks better now)?
>>
>> Thanks,
>> Bernard
>>
>> diff -r 8c85a1855e10 src/hotspot/share/runtime/arguments.cpp
>> --- a/src/hotspot/share/runtime/arguments.cpp Fri Apr 13 11:14:49 2018
>> -0700
>> +++ b/src/hotspot/share/runtime/arguments.cpp Sun Apr 22 20:29:21 2018
>> +0200
>> @@ -60,6 +60,7 @@
>> #include "utilities/defaultStream.hpp"
>> #include "utilities/macros.hpp"
>> #include "utilities/stringUtils.hpp"
>> +#include "sys/mman.h"
>> #if INCLUDE_JVMCI
>> #include "jvmci/jvmciRuntime.hpp"
>> #endif
>> @@ -4107,6 +4108,18 @@
>> UNSUPPORTED_OPTION(UseLargePages);
>> #endif
>>
>> +#ifdef LINUX
>> + void *p = mmap(NULL, os::large_page_size(), PROT_READ|PROT_WRITE,
>> + MAP_ANONYMOUS|MAP_PRIVATE|MAP_HUGETLB,
>> + -1, 0);
>> + if (p != MAP_FAILED) {
>> + munmap(p, os::large_page_size());
>> + }
>> + else {
>> + UNSUPPORTED_OPTION(UseLargePages);
>> + }
>> +#endif
>> +
>> ArgumentsExt::report_unsupported_options();
>>
>> #ifndef PRODUCT
>> diff -r 8c85a1855e10
>> test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java
>> ---
>> a/test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java
>> Fri Apr 13 11:14:49 2018 -0700
>> +++
>> b/test/hotspot/jtreg/runtime/memory/LargePages/TestLargePagesFlags.java
>> Sun Apr 22 20:29:21 2018 +0200
>> @@ -37,7 +37,7 @@
>> public class TestLargePagesFlags {
>>
>> public static void main(String [] args) throws Exception {
>> - if (!Platform.isLinux()) {
>> + if (!Platform.isLinux() || !canUse(UseLargePages(true))) {
>> System.out.println("Skipping. TestLargePagesFlags has only been
>> implemented for Linux.");
>> return;
>> }
>
>
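
On the question of telling "unsupported" apart from "no free pages": the mmap() probe from the patch above could also hand back errno so that the caller can decide which message to print. This is a minimal sketch with hypothetical names (HugeMmapProbe, probe_huge_mmap); which errno value corresponds to which cause is not established in this thread and would need to be checked against the kernel documentation.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical result type for this sketch.
struct HugeMmapProbe {
  bool ok;    // true if one huge page could be mapped
  int error;  // errno from mmap() when ok is false, otherwise 0
};

// Maps and immediately unmaps one huge page, as the patch does, but
// returns the failure reason instead of discarding it.
HugeMmapProbe probe_huge_mmap(std::size_t large_page_size) {
  void* p = mmap(nullptr, large_page_size, PROT_READ | PROT_WRITE,
                 MAP_ANONYMOUS | MAP_PRIVATE | MAP_HUGETLB, -1, 0);
  if (p == MAP_FAILED) return {false, errno};
  munmap(p, large_page_size);
  return {true, 0};
}
```

The caller could then print the "unsupported" warning only for the error codes that really indicate a missing feature, and a different message when the pool is merely exhausted.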