RFR: 8324580: SIGFPE on THP initialization on kernels < 4.10 [v4]

Stefan Johansson sjohanss at openjdk.org
Thu Feb 8 11:07:00 UTC 2024


On Wed, 31 Jan 2024 16:42:16 GMT, Zdenek Zambersky <zzambers at openjdk.org> wrote:

>> **Problem:**
>> When THP is enabled, JDK reads `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size` file to [detect](https://github.com/openjdk/jdk/blob/96607df7f055a80d56ea4c19f3f4fcb32838b1f8/src/hotspot/os/linux/hugepages.cpp#L206) large page size. However, this file only [appeared](https://github.com/torvalds/linux/commit/49920d28781dcced10cd30cb9a938e7d045a1c94) in kernel 4.10 and does not exist on old kernels (such as 3.10 used by RHEL-7).
>> 
>> This results in a detected large page size of 0B and a crash when `-XX:+UseTransparentHugePages` is used on an old kernel.
>> 
>> gdb --args jdk-23+6/bin/java -XX:+UseTransparentHugePages -Xmx128m -Xlog:pagesize -version
>> ...
>> [0.005s][info][pagesize] Static hugepage support:
>> [0.005s][info][pagesize]   hugepage size: 2M
>> [0.005s][info][pagesize]   hugepage size: 1G
>> [0.005s][info][pagesize]   default hugepage size: 2M
>> [0.005s][info][pagesize] Transparent hugepage (THP) support:
>> [0.005s][info][pagesize]   THP mode: always
>> [0.005s][info][pagesize]   THP pagesize: 0B
>> [0.005s][info][pagesize] Shared memory transparent hugepage (THP) support:
>> [0.005s][info][pagesize]   Shared memory THP mode: unknown
>> [0.005s][info][pagesize] JVM will attempt to prevent THPs in thread stacks.
>> [0.005s][info][pagesize] UseLargePages=1, UseTransparentHugePages=1
>> [0.005s][info][pagesize] Large page support enabled. Usable page sizes: 4k. Default large page size: 0B.
>> 
>> Program received signal SIGFPE, Arithmetic exception.
>> [Switching to Thread 0x7ffff7fc2700 (LWP 31385)]
>> 0x00007ffff68eeb20 in lcm(unsigned long, unsigned long) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> (gdb) bt
>> #0  0x00007ffff68eeb20 in lcm(unsigned long, unsigned long) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #1  0x00007ffff64a68f2 in Arguments::apply_ergo() () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #2  0x00007ffff700458f in Threads::create_vm(JavaVMInitArgs*, bool*) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #3  0x00007ffff6a2373f in JNI_CreateJavaVM () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #4  0x00007ffff7fe704b in InitializeJVM (ifn=<synthetic pointer>, penv=0x7ffff7fc1ea8, pvm=0x7ffff7fc1ea0) at src/java.base/share/native/libjli/java.c:1550
>> #5  JavaMain (_args=<optimized out>) at src/java.base/share/native/libjli/java.c:491
>> #6  0x00007ffff7feb1c9 in ThreadJavaMain (args=<optimized out>) at src/java.base/unix/native/libjli/java_md.c:650
>> #7  0x00007ffff7bc6ea5 in start_thread (arg...
>
> Zdenek Zambersky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Disable THP for too large page sizes

@zzambers, I'm OK with not moving the logic, to keep the class true to the system configuration. But I see one problem with using `os::large_page_size()` for `THPStackMitigation`. That mitigation is in play even if transparent huge pages are not explicitly enabled for the JVM (`-XX:+UseTransparentHugePages`), i.e. when the THP kernel mode is `always`. It is probably very unlikely, but a user could set `LargePageSizeInBytes=1g`, for example to use 1g pages for the heap. This would set the default large page size to 1g, and that is what `os::large_page_size()` would return. We should not use that page size for the stack mitigation, since the stack is still subject to the real THP page size.

For this reason it would have been good if `thp_pagesize()` returned a proper value, but again, I see the arguments for keeping it true to the configuration.

Since the mitigation hasn't worked on old kernels up until now (since the THP pagesize returned is 0), I guess that could be handled as a separate issue.
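
To make the `LargePageSizeInBytes=1g` concern concrete, here is a minimal, self-contained sketch. It is a simplified model and not HotSpot code: `stack_needs_thp_mitigation` and the constants are hypothetical, and it assumes the mitigation decides by checking whether the stack size is aligned to the page size it is given.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t K = 1024;
constexpr std::size_t M = K * K;
constexpr std::size_t G = K * M;

// Hypothetical model (assumption): a stack is treated as THP-eligible, and so
// in need of mitigation, when its size is a multiple of the given page size.
inline bool stack_needs_thp_mitigation(std::size_t stack_size, std::size_t page_size) {
  // The bit trick below is only valid for a non-zero power of two.
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  return (stack_size & (page_size - 1)) == 0;
}

// Assumed values for the scenario described above.
constexpr std::size_t real_thp_pagesize  = 2 * M;  // size the kernel actually uses for THP
constexpr std::size_t large_page_size_1g = 1 * G;  // os::large_page_size() with LargePageSizeInBytes=1g
```

Under this model, an 8M stack is flagged when checked against the real 2M THP page size but not when checked against 1g, so the mitigation would be skipped for a stack the kernel can still back with THPs.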

src/hotspot/os/linux/os_linux.cpp line 3889:

> 3887:             return;
> 3888:         }
> 3889:         _large_page_size = HugePages::default_static_hugepage_size();

Maybe extract this part to a helper, something like: `fallback_thp_pagesize()`
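
A rough sketch of what such a helper could look like. Everything here is illustrative: the signature, the `max_thp_pagesize` cap and the convention of returning 0 to mean "disable THP" are assumptions, not the PR's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper. Inputs are passed in explicitly so the sketch is
// self-contained; in HotSpot they would presumably come from HugePages.
//   thp_pagesize        - value read from hpage_pmd_size, 0 if the file is absent
//   default_static_size - default static hugepage size, 0 if unknown
//   max_thp_pagesize    - assumed upper bound for the fallback value
// Returns the page size to use for THP, or 0 if THP should be disabled.
inline std::size_t fallback_thp_pagesize(std::size_t thp_pagesize,
                                         std::size_t default_static_size,
                                         std::size_t max_thp_pagesize) {
  assert(max_thp_pagesize != 0);
  if (thp_pagesize != 0) {
    return thp_pagesize;         // kernel reports the size: use it as-is
  }
  if (default_static_size == 0 || default_static_size > max_thp_pagesize) {
    return 0;                    // no usable fallback: caller disables THP
  }
  return default_static_size;    // old kernel: fall back to the default hugepage size
}
```

Keeping the fallback in one place would let the caller reduce to a single "is the result 0?" check before assigning `_large_page_size`.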

-------------

PR Review: https://git.openjdk.org/jdk/pull/17545#pullrequestreview-1869815986
PR Review Comment: https://git.openjdk.org/jdk/pull/17545#discussion_r1482766220

