RFR: 8324580: SIGFPE on THP initialization on kernels < 4.10 [v4]

Zdenek Zambersky zzambers at openjdk.org
Wed Jan 31 17:13:06 UTC 2024


On Wed, 31 Jan 2024 16:42:16 GMT, Zdenek Zambersky <zzambers at openjdk.org> wrote:

>> **Problem:**
>> When THP is enabled, the JDK reads the `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size` file to [detect](https://github.com/openjdk/jdk/blob/96607df7f055a80d56ea4c19f3f4fcb32838b1f8/src/hotspot/os/linux/hugepages.cpp#L206) the large page size. However, this file only [appeared](https://github.com/torvalds/linux/commit/49920d28781dcced10cd30cb9a938e7d045a1c94) in kernel 4.10 and does not exist on older kernels (such as 3.10 used by RHEL-7).
>> 
>> This results in a detected large page size of 0B and a crash when `-XX:+UseTransparentHugePages` is used on an old kernel.
>> 
>> gdb --args jdk-23+6/bin/java -XX:+UseTransparentHugePages -Xmx128m -Xlog:pagesize -version
>> ...
>> [0.005s][info][pagesize] Static hugepage support:
>> [0.005s][info][pagesize]   hugepage size: 2M
>> [0.005s][info][pagesize]   hugepage size: 1G
>> [0.005s][info][pagesize]   default hugepage size: 2M
>> [0.005s][info][pagesize] Transparent hugepage (THP) support:
>> [0.005s][info][pagesize]   THP mode: always
>> [0.005s][info][pagesize]   THP pagesize: 0B
>> [0.005s][info][pagesize] Shared memory transparent hugepage (THP) support:
>> [0.005s][info][pagesize]   Shared memory THP mode: unknown
>> [0.005s][info][pagesize] JVM will attempt to prevent THPs in thread stacks.
>> [0.005s][info][pagesize] UseLargePages=1, UseTransparentHugePages=1
>> [0.005s][info][pagesize] Large page support enabled. Usable page sizes: 4k. Default large page size: 0B.
>> 
>> Program received signal SIGFPE, Arithmetic exception.
>> [Switching to Thread 0x7ffff7fc2700 (LWP 31385)]
>> 0x00007ffff68eeb20 in lcm(unsigned long, unsigned long) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> (gdb) bt
>> #0  0x00007ffff68eeb20 in lcm(unsigned long, unsigned long) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #1  0x00007ffff64a68f2 in Arguments::apply_ergo() () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #2  0x00007ffff700458f in Threads::create_vm(JavaVMInitArgs*, bool*) () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #3  0x00007ffff6a2373f in JNI_CreateJavaVM () from /home/tester/jdk-23+6/lib/server/libjvm.so
>> #4  0x00007ffff7fe704b in InitializeJVM (ifn=<synthetic pointer>, penv=0x7ffff7fc1ea8, pvm=0x7ffff7fc1ea0) at src/java.base/share/native/libjli/java.c:1550
>> #5  JavaMain (_args=<optimized out>) at src/java.base/share/native/libjli/java.c:491
>> #6  0x00007ffff7feb1c9 in ThreadJavaMain (args=<optimized out>) at src/java.base/unix/native/libjli/java_md.c:650
>> #7  0x00007ffff7bc6ea5 in start_thread (arg...
>
> Zdenek Zambersky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Disable THP for too large page sizes

When testing this on an older kernel, I ran into a failure of the `runtime/os/TestTracePageSizes.java` test:

java.lang.AssertionError: Page sizes mismatch: 4 != 2048
	at TestTracePageSizes.main(TestTracePageSizes.java:307)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
	at java.base/java.lang.Thread.run(Thread.java:1575)

As far as I can tell, this is a test-only problem.

**Details:**
The test [relaxes](https://github.com/openjdk/jdk/blob/2cd1ba6a52eafffa65d0f2532a07fff89f9cea0e/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java#L301) its page size check if the mapping uses THP. When the THP mode is `always` (the default on RHEL-7), the [detection logic](https://github.com/openjdk/jdk/blob/2cd1ba6a52eafffa65d0f2532a07fff89f9cea0e/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java#L368) depends on the `THPeligible` line from smaps. However, this information only [appeared in kernel 5.0](https://github.com/torvalds/linux/commit/7635d9cbe8327e131a1d3d8517dc186c2796ce2e), so on older kernels the relaxed check is never applied. All tests pass on RHEL-7 with the THP mode switched to `madvise`.

The test is already skipped in some configurations, so it could probably also be skipped for `kernel < 5.0 && thp_mode == always`.
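
A minimal sketch of what such a skip condition could look like, not the actual test change. The class `ThpSkipCheck` and its helper methods are hypothetical names made up for illustration. The sketch assumes the kernel release is available as a string such as `3.10.0-1160.el7.x86_64` (e.g. from the `os.version` system property) and that the THP mode is read from `/sys/kernel/mm/transparent_hugepage/enabled`, where the active mode is the bracketed one, e.g. `[always] madvise never`.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of TestTracePageSizes.
public class ThpSkipCheck {

    // True if the release string starts with a "major.minor" version that is
    // lower than the given one. Unparseable strings are treated as "not older",
    // so the test would still run in that case.
    public static boolean kernelOlderThan(String release, int major, int minor) {
        Matcher m = Pattern.compile("^(\\d+)\\.(\\d+)").matcher(release);
        if (!m.find()) {
            return false;
        }
        int maj = Integer.parseInt(m.group(1));
        int min = Integer.parseInt(m.group(2));
        return maj < major || (maj == major && min < minor);
    }

    // Extracts the bracketed (active) mode from the content of the
    // transparent_hugepage/enabled file; returns "unknown" if none is found.
    public static String selectedThpMode(String enabledFileContent) {
        Matcher m = Pattern.compile("\\[(\\w+)\\]").matcher(enabledFileContent);
        return m.find() ? m.group(1) : "unknown";
    }

    // The proposed condition: kernel < 5.0 && thp_mode == always.
    public static boolean shouldSkip(String release, String enabledFileContent) {
        return kernelOlderThan(release, 5, 0)
                && "always".equals(selectedThpMode(enabledFileContent));
    }

    public static void main(String[] args) throws Exception {
        String release = System.getProperty("os.version");
        Path enabled = Path.of("/sys/kernel/mm/transparent_hugepage/enabled");
        String content = Files.isReadable(enabled) ? Files.readString(enabled) : "";
        System.out.println("skip = " + shouldSkip(release, content));
    }
}
```

A version comparison like this is only a heuristic, since distribution kernels may backport features; checking whether smaps actually contains a `THPeligible` line would be an alternative.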

I can include the fix in this PR.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17545#issuecomment-1919543589


More information about the hotspot-runtime-dev mailing list