RFR: 8303215: Make thread stacks not use huge pages
Thomas Stuefe
stuefe at openjdk.org
Tue May 23 18:41:56 UTC 2023
On Tue, 23 May 2023 18:01:51 GMT, Poonam Bajaj <poonam at openjdk.org> wrote:
> When a system has Transparent Huge Pages (THP) enabled (/sys/kernel/mm/transparent_hugepage/enabled is set to 'always'), thread stacks can have a significantly larger resident set size (RSS) than they actually require. This occurs when the stack size is 2MB or larger, which makes the memory range of the stack more likely to be aligned on a 2MB boundary. This in turn makes the stack eligible to be backed by transparent huge pages, resulting in more memory consumption than when standard small pages are used. This issue is more apparent on AArch64 platforms, where the default stack size is 2MB.
>
>
> Example mapping from smaps illustrating this issue:
> fffced200000-fffced204000 ---p 00000000 00:00 0
> Size: 16 kB # guard pages
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> ...
> fffced204000-fffced400000 rw-p 00000000 00:00 0
> Size: 2032 kB # stack space
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> Rss: 2032 kB # entire stack resident in memory
>
>
> This fix addresses this issue with the following two main changes:
>
> 1. Change the default stack size to 2040KB, which is 2 pages less than 2MB. This ensures that stacks don't get 2MB aligned. Why 2 pages less than 2MB? Because for non-JavaThreads, glibc adds an additional guard page to the total stack size. To keep it simple, and to keep the default the same for all three options (ThreadStackSize, CompilerThreadStackSize, and VMThreadStackSize), we use 2040K as the default value.
>
> Example mapping for a JavaThread:
>
> ffff6e913000-ffff6e917000 ---p 00000000 00:00 0
> Size: 16 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> ...
> ffff6e917000-ffff6eb11000 rw-p 00000000 00:00 0
> Size: 2024 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> Rss: 92 kB
>
> Example mapping for a non-JavaThread (WatcherThread):
>
> ffff6eb11000-ffff6eb12000 ---p 00000000 00:00 0
> Size: 4 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> ...
> ffff6eb12000-ffff6ed10000 rw-p 00000000 00:00 0
> Size: 2040 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> Rss: 12 kB
>
>
> 2. If the requested stack size is greater than 2MB and can be 2MB aligned, then add an additional page to the stack size. This reduces the chances of stacks getting large page aligned.
Yes, this is a pragmatic workaround. I believe @theRealAph recommended the same solution.
Did you test this on an aarch64 kernel with 64k pages? I suspect the 2040K default only works for 4k or 8k pages; with larger page sizes, pthread_create would just round the stack size up to 2MB again.
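To make the page-size concern concrete, here is a minimal arithmetic sketch. It assumes the stack size ends up rounded up to a whole number of pages (the suspicion above, not a verified statement about glibc), and every name in it is hypothetical rather than taken from HotSpot.

```cpp
#include <cstddef>

// All sizes are in bytes. These names are illustrative only, not HotSpot's.
constexpr std::size_t K = 1024;
constexpr std::size_t M = 1024 * K;

// Round size up to the next multiple of alignment (alignment must be > 0).
constexpr std::size_t align_up(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) / alignment * alignment;
}

// Stack size after rounding up to a whole number of pages
// (assumed behavior, see the lead-in).
constexpr std::size_t page_rounded_stack_size(std::size_t requested,
                                              std::size_t page_size) {
  return align_up(requested, page_size);
}

// True if the size is at least 2MB and a multiple of 2MB, which is the
// case the PR tries to avoid.
constexpr bool is_multiple_of_2m(std::size_t stack_size) {
  return stack_size >= 2 * M && stack_size % (2 * M) == 0;
}
```

Under that assumption, 2040K stays at 2040K with 4k or 8k pages (510 and 255 whole pages), but becomes 2048K with 16k or 64k pages, which is a multiple of 2MB again.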
src/hotspot/os/linux/os_linux.cpp line 929:
> 927: }
> 928: assert(is_aligned(stack_size, os::vm_page_size()), "stack_size not aligned");
> 929:
Small misgiving here about having to scan /proc/meminfo anew for every thread start. I assume the problem is that os::large_page_init() bails out early if UseLargePages=0, so we cannot rely on large page initialization?
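One way to avoid the repeated scan would be to read the value once and cache it. The sketch below assumes the value of interest is the "Hugepagesize:" line of /proc/meminfo, which may not be what the patch actually reads. Both function names are hypothetical and this is not HotSpot code.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse the "Hugepagesize:" line from meminfo-formatted text.
// Returns the size in bytes, or 0 if the line is absent.
inline std::size_t parse_hugepagesize(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string key, unit;
    std::size_t value = 0;
    // Expected shape: "Hugepagesize:    2048 kB"
    if (fields >> key >> value >> unit &&
        key == "Hugepagesize:" && unit == "kB") {
      return value * 1024;
    }
  }
  return 0;
}

// Hypothetical accessor: scans /proc/meminfo on the first call only and
// returns the cached result afterwards. Initialization of a function-local
// static is thread-safe since C++11.
inline std::size_t cached_hugepagesize() {
  static const std::size_t cached = [] {
    std::ifstream meminfo("/proc/meminfo");
    return parse_hugepagesize(meminfo);
  }();
  return cached;
}
```

With this shape, thread creation would call cached_hugepagesize() and pay for the file scan only once per process, independent of whether large page initialization ran.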
-------------
PR Review: https://git.openjdk.org/jdk/pull/14105#pullrequestreview-1440335272
PR Review Comment: https://git.openjdk.org/jdk/pull/14105#discussion_r1202826954