RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2]
Thomas Stuefe
stuefe at openjdk.org
Sat Dec 2 07:01:40 UTC 2023
On Fri, 1 Dec 2023 10:55:30 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:
>> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>>
>>
>> if (UseTransparentHugePages && !HugePages::supports_thp()) {
>>   if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>>     log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>>   }
>>   UseLargePages = UseTransparentHugePages = false;
>>   return;
>> }
>>
>>
>> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>>
>> /sys/kernel/mm/transparent_hugepage/enabled: never
>> /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>>
>>
>> the above code will force ZGC to run without THPs.
>>
>> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>>
>> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>>
>> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>>
>> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>>
>> The result of this change can be seen in these tables:
>>
>> ZGC large pages log output:
>>
>> E (T) = Enabled (Transparent)
>> E (T, OS) = Enabled (Transparent, OS enforced)
>> D = Disabled
>> D (OS) = Disabled (OS enforced)
>>
>> -XX:+UseTransparentHugePages
>>
>> shmem \ anon | always | madvise | never
>> -------------+--------+---------+--------
>> always       | E (T)  | E (T)   | E (T)
>> within_size  | E (T)  | E (T)   | E (T)
>> advise       | E (T)  | E (T)   | E (T)
>> never        | D (OS) | D (OS)  | D (OS)
>> deny         | D (OS) | D (OS)  | D (OS)
>> force        | E (T)  | E (T)   | E (T)
>>
>> -XX:-UseTransparentHugePages
>>
>> shmem \ anon | always    | madvise   | never
>> -------------+-----------+-----------+-----------
>> always       | E (T, OS) | E (T, OS) | E (T, OS)
>> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
>> advise       | D ...
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
> Small tweaks
Small question:
https://wiki.openjdk.org/display/zgc/Main#Main-EnablingTransparentHugePagesOnLinux
says that to use THPs with ZGC, one needs both
`/sys/kernel/mm/transparent_hugepage/enabled -> "madvise"` and `/sys/kernel/mm/transparent_hugepage/shmem_enabled -> "advise"`. Is that correct? Does the latter really need the former? I could not find that in https://www.kernel.org/doc/html/next/admin-guide/mm/transhuge.html.
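For anyone who wants to check what a given machine is actually set to, here is a minimal sketch. It assumes the usual sysfs format, where all possible values are listed on one line and the active one is in square brackets (e.g. `always [madvise] never`). The helper names `active_thp_mode` and `read_thp_mode` are made up for this example and are not HotSpot code.

```cpp
#include <fstream>
#include <string>

// Returns the bracketed token from a sysfs THP mode line such as
// "always [madvise] never", or an empty string if no token is bracketed.
std::string active_thp_mode(const std::string& line) {
  const std::string::size_type open = line.find('[');
  if (open == std::string::npos) {
    return "";
  }
  const std::string::size_type close = line.find(']', open + 1);
  if (close == std::string::npos) {
    return "";
  }
  return line.substr(open + 1, close - open - 1);
}

// Reads the first line of the given sysfs file and returns its active mode.
// Returns an empty string if the file is missing or unreadable.
std::string read_thp_mode(const std::string& path) {
  std::ifstream in(path);
  std::string line;
  if (!in || !std::getline(in, line)) {
    return "";
  }
  return active_thp_mode(line);
}
```

Calling `read_thp_mode` once for `/sys/kernel/mm/transparent_hugepage/enabled` and once for `/sys/kernel/mm/transparent_hugepage/shmem_enabled` gives the two values the wiki talks about.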
src/hotspot/os/linux/hugepages.cpp line 321:
> 319:
> 320: const bool huge_pages_turned_off = !FLAG_IS_DEFAULT(UseLargePages) && !UseLargePages;
> 321: _thp_requested = UseTransparentHugePages && !huge_pages_turned_off;
This muddies the waters a bit. The original intent of HugePages, as opposed to whatever happens in os_linux, was to give me the unadulterated info of what the OS supports, whereas processing switches and deciding on them should happen in os_linux, in large_page_init. Would it be possible to move "_thp_requested" up to the caller?
We can keep the "should_madvise_anonymous_thps" since those make sense here, but move the "requested" condition up to the caller.
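To make the suggestion concrete, a rough sketch of what the caller-side computation could look like. It only mirrors the two quoted lines (320 and 321); the struct `LargePageFlags` and the free function `thp_requested` are hypothetical stand-ins for the real flags and for `FLAG_IS_DEFAULT(UseLargePages)`, not a proposal for the actual names.

```cpp
// Hypothetical snapshot of the relevant switches as seen by the caller.
struct LargePageFlags {
  bool use_thp;                     // stands in for UseTransparentHugePages
  bool use_large_pages;             // stands in for UseLargePages
  bool use_large_pages_is_default;  // stands in for FLAG_IS_DEFAULT(UseLargePages)
};

// Switch processing kept outside of the OS-capability code:
// THPs count as requested unless the user explicitly passed -XX:-UseLargePages.
bool thp_requested(const LargePageFlags& f) {
  const bool huge_pages_turned_off = !f.use_large_pages_is_default && !f.use_large_pages;
  return f.use_thp && !huge_pages_turned_off;
}
```

With this split, HugePages would only answer "what does the OS offer", and the decision above would live next to the other flag handling.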
src/hotspot/os/linux/os_linux.cpp line 3722:
> 3720: }
> 3721:
> 3722: log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
Would it not be clearer to define when to warn, as we do in warn_no_large_pages?
Related to that, should we not also warn if ZGC is used with shmem THP configured but anonymous THP disabled? I am not sure the heap is the only part of the JVM that uses THP; other parts, e.g. the code heap, would still use anon THP, wouldn't they?
Also, maybe a better message for the poor admin who tries to set this up. E.g.:
bool requires_shmem_thp = UseTHP && UseZGC;
bool requires_anon_thp = UseTHP;
bool off = false;
if (requires_shmem_thp && !shmem_thp_configured) {
  log_warning("Shmem THPs are not supported. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to advise to support shmem THPs");
  off = true;
}
if (requires_anon_thp && !anon_thp_configured) {
  log_warning("Anonymous THPs are not supported. Set /sys/kernel/mm/transparent_hugepage/enabled to madvise");
  off = true;
}
if (off) {
  UseTHP = false;
  log_warning("UseTHP disabled (see previous messages)");
}
if ZGC and !supports shmemthp or
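The same idea as a small self-contained sketch, in case it helps the discussion. Everything here is hypothetical: `ThpConfig`, `ThpDecision` and `decide_thp` are illustrative names, the booleans stand in for the real flags and for whatever HugePages reports, and the warnings are collected in a vector instead of going through `log_warning`.

```cpp
#include <string>
#include <vector>

// Hypothetical inputs: JVM switches plus what the OS is configured to offer.
struct ThpConfig {
  bool use_thp;            // stands in for UseTransparentHugePages
  bool use_zgc;            // stands in for UseZGC
  bool anon_thp_enabled;   // anonymous THPs usable according to .../enabled
  bool shmem_thp_enabled;  // shmem THPs usable according to .../shmem_enabled
};

// Outcome: the final THP setting and the warnings to print, in order.
struct ThpDecision {
  bool use_thp;
  std::vector<std::string> warnings;
};

ThpDecision decide_thp(const ThpConfig& c) {
  ThpDecision d{c.use_thp, {}};
  const bool requires_shmem_thp = c.use_thp && c.use_zgc;
  const bool requires_anon_thp = c.use_thp;
  bool off = false;
  if (requires_shmem_thp && !c.shmem_thp_enabled) {
    d.warnings.push_back("Shmem THPs are not supported. Set "
                         "/sys/kernel/mm/transparent_hugepage/shmem_enabled to advise.");
    off = true;
  }
  if (requires_anon_thp && !c.anon_thp_enabled) {
    d.warnings.push_back("Anonymous THPs are not supported. Set "
                         "/sys/kernel/mm/transparent_hugepage/enabled to madvise.");
    off = true;
  }
  if (off) {
    // One summary line after the specific hints, so the admin sees the cause first.
    d.use_thp = false;
    d.warnings.push_back("UseTransparentHugePages disabled (see previous messages).");
  }
  return d;
}
```

The point of returning the warnings instead of printing them directly is only to keep the sketch testable; in the JVM they would be logged right away.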
src/hotspot/os/linux/os_linux.cpp line 3736:
> 3734: ls.print_cr(". Default large page size: " EXACTFMT ".", EXACTFMTARGS(os::large_page_size()));
> 3735: } else {
> 3736: ls.print("Large page support %sdisabled.", uses_zgc_shmem_thp() ? "partially " : "");
I wonder whether we could make our life simpler by not supporting mixes: we could require that, for ZGC to use THP, both shmem and anon THPs have to be active. Would that be acceptable, or do you think there are too many misconfigured systems out there?
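Roughly what I have in mind, as a sketch only. The helper names are invented for this mail. Which mode strings count as "active" is my assumption: `always` and `madvise` for anonymous THPs, and everything except `never` and `deny` for shmem THPs, following the rows of the first table in the PR description. The OS-enforced case (THPs handed out although the JVM did not ask) is deliberately ignored here.

```cpp
#include <string>

// Assumption: anonymous THPs are usable when the mode is "always" or "madvise".
bool anon_thp_active(const std::string& mode) {
  return mode == "always" || mode == "madvise";
}

// Assumption: shmem THPs are usable for every mode except "never" and "deny".
bool shmem_thp_active(const std::string& mode) {
  return mode == "always" || mode == "within_size" ||
         mode == "advise" || mode == "force";
}

// Stricter rule without mixes: ZGC only uses THPs if both kinds are active.
bool zgc_may_use_thp(bool use_thp,
                     const std::string& anon_mode,
                     const std::string& shmem_mode) {
  return use_thp && anon_thp_active(anon_mode) && shmem_thp_active(shmem_mode);
}
```

With such a rule the `never` column of the tables would collapse to "disabled" for ZGC as well, which is exactly the trade-off I am asking about.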
-------------
PR Review: https://git.openjdk.org/jdk/pull/16690#pullrequestreview-1760790216
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412735114
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412736663
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412737495