RFR: 8292083: Detected container memory limit may exceed physical machine memory [v10]
Jonathan Dowland
jdowland at openjdk.org
Mon Aug 22 15:29:54 UTC 2022
> We discovered some systems configured with cgroups v1 which report a bogus container memory limit value that is above the physical memory of the host. OpenJDK then derives flags such as InitialHeapSize from this invalid value, so the resulting heap size can exceed the available memory and cause the OS to terminate the process due to OOM.
>
> HotSpot's container awareness attempts to sanity check the limit value by ensuring it's below `_unlimited_memory = (LONG_MAX / os::vm_page_size()) * os::vm_page_size()`, but that still leaves a large range of potential invalid values between physical RAM and that ceiling value.
>
> Cgroups v1 in particular returns an uninitialised value for the memory limit when one has not been explicitly set. Cgroups v2 does not suffer from the same problem; however, it's possible for any value to be set for the max memory, including values exceeding the available physical memory, in either v1 or v2.
>
> This fixes the problem in two places. Further work may be required in the area of Java metrics / MXBeans. I'd also look again at whether the existing ceiling value `_unlimited_memory` serves any useful purpose. I personally don't feel those improvements should hold up this fix.
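To illustrate the gap described above, here is a minimal C++ sketch, not the HotSpot code itself. The names kPageSize, kUnlimitedMemory, old_check_accepts, new_check_accepts and effective_limit are hypothetical, the page size is assumed to be 4096 bytes, and int64_t stands in for a 64-bit long.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 4 KiB pages. HotSpot queries os::vm_page_size() at runtime.
constexpr int64_t kPageSize = 4096;

// Page-aligned ceiling, mirroring the formula quoted in the description
// (INT64_MAX stands in for LONG_MAX on a 64-bit system).
constexpr int64_t kUnlimitedMemory = (INT64_MAX / kPageSize) * kPageSize;

// Ceiling-only check: any positive value below the ceiling is accepted.
bool old_check_accepts(int64_t limit) {
  return limit > 0 && limit < kUnlimitedMemory;
}

// Stricter check: additionally reject limits above host physical memory.
bool new_check_accepts(int64_t limit, int64_t host_physical) {
  return old_check_accepts(limit) && limit <= host_physical;
}

// Hypothetical helper: fall back to host RAM when the cgroup limit is
// implausible, otherwise honour the cgroup limit.
int64_t effective_limit(int64_t cgroup_limit, int64_t host_physical) {
  assert(host_physical > 0);
  return new_check_accepts(cgroup_limit, host_physical) ? cgroup_limit
                                                        : host_physical;
}
```

With an 8 GiB host and a reported limit of 80 GiB, the ceiling-only check accepts the value, while the stricter check rejects it and the effective limit falls back to 8 GiB.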
Jonathan Dowland has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits:
- Merge remote-tracking branch 'origin/master' into 8292083-cgroups-badmaxmem
- Replace _unlimited_memory with calls to os::Linux
_unlimited_memory was a constant in cgroupV1Subsystem_linux that was
initialised to a very large number and used as a ceiling sanity check
when reading a number of memory-related cgroup limits. This was not
sufficient to rule out all possible bad values from cgroups and so a
lower ceiling, set to the host's physical RAM, was needed (8292083).
Eliminate _unlimited_memory, which is superfluous, and use the host
physical memory instead.
For memory_and_swap_limit_in_bytes we need a higher limit than
physical RAM, so extend os::Linux to report on the host's configured
swap value and combine the two.
- Remove cgroup sanity checking logic from os::Linux::available_memory
and rely upon it from os::physical_memory instead.
- tidy up log_debug calls in os::physical_memory
Thanks to Ioi Lam for the suggestion.
- Simplify testContainerMemExceedsPhysical, avoid OperatingSystemMXBean
Rewrite the test to run two containers. First time, capture the logging
to get the reported physical memory size. Derive a bad value from this
(ten times that size). Second run, set the container memory limit to the bad value.
Check the trace log for a line indicating this was detected and ignored.
- debug log physical memory (not cgroup constrained)
- fixup! Don't sanity check mem limit in OSContainer::init
Remove unneeded local variable host_memory
- Remove set_physical_memory (unneeded)
Cgroups code used this to override the real host RAM value with the
container memory limit. We don't do this any more so this routine
is not needed. Linux::physical_memory()/_physical_memory will now
always correspond to the host's physical RAM, unaffected by cgroups
limits.
- Don't sanity check mem limit in OSContainer::init
Only do so in os::physical_memory()
- Rename to more descriptive testContainerMemExceedsPhysical
- ... and 6 more: https://git.openjdk.org/jdk/compare/e5619339...a88bb620
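
The commit "Replace _unlimited_memory with calls to os::Linux" above combines host RAM and host swap into a ceiling for the memory-and-swap limit. A minimal C++ sketch of that idea follows. It is not the patch itself: HostMemory, read_host_memory, mem_swap_ceiling and checked_mem_swap_limit are hypothetical names, the use of sysinfo(2) to obtain the host values is an assumption, and -1 is used here only as an illustrative "no usable limit" sentinel.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__linux__)
#include <sys/sysinfo.h>
#endif

// Host totals in bytes (hypothetical struct for this sketch).
struct HostMemory {
  int64_t ram_bytes;
  int64_t swap_bytes;
};

// Fill `out` from sysinfo(2). Returns false on failure or on non-Linux hosts.
bool read_host_memory(HostMemory* out) {
#if defined(__linux__)
  struct sysinfo si;
  if (sysinfo(&si) != 0) return false;
  // totalram and totalswap are expressed in units of mem_unit bytes.
  const int64_t unit = static_cast<int64_t>(si.mem_unit);
  out->ram_bytes = static_cast<int64_t>(si.totalram) * unit;
  out->swap_bytes = static_cast<int64_t>(si.totalswap) * unit;
  return true;
#else
  (void)out;
  return false;
#endif
}

// A memory+swap limit can legitimately exceed RAM, so the ceiling is RAM + swap.
int64_t mem_swap_ceiling(const HostMemory& host) {
  assert(host.ram_bytes >= 0 && host.swap_bytes >= 0);
  return host.ram_bytes + host.swap_bytes;
}

// Return the cgroup value if it is plausible, otherwise -1 (sentinel).
int64_t checked_mem_swap_limit(int64_t cgroup_value, const HostMemory& host) {
  if (cgroup_value <= 0 || cgroup_value > mem_swap_ceiling(host)) return -1;
  return cgroup_value;
}
```

For a host with 8 GiB of RAM and 2 GiB of swap, a memory-and-swap limit of 9 GiB is kept, while a limit of 80 GiB is rejected.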
-------------
Changes: https://git.openjdk.org/jdk/pull/9880/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=9880&range=09
Stats: 75 lines in 6 files changed: 45 ins; 18 del; 12 mod
Patch: https://git.openjdk.org/jdk/pull/9880.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/9880/head:pull/9880
PR: https://git.openjdk.org/jdk/pull/9880