RFR: 8292083: Detected container memory limit may exceed physical machine memory [v2]
Jonathan Dowland
jdowland at openjdk.org
Wed Aug 17 13:39:52 UTC 2022
> We discovered some systems configured with cgroups v1 that report a bogus container memory limit, one that exceeds the physical memory of the host. OpenJDK then derives flags such as InitialHeapSize from this invalid value; the result can be larger than the available memory, which can lead to the OS terminating the process due to OOM.
>
> HotSpot's container awareness attempts to sanity-check the limit value by ensuring it is below `_unlimited_memory = (LONG_MAX / os::vm_page_size()) * os::vm_page_size()`, but that still leaves a large range of potentially invalid values between physical RAM and that ceiling value.
>
> Cgroups v1 in particular returns an uninitialised value for the memory limit when one has not been explicitly set. Cgroups v2 does not suffer from the same problem. However, in either v1 or v2 it is possible to set any value for the max memory, including values exceeding the available physical memory.
>
> This fixes the problem in two places. Further work may be required in the area of Java metrics / MXBeans. I'd also look again at whether the existing ceiling value `_unlimited_memory` serves any useful purpose. I personally don't feel those improvements should hold up this fix.
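
The gap described in the quoted text can be illustrated with a small standalone sketch. This is not the HotSpot code: the function names are hypothetical, the page size and memory sizes are passed in as plain parameters, and `INT64_MAX` stands in for `LONG_MAX` on the assumption of a 64-bit `long` (as on LP64 Linux).

```cpp
#include <cstdint>

// Hypothetical stand-in for the page-aligned ceiling quoted above:
// (LONG_MAX / page_size) * page_size, assuming a 64-bit long.
constexpr int64_t unlimited_ceiling(int64_t page_size) {
  return (INT64_MAX / page_size) * page_size;
}

// Ceiling-only check: any positive value below the ceiling is accepted
// as a real limit, even if it is far larger than the host's RAM.
constexpr bool passes_ceiling_check(int64_t limit, int64_t page_size) {
  return limit > 0 && limit < unlimited_ceiling(page_size);
}

// Stricter check (illustrative only): the limit must additionally be
// strictly below the physical memory of the host to be trusted.
constexpr bool is_plausible_limit(int64_t limit, int64_t phys_mem,
                                  int64_t page_size) {
  return passes_ceiling_check(limit, page_size) && limit < phys_mem;
}
```

With a 4 KiB page and a 16 GiB host, a reported limit of 64 GiB passes the ceiling-only check but is rejected by the stricter one, which is the range of values the description is concerned with.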
Jonathan Dowland has updated the pull request incrementally with two additional commits since the last revision:
- Separate out debug logging for three invalid memory limit scenarios

  Refactor the ternary expression into an if/else chain and expand it
  to the third case (memory limit equal to or exceeding physical RAM)

  Format the trace log message for that case to match that of the other
  two

  Adjust the other two to incorporate physical RAM into the log message
- Ensure trace log is enabled before trace logging

  Thanks Severin
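
As a rough illustration of the shape these two commits describe, the sketch below classifies a limit with an if/else chain and builds the trace message only when tracing is enabled. It is a sketch under assumptions, not the patch itself: the names, the `-1` "unlimited" sentinel, the treatment of other non-positive values as a failed read, and the message wording are all hypothetical. Only the third case (limit equal to or exceeding physical RAM) and the inclusion of physical RAM in the messages are taken from the commit text above.

```cpp
#include <cstdint>
#include <string>

enum class LimitStatus { Valid, ReadFailed, Unlimited, ExceedsPhysical };

// Hypothetical sentinel for "no limit set"; used by this sketch only.
constexpr int64_t kUnlimited = -1;

// if/else chain over the three invalid scenarios, then the valid case.
LimitStatus classify_limit(int64_t limit, int64_t phys_mem) {
  if (limit == kUnlimited) {
    return LimitStatus::Unlimited;
  } else if (limit <= 0) {
    // Assumption: any other non-positive value means the read failed.
    return LimitStatus::ReadFailed;
  } else if (limit >= phys_mem) {
    // Third case: limit equal to or exceeding physical RAM.
    return LimitStatus::ExceedsPhysical;
  }
  return LimitStatus::Valid;
}

// Builds a message only when tracing is enabled, so no formatting work
// is done otherwise. Returns an empty string when there is nothing to log.
std::string trace_message(int64_t limit, int64_t phys_mem, bool trace_enabled) {
  if (!trace_enabled) {
    return "";
  }
  const std::string host = ", using host value " + std::to_string(phys_mem);
  switch (classify_limit(limit, phys_mem)) {
    case LimitStatus::ReadFailed:
      return "container memory limit failed" + host;
    case LimitStatus::Unlimited:
      return "container memory limit unlimited" + host;
    case LimitStatus::ExceedsPhysical:
      return "container memory limit ignored: " + std::to_string(limit) + host;
    case LimitStatus::Valid:
      break;
  }
  return "";
}
```

For example, `trace_message(2048, 1024, true)` yields a message that names both the rejected limit and the host value, while the same call with `trace_enabled` set to `false` returns an empty string.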
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/9880/files
- new: https://git.openjdk.org/jdk/pull/9880/files/2de864c8..8d7e80c6
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=9880&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=9880&range=00-01
Stats: 11 lines in 2 files changed: 8 ins; 0 del; 3 mod
Patch: https://git.openjdk.org/jdk/pull/9880.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/9880/head:pull/9880
PR: https://git.openjdk.org/jdk/pull/9880
More information about the hotspot-runtime-dev mailing list