RFR: 8292083: Detected container memory limit may exceed physical machine memory [v14]

Severin Gehwolf sgehwolf at openjdk.org
Tue Aug 23 12:54:46 UTC 2022


On Tue, 23 Aug 2022 11:20:41 GMT, Jonathan Dowland <jdowland at openjdk.org> wrote:

>> We discovered some systems configured with cgroups v1 that report a bogus container memory limit above the physical memory of the host. OpenJDK then calculates flags such as InitialHeapSize from this invalid value; the result can be larger than the available memory, which can lead to the OS terminating the process due to OOM.
>> 
>> HotSpot's container awareness attempts to sanity check the limit value by ensuring it's below `_unlimited_memory = (LONG_MAX / os::vm_page_size()) * os::vm_page_size()`, but that still leaves a large range of potentially invalid values between physical RAM and that ceiling value.
>> 
>> Cgroups v1 in particular returns an uninitialised value for the memory limit when one has not been explicitly set. Cgroups v2 does not suffer from the same problem; however, it's possible for any value to be set for the max memory, including values exceeding the available physical memory, in either v1 or v2.
>> 
>> This fixes the problem in two places. Further work may be required in the area of Java metrics / MXBeans. I'd also look again at whether the existing ceiling value `_unlimited_memory` serves any useful purpose. I personally don't feel those improvements should hold up this fix.
>
> Jonathan Dowland has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Move cgroup max memory sanity checking to
>   
>   The sanity checking now takes place on the right side of the value
>   caching.
>   
>   Now os::available_memory and os::physical_memory are (almost) unmodified
>   from master

This looks much better. Thanks!

src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 577:

> 575:     log_debug(os, container)("container memory limit %s: " JLONG_FORMAT ", using host value " JLONG_FORMAT,
> 576:                              reason, mem_limit, phys_mem);
> 577:     mem_limit = mem_limit == -2 ? -2 : -1;

I take it the reason for this line is so that the log line has the appropriate info? A slightly clearer way (to me) would be to use another local, `read_mem_limit` for example, that is used only for logging, and to set `mem_limit` to the desired value in the conditional branches before the log call. Also, I think this code deserves a comment. For example:


    jlong read_mem_limit = mem_limit;
    if (mem_limit >= phys_mem) {
      // Exceeding physical memory counts as unlimited. cg v1 is bound
      // above by phys_mem (as there is no 'max' in interface files), but cg v2
      // might return a value > phys_mem if the container engine was started
      // with a memory flag exceeding phys_mem, so we need to account for it
      // here.
      reason = "ignored";
      mem_limit = -1;
    } else if (OSCONTAINER_ERROR == mem_limit) {
      reason = "failed";
    } else {
      assert(mem_limit == -1, "Expected unlimited");
      reason = "unlimited";
    }
    log_debug(os, container)("container memory limit %s: " JLONG_FORMAT ", using host value " JLONG_FORMAT,
                             reason, read_mem_limit, phys_mem);
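
For anyone who wants to poke at the branching outside HotSpot, here is a minimal standalone sketch of the same idea. It is not the PR's code: `limit_t`, `kUnlimited`, `kError`, `LimitResult` and `sanitize_mem_limit` are hypothetical names, the sentinels only mirror the -1 / -2 convention visible in the quoted diff, and `printf` stands in for `log_debug`.

```cpp
#include <cstdint>
#include <cstdio>
#include <cassert>

// Hypothetical stand-ins for jlong and the container sentinels (assumptions,
// chosen to match the -1 / -2 values seen in the quoted diff).
typedef int64_t limit_t;
static const limit_t kUnlimited = -1;
static const limit_t kError = -2;

struct LimitResult {
  limit_t value;       // sanitized limit handed back to callers
  const char* reason;  // nullptr when the container limit is used as read
};

// Keeps the value that was read for logging and returns the sanitized one.
static LimitResult sanitize_mem_limit(limit_t read_mem_limit, limit_t phys_mem) {
  LimitResult r = { read_mem_limit, nullptr };
  if (read_mem_limit >= phys_mem) {
    // A limit at or above physical memory is treated as unlimited.
    r.value = kUnlimited;
    r.reason = "ignored";
  } else if (read_mem_limit == kError) {
    r.reason = "failed";
  } else if (read_mem_limit == kUnlimited) {
    r.reason = "unlimited";
  }
  if (r.reason != nullptr) {
    // Log the value as read, not the sanitized one.
    std::printf("container memory limit %s: %lld, using host value %lld\n",
                r.reason, (long long)read_mem_limit, (long long)phys_mem);
  }
  return r;
}
```

With a 16 GB host, calling `sanitize_mem_limit(64LL << 30, 16LL << 30)` would log the 64 GB value that was read while returning -1, which is the separation between the logged and the returned value suggested above.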

src/hotspot/os/linux/os_linux.cpp line 227:

> 225:     }
> 226:     log_debug(os, container)("container memory limit %s: " JLONG_FORMAT ", using host value",
> 227:                             mem_limit == OSCONTAINER_ERROR ? "failed" : "unlimited", mem_limit);

It seems the equivalent line in `os::Linux::available_memory()` can be removed too, as we now log this info as part of `OSContainer::memory_limit_in_bytes()`.

-------------

PR: https://git.openjdk.org/jdk/pull/9880


More information about the hotspot-runtime-dev mailing list