RFR: 8261242: [Linux] OSContainer::is_containerized() returns true when run outside a container [v2]

Thomas Stuefe stuefe at openjdk.org
Tue Apr 16 18:29:04 UTC 2024


On Thu, 11 Apr 2024 12:08:02 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

>> Please review this enhancement to the container detection code which allows it to figure out whether the JVM is actually running inside a container (`podman`, `docker`, `crio`), or with some other means that enforces memory/cpu limits by means of the cgroup filesystem. If neither of those conditions hold, the JVM runs in not containerized mode, addressing the issue described in the JBS tracker. For example, on my Linux system `is_containerized() == false" is being indicated with the following trace log line:
>> 
>> 
>> [0.001s][debug][os,container] OSContainer::init: is_containerized() = false because no cpu or memory limit is present
>> 
>> 
>> This state is being exposed by the Java `Metrics` API class using the new (still JDK internal) `isContainerized()` method. Example:
>> 
>> 
>> java -XshowSettings:system --version
>> Operating System Metrics:
>>     Provider: cgroupv1
>>     System not containerized.
>> openjdk 23-internal 2024-09-17
>> OpenJDK Runtime Environment (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk)
>> OpenJDK 64-Bit Server VM (fastdebug build 23-internal-adhoc.sgehwolf.jdk-jdk, mixed mode, sharing)
>> 
>> 
>> The basic property this is being built on is the observation that the cgroup controllers typically get mounted read only into containers. Note that the current container tests assert that `OSContainer::is_containerized() == true` in various tests. Therefore, using the heuristic of "is any memory or cpu limit present" isn't sufficient. I had considered that in an earlier iteration, but many container tests failed.
>> 
>> Overall, I think, with this patch we improve the current situation of claiming a containerized system being present when it's actually just a regular Linux system.
>> 
>> Testing:
>> 
>> - [x] GHA (risc-v failure seems infra related)
>> - [x] Container tests on Linux x86_64 of cgroups v1 and cgroups v2 (including gtests)
>> - [x] Some manual testing using cri-o
>> 
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision:
> 
>  - Merge branch 'master' into jdk-8261242-is-containerized-fix
>  - jcheck fixes
>  - Fix tests
>  - Implement Metrics.isContainerized()
>  - Some clean-up
>  - Drop cgroups testing on plain Linux
>  - Implement fall-back logic for non-ro controller mounts
>  - Make find_ro static and local to compilation unit
>  - 8261242: [Linux] OSContainer::is_containerized() returns true

I am not enough of a container expert to judge if the basic approach is right - I trust you on this. This is just a technical code review.

src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 170:

> 168:     }
> 169:   }
> 170:   return false;

An alternative, simpler, no need for modifying source string:

static bool find_ro_opt(const char* o) {
  return strcmp(o, "ro") || strstr(o, ",ro") || strstr(o, "ro,");
}

src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 351:

> 349:     //
> 350:     // We collect the read only mount option in the cgroup infos so as to have that
> 351:     // info ready when determining is_containerized().

Here, and in other places: a comment indicating the line format we scan would be appreciated, possibly with argument numbers. Saves the casual code reader from looking into proc man page. Even just pasting the example line for proc manpage would be fine (https://man7.org/linux/man-pages/man5/proc.5.html) (but with order adapted to your scanf call, they count major:minor as one)

src/hotspot/os/linux/osContainer_linux.cpp line 78:

> 76:   const char *reason;
> 77:   bool any_mem_cpu_limit_present = false;
> 78:   bool ctrl_ro = cgroup_subsystem->is_containerized();

nit: naming? what does ctrl mean in this case? Maybe use "cgroup_is_containerized"?

src/java.base/share/classes/sun/launcher/LauncherHelper.java line 375:

> 373:         if (!c.isContainerized()) {
> 374:             ostream.println(INDENT + "System not containerized.");
> 375:             return;

Why return here? Would this not cut the output short in the non-containerized case?

And if this not intended, the not-containerized-`-XshowSettings:system` test below should test and catch this (e.g. scan for CPU set)

-------------

PR Review: https://git.openjdk.org/jdk/pull/18201#pullrequestreview-1999328503
PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1564182879
PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567756663
PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567774124
PR Review Comment: https://git.openjdk.org/jdk/pull/18201#discussion_r1567779248


More information about the core-libs-dev mailing list