RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path

Sergey Chernyshev schernyshev at openjdk.org
Thu Oct 31 15:04:57 UTC 2024


The Cgroup V1 subsystem fails to initialize mounted controllers properly in certain cases, which may leave controllers undetected or inactive. We observed this behavior in CloudFoundry deployments; it also affects host systems.

The relevant /proc/self/mountinfo line is


2207 2196 0:43 /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - cgroup cgroup rw,cpu,cpuacct


/proc/self/cgroup:


11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c


Here, Java runs inside a containerized process that is being moved between cgroups due to load balancing.
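To make the mismatch concrete, the root and mount point can be pulled out of the mountinfo line and compared with the cgroup path. The following is only a minimal sketch: `MountInfo` and `parse_mountinfo_line` are hypothetical names, not HotSpot code, and the parser reads just the first five whitespace-separated fields (mount ID, parent ID, device, root, mount point).

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the two mountinfo fields of interest.
struct MountInfo {
    std::string root;         // 4th field: root of the mount within the hierarchy
    std::string mount_point;  // 5th field: where the hierarchy is mounted
};

// Reads the first five whitespace-separated fields of a mountinfo line.
// Returns false if the line has fewer than five fields.
bool parse_mountinfo_line(const std::string& line, MountInfo& out) {
    std::istringstream in(line);
    std::string mount_id, parent_id, device;
    return static_cast<bool>(in >> mount_id >> parent_id >> device
                                >> out.root >> out.mount_point);
}
```

For the line quoted above, `root` ends in `garden/good/...` while the path in `/proc/self/cgroup` ends in `garden/bad/...`, so neither string is a prefix of the other.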

Let's examine the condition at line 64 here https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
It is always FALSE and the branch is never taken. The issue was spotted earlier by @jerboaa in [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 

The original logic was intended to find the common prefix of `_root` and `cgroup_path` and concatenate the remaining suffix to the `_mount_point` (lines 67-68). That would give the following result:

Example input

_root = "/a"
cgroup_path = "/a/b"
_mount_point = "/sys/fs/cgroup/cpu,cpuacct"


result _path

"/sys/fs/cgroup/cpu,cpuacct/b"

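The intended prefix-stripping behavior can be sketched as below. This is an illustration of the intent described above, not the HotSpot implementation: `path_by_stripping_root` is a hypothetical helper, and returning the bare mount point when `root` is not a prefix is an assumption made only for this sketch.

```cpp
#include <string>

// Hypothetical helper: if `root` is a leading substring of `cgroup_path`,
// append the remaining suffix to `mount_point`.
// Note: this is a plain string-prefix test, so "/a" also matches "/ab";
// a real implementation would need to check the path-component boundary.
std::string path_by_stripping_root(const std::string& root,
                                   const std::string& cgroup_path,
                                   const std::string& mount_point) {
    if (cgroup_path.compare(0, root.size(), root) == 0) {
        return mount_point + cgroup_path.substr(root.size());
    }
    // Assumption for this sketch: no common prefix, use the mount point as is.
    return mount_point;
}
```

With `_root = "/a"` and `cgroup_path = "/a/b"`, the suffix is `/b`, which reproduces the result shown above.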

Here, `cgroup_path` comes from the third column of `/proc/self/cgroup`. The man page (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control groups states:


...
       /proc/pid/cgroup (since Linux 2.6.24)
              This file describes control groups to which the process
              with the corresponding PID belongs.  The displayed
              information differs for cgroups version 1 and version 2
              hierarchies.
              For each cgroup hierarchy of which the process is a
              member, there is one entry containing three colon-
              separated fields:

                  hierarchy-ID:controller-list:cgroup-path

              For example:

                  5:cpuacct,cpu,cpuset:/daemons
...
              [3]  This field contains the pathname of the control group
                   in the hierarchy to which the process belongs. This
                   pathname is relative to the mount point of the
                   hierarchy.


This explicitly states that the "pathname is relative to the mount point of the hierarchy". Hence, the correct result would have been


/sys/fs/cgroup/cpu,cpuacct/a/b
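Reading the man page literally, the path is obtained by splitting the `/proc/pid/cgroup` entry into its three fields and appending the third field to the mount point. A minimal sketch follows; `CgroupEntry`, `parse_cgroup_line` and `path_relative_to_mount` are hypothetical names introduced for illustration only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for one /proc/pid/cgroup entry.
struct CgroupEntry {
    std::string hierarchy_id;
    std::string controllers;
    std::string path;
};

// Splits "hierarchy-ID:controller-list:cgroup-path" on the first two colons
// only, so any further ':' characters stay part of the path.
bool parse_cgroup_line(const std::string& line, CgroupEntry& out) {
    const std::size_t first = line.find(':');
    if (first == std::string::npos) return false;
    const std::size_t second = line.find(':', first + 1);
    if (second == std::string::npos) return false;
    out.hierarchy_id = line.substr(0, first);
    out.controllers = line.substr(first + 1, second - first - 1);
    out.path = line.substr(second + 1);
    return true;
}

// Treats the cgroup path as relative to the mount point of the hierarchy.
// A path of "/" means the mount point itself.
std::string path_relative_to_mount(const std::string& mount_point,
                                   const std::string& cgroup_path) {
    return cgroup_path == "/" ? mount_point : mount_point + cgroup_path;
}
```

Applied to the example input, `path_relative_to_mount("/sys/fs/cgroup/cpu,cpuacct", "/a/b")` yields the path shown above.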


However, if Java runs in a container, `/proc/self/cgroup` and `/proc/self/mountinfo` are mapped (read-only) from the host, because Docker uses `--cgroupns=host` by default on cgroup v1 hosts. Then `_root` and `cgroup_path` belong to the host and do not exist in the container. In containers, Java must fall back to the `_mount_point` of the corresponding cgroup controller.

When `--cgroupns=private` is used, `_root` and `cgroup_path` are always equal to `/`.

On hosts, the `cgroup_path` should always be appended to the mount point, no matter how it compares to `_root`.

The patch uses the result of `is_containerized()` to select the correct path. It is suggested to change the semantics of `is_read_only()` so that it returns the combined read-only flag for all mounted controllers. Currently the only use of the `_read_only` flag is to determine whether the V1 subsystem `is_containerized()`. The `_read_only` flags are available in advance, before any CgroupV1SubsystemController objects are initialized.
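The selection rule described above can be sketched roughly as follows. This is not the patch itself: `ControllerMount`, `all_read_only` and `select_subsystem_path` are hypothetical names, and treating the "combined" flag as a logical AND over all mounted controllers is an assumption of this sketch.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one mounted cgroup v1 controller.
struct ControllerMount {
    std::string mount_point;
    bool read_only;
};

// Assumption: the combined flag is true only if every mounted controller
// is read-only; an empty list is treated as not read-only.
bool all_read_only(const std::vector<ControllerMount>& mounts) {
    if (mounts.empty()) return false;
    for (const ControllerMount& m : mounts) {
        if (!m.read_only) return false;
    }
    return true;
}

// In a container, the host's cgroup path does not exist, so use the mount
// point alone. On a host, always append the cgroup path to the mount point.
std::string select_subsystem_path(bool containerized,
                                  const std::string& mount_point,
                                  const std::string& cgroup_path) {
    if (containerized || cgroup_path == "/") {
        return mount_point;
    }
    return mount_point + cgroup_path;
}
```

Because the read-only flags come from the mount table, the combined flag can be computed before any path is chosen, which is what allows it to drive the selection.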

The Java side is updated to follow the same logic.

-------------

Commit messages:
 - 8343191: Cgroup v1 subsystem fails to set subsystem path

Changes: https://git.openjdk.org/jdk/pull/21808/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21808&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8343191
  Stats: 229 lines in 10 files changed: 157 ins; 30 del; 42 mod
  Patch: https://git.openjdk.org/jdk/pull/21808.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21808/head:pull/21808

PR: https://git.openjdk.org/jdk/pull/21808

