RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v8]

Sergey Chernyshev schernyshev at openjdk.org
Tue Dec 3 10:09:41 UTC 2024


On Sat, 30 Nov 2024 00:30:02 GMT, Sergey Chernyshev <schernyshev at openjdk.org> wrote:

>> Cgroup V1 subsustem fails to initialize mounted controllers properly in certain cases, that may lead to controllers left undetected/inactive. We observed the behavior in CloudFoundry deployments, it affects also host systems.
>> 
>> The relevant /proc/self/mountinfo line is
>> 
>> 
>> 2207 2196 0:43 /system.slice/garden.service/garden/good/2f57368b-0eda-4e52-64d8-af5c /sys/fs/cgroup/cpu,cpuacct ro,nosuid,nodev,noexec,relatime master:25 - cgroup cgroup rw,cpu,cpuacct
>> 
>> 
>> /proc/self/cgroup:
>> 
>> 
>> 11:cpu,cpuacct:/system.slice/garden.service/garden/bad/2f57368b-0eda-4e52-64d8-af5c
>> 
>> 
>> Here, Java runs inside containerized process that is being moved cgroups due to load balancing.
>> 
>> Let's examine the condition at line 64 here https://github.com/openjdk/jdk/blob/55a7cf14453b6cd1de91362927b2fa63cba400a1/src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp#L59-L72
>> It is always FALSE and the branch is never taken. The issue was spotted earlier by @jerboaa in [JDK-8288019](https://bugs.openjdk.org/browse/JDK-8288019). 
>> 
>> The original logic was intended to find the common prefix of `_root`and `cgroup_path` and concatenate the remaining suffix to the `_mount_point` (lines 67-68). That could lead to the following results: 
>> 
>> Example input
>> 
>> _root = "/a"
>> cgroup_path = "/a/b"
>> _mount_point = "/sys/fs/cgroup/cpu,cpuacct"
>> 
>> 
>> result _path
>> 
>> "/sys/fs/cgroup/cpu,cpuacct/b"
>> 
>> 
>> Here, cgroup_path comes from /proc/self/cgroup 3rd column. The man page (https://man7.org/linux/man-pages/man7/cgroups.7.html#NOTES) for control groups states:
>> 
>> 
>> ...
>>        /proc/pid/cgroup (since Linux 2.6.24)
>>               This file describes control groups to which the process
>>               with the corresponding PID belongs.  The displayed
>>               information differs for cgroups version 1 and version 2
>>               hierarchies.
>>               For each cgroup hierarchy of which the process is a
>>               member, there is one entry containing three colon-
>>               separated fields:
>> 
>>                   hierarchy-ID:controller-list:cgroup-path
>> 
>>               For example:
>> 
>>                   5:cpuacct,cpu,cpuset:/daemons
>> ...
>>               [3]  This field contains the pathname of the control group
>>                    in the hierarchy to which the process belongs. This
>>                    pathname is relative to the mount point of the
>>                    hierarchy.
>> 
>> 
>> This explicitly states the "pathname is relative t...
>
> Sergey Chernyshev has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - update cgroup v1 in metrics
>  - Apply suggestions from code review
>    
>    Co-authored-by: Severin Gehwolf <jerboaa at gmail.com>
>  - updated test (path is reduced)

Here's the summary of the latest state of the PR. The updated code
- special cases the condition when `_root` is `/`, and `cgroup_path` includes `../`. The condition appears in containers when a process's cgroup is moved to a supergroup. Then the cgroup files are mapped inside the container at the controller's mount point. The reason for this - path adjustment will always fail with cgroup path prefix `../`;
- calls `stat()` on cgroup path to make sure the directory exists - only when `_root` != `/` and `_root` != `cgroup_path`. This issue is specific to cgroup v1 containers, where /proc/self/cgroup is from host, cgroup files are mapped at controller's mount point, the mapping may include subgroups that need to be walked through to locate the smallest limits ;
- sets Java Metrics in alignment with hotspot logic ;
- fixes an NPE in Java Metrics ;
- fixes an uninitialized path issue in hotspot / cgroup v1 subsystem when `_root` != `/` and `_root` != `cgroup_path`;
- fixes a logical error with `lowest_limit` in the path adjustment in `CgroupUtil::adjust_controller()` methods ;
- adds container tests on subgroups in hotspot and metrics.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2514099036


More information about the core-libs-dev mailing list