RFR: 8349988: Change cgroup version detection logic to not depend on /proc/cgroups

Severin Gehwolf sgehwolf at openjdk.org
Fri Feb 28 10:23:57 UTC 2025


On Fri, 28 Feb 2025 07:43:25 GMT, Thomas Fitzsimmons <duke at openjdk.org> wrote:

>> src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 360:
>> 
>>> 358:     for (int i = 0; i < CG_INFO_LENGTH; i++) {
>>> 359:       // pids and cpuset controllers are optional. All other controllers are required
>>> 360:       if (i != PIDS_IDX && i != CPUSET_IDX) {
>> 
>> Same comment for `cpuset` controller. Keep it required for cg v1.
>
> Hmm, this logic was there before my changes; I am pretty sure I did keep `cpuset` required for `cgroups v1`.
> 
> To attempt to confirm my patch had not changed the `cgroups v1` `cpuset` optionality logic, I wrote a test case for "cgroups v1, cpuset not present in /proc/cgroups".
> 
> (To avoid assertion failures I created fake `/proc/self/cgroup` and `/proc/self/mountinfo` files that also do not have `cpuset` lines.)
> 
> I adapted the test case to before my patch:
> 
> https://github.com/fitzsim/jdk/commit/cab0000598b522d38865452575f95ed368b5935b
> 
> and after my patch:
> 
> https://github.com/fitzsim/jdk/commit/d8537baed5d56b5f36ce628b8c180ce18ce6f6e6
> 
> and it passes in both cases.  i.e., for `cgroups v1`, `cpuset` is required, so `determine_type` produces `INVALID_CGROUPS_V1 (4)`, from this logic:
> 
> 
>   if (!cg_infos[CPUSET_IDX]._data_complete) {
>     log_debug(os, container)("Required cgroup v1 cpuset subsystem not found");
>     cleanup(cg_infos);
>     *flags = INVALID_CGROUPS_V1;
>     return false;
>   }
> 
> 
> The test fails in both cases if I remove ` && i != CPUSET_IDX`, with  `determine_type` returning `INVALID_CGROUPS_GENERIC (6)`:
> 
> 
> Required cpuset controller missing in /proc/cgroups. Invalid. expected: 4 but was: 6
> 
> 
> due to this block (which my patch left alone):
> 
> 
>   if (!all_required_controllers_enabled) {
>     // one or more required controllers disabled, disable container support
>     log_debug(os, container)("One or more required controllers disabled at kernel level.");
>     cleanup(cg_infos);
>     *flags = INVALID_CGROUPS_GENERIC;
>     return false;
>   }
> 
> 
> No existing test checks this logic as far as I can tell, so if you think the above makes sense, I can add this test to this pull request.

You are right, that if any of the required controllers aren't enabled at the kernel level we fail with `NVALID_CGROUPS_GENERIC`. However, the `if` condition is within the cgroups v1 branch while it used to be outside a version specific branch.

Also note that `/proc/cgroup` containing (last digit `0`, indicating the enabled flag):


cpuset  3   1   0\n


... is semantically equivalent to it being missing entirely from `proc/cgroup`. But keeping `if (i != PIDS_IDX && i != CPUSET_IDX) {` above, would keep `all_required_controllers_enabled == true` which is not correct. Yes, we should keep/add a test like you suggest, but amend the patch to something like this: https://github.com/jerboaa/jdk/commit/26f765db9fef6f1d7be79452da701987274117c5

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23811#discussion_r1975171365


More information about the hotspot-dev mailing list