RFR: 8347129: cpuset cgroups controller is required for no good reason

Severin Gehwolf sgehwolf at openjdk.org
Tue Jan 14 17:00:41 UTC 2025


On Tue, 14 Jan 2025 15:09:26 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

>> Please review this small change to make the `cpuset` cgroups controller optional as far as the JDK is concerned. The rationale is that:
>> 
>> - Some distributions now don't have it enabled by default (e.g. Fedora 41), which breaks automatic container detection logic.
>> - CPU limits enforced with `--cpuset-cpus` are reflected by the `sched_getaffinity` system call and, thus, the controller doesn't need to be mandatory. The current failure to detect the controller results in no container limits being detected, which is bad.
>> 
>> The fix is rather simple. Make the `cpuset` controller look-up from `/proc/cgroups` optional (like the `pids` controller) and continue. `OSContainer::active_processor_count()` still behaves as before (should there be a `cpuset` limit) as it's covered by `os::Linux::active_processor_count()`, which serves as the upper bound of other container CPU limits.
>> 
>> While at it, I've also fixed the logging bug by re-arranging the `const char*` values. `cpuset` is index `0`, `cpu` is index `1`.
>> 
>> Testing:
>> - [x] GHA
>> - [x] Linux container tests in `test/hotspot/jtreg/containers` on Linux cgroups v1 and cgroups v2 as well as on an affected system (F41 on cg v2). All pass.
>> 
>> Thoughts?
>
> Thanks for the review!

> @jerboaa Should this block be updated to not return false, as the cpuset is now an optional subsystem?
> 
> https://github.com/openjdk/jdk/blob/56c780078f84a2571b779d90f528d5bcab2a9dfd/src/hotspot/os/linux/cgroupSubsystem_linux.cpp#L498-L503

@ashu-mehra Thanks for looking at it!

I'm in two minds about this. Why? a) cgroups v1 is increasingly legacy, and b) it seems unlikely that an old v1 system would stop enabling the cpuset controller. So while it would be consistent to remove this for cg v1 too, it's unlikely that such a system exists. For example, Fedora 41, where this has been observed, no longer supports cg v1. I'd expect other distros to follow. New distros will likely only support cg v2.

For those reasons I'd be inclined to keep the patch as-is, as that would be less risky for the cg v1 code path.
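
To make the trade-off concrete, here is a minimal sketch of the resulting behaviour. It is a simplified, hypothetical model and not the HotSpot code: the names `CgroupVersion`, `is_required` and `container_detection_possible` are made up for illustration, and the controller list is reduced to `cpuset`, `cpu`, `memory` and `pids`. It assumes that a missing `cpuset` controller is tolerated on cg v2 but still disables detection on cg v1.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical, simplified model of the discussion; not the HotSpot implementation.
enum class CgroupVersion { V1, V2 };

// Returns whether a missing controller should disable container detection.
inline bool is_required(const std::string& controller, CgroupVersion version) {
    if (controller == "pids") {
        return false;  // pids is already optional
    }
    if (controller == "cpuset") {
        // Assumption: optional on cg v2, still required on cg v1 (patch kept as-is).
        return version == CgroupVersion::V1;
    }
    return true;  // cpu and memory stay mandatory in this model
}

// 'enabled' lists the controllers found on the system.
inline bool container_detection_possible(const std::vector<std::string>& enabled,
                                         CgroupVersion version) {
    for (const char* name : {"cpuset", "cpu", "memory", "pids"}) {
        const bool present =
            std::find(enabled.begin(), enabled.end(), name) != enabled.end();
        if (!present && is_required(name, version)) {
            return false;  // a required controller is missing
        }
    }
    return true;
}
```

With this model, a cg v2 system that only enables `cpu` and `memory` (the Fedora 41 case) still gets container detection, while a hypothetical cg v1 system without `cpuset` would not.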

Thoughts?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23037#issuecomment-2590568050


More information about the hotspot-runtime-dev mailing list