RFR: 8245543: Cgroups: Incorrect detection logic on some systems (still reproducible) [v6]
Aleksey Shipilev
shade at openjdk.java.net
Fri Oct 9 09:27:20 UTC 2020
On Thu, 8 Oct 2020 13:38:59 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:
>> An issue similar to [JDK-8239559](https://bugs.openjdk.java.net/browse/JDK-8239559) has been discovered. On the
>> affected system, `/proc/self/mountinfo` contains a line such as this one:
>>
>> 35 26 0:26 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup systemd rw,name=systemd
>>
>>
>> Note that `/proc/cgroups` looks like this:
>>
>> #subsys_name hierarchy num_cgroups enabled
>> cpuset 0 1 1
>> cpu 0 1 1
>> cpuacct 0 1 1
>> blkio 0 1 1
>> memory 0 1 1
>> devices 0 1 1
>> freezer 0 1 1
>> net_cls 0 1 1
>> perf_event 0 1 1
>> net_prio 0 1 1
>> hugetlb 0 1 1
>> pids 0 1 1
>>
>> This is in fact a cgroups v1 system with systemd put into a separate namespace via FS type `cgroup`. That mountinfo
>> line confuses the cgroups version detection heuristic. It only looked whether or not there is **any** entry in
>> mountinfo of FS-type `cgroup` . That's wrong. A better heuristic would be looking at controllers we care about: `cpu`,
>> `cpuacct`, `memory`, `cpuset` for hotspot and some more for the Java side. The proposed fix on the hotspot side is to
>> set `any_cgroup_mounts_found` to true only if one of those controllers show up in mountinfo. Otherwise, we know it's
>> cgroup v2 due to the zero hierarchy ids. The fix on the Java side is similar. For the Java side the proposal is also
>> to do the parsing of the cgroup files in the factory now (single pass of files). No longer in the version specific
>> objects. In order to not regress here, I've added more tests.
>> **Testing**
>>
>> - [x] Linux x86_64 container tests on cgroup v1 (fastdebug)
>> - [x] Linux x86_64 container tests on cgroup v2 (fastdebug)
>> - [x] Added regression test which fails prior the patch and passes after
>> - [x] Submit testing
>
> Severin Gehwolf has updated the pull request incrementally with one additional commit since the last revision:
>
> Return false in case of no pattern match
This looks fine to me. Consider a minor nit below.
src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java line 183:
> 181: }
> 182: }
> 183: return relevantControllerFound;
If I am reading the code right, then `relevantControllerFound` can be replaced with just `return true` or `return
false`?
-------------
Marked as reviewed by shade (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/485
More information about the serviceability-dev
mailing list