RFR: 8245543: Cgroups: Incorrect detection logic on some systems (still reproducible) [v5]
Severin Gehwolf
sgehwolf at openjdk.java.net
Thu Oct 8 12:51:54 UTC 2020
> An issue similar to [JDK-8239559](https://bugs.openjdk.java.net/browse/JDK-8239559) has been discovered. On the
> affected system, `/proc/self/mountinfo` contains a line such as this one:
>
> 35 26 0:26 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime - cgroup systemd rw,name=systemd
>
>
> Note that `/proc/cgroups` looks like this:
>
> #subsys_name hierarchy num_cgroups enabled
> cpuset 0 1 1
> cpu 0 1 1
> cpuacct 0 1 1
> blkio 0 1 1
> memory 0 1 1
> devices 0 1 1
> freezer 0 1 1
> net_cls 0 1 1
> perf_event 0 1 1
> net_prio 0 1 1
> hugetlb 0 1 1
> pids 0 1 1
>
> This is in fact a cgroups v1 system with systemd put into a separate namespace via FS type `cgroup`. That mountinfo
> line confuses the cgroups version detection heuristic. It only looked whether or not there is **any** entry in
> mountinfo of FS-type `cgroup` . That's wrong. A better heuristic would be looking at controllers we care about: `cpu`,
> `cpuacct`, `memory`, `cpuset` for hotspot and some more for the Java side. The proposed fix on the hotspot side is to
> set `any_cgroup_mounts_found` to true only if one of those controllers show up in mountinfo. Otherwise, we know it's
> cgroup v2 due to the zero hierarchy ids. The fix on the Java side is similar. For the Java side the proposal is also
> to do the parsing of the cgroup files in the factory now (single pass of files). No longer in the version specific
> objects. In order to not regress here, I've added more tests.
> **Testing**
>
> - [x] Linux x86_64 container tests on cgroup v1 (fastdebug)
> - [x] Linux x86_64 container tests on cgroup v2 (fastdebug)
> - [x] Added regression test which fails prior the patch and passes after
> - [x] Submit testing
Severin Gehwolf has refreshed the contents of this pull request, and previous commits have been removed. The
incremental views will show differences compared to the previous content of the PR. The pull request contains one new
commit since the last revision:
8245543: Cgroups: Incorrect detection logic on some systems (still reproducible)
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/485/files
- new: https://git.openjdk.java.net/jdk/pull/485/files/5e2ca2dc..e72ac219
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=485&range=04
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=485&range=03-04
Stats: 514 lines in 5 files changed: 159 ins; 231 del; 124 mod
Patch: https://git.openjdk.java.net/jdk/pull/485.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/485/head:pull/485
PR: https://git.openjdk.java.net/jdk/pull/485
More information about the core-libs-dev
mailing list