RFR: 8239559: Cgroups: Incorrect detection logic on some systems

Severin Gehwolf sgehwolf at redhat.com
Tue Feb 25 14:17:57 UTC 2020


Hi Matthias,

On Tue, 2020-02-25 at 14:01 +0000, Baesken, Matthias wrote:
> Hi Severin,   8239559/02   looks  generally good .

Thanks for the review!

> However I wonder about this :
> 
> I have a SLES15 aarch64 system with these settings :
> 
>  more /proc/cgroups 
> #subsys_name	hierarchy	num_cgroups	enabled
> cpuset	6	1	1
> cpu	8	1	1
> cpuacct	8	1	1
> blkio	5	1	1
> memory	9	1	1
> devices	2	91	1
> freezer	4	1	1
> net_cls	3	1	1
> perf_event	11	1	1
> net_prio	3	1	1
> hugetlb	10	1	1
> pids	12	105	1
> rdma	7	1	1
> 
> 
> (so no hierarchy  0 )
> 
> However these information indicates the cgroup2 is supported :
> 
> fgrep cgroup2 /proc/self/mountinfo
> 29 28 0:25 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:5 - cgroup2 cgroup rw
> 
> grep cgroup /proc/filesystems
> nodev	cgroup
> nodev	cgroup2

What you are seeing here is a hybrid system[1]. My workstation is
hybrid as well. Legacy and hybrid systems are being detected as cgroup
v1. A hybrid systems mostly behaves like a cgroup v1 system. All the
controllers are mounted via cgroup v1. If if was detected as cgroup v2
(unified) most of the imposed memory/cpu limits wouldn't be accounted
for.

> 
> But the comment    in
> 
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239559/02/webrev/src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java.frames.html
> 
> 
> says 
> 
>  102         // For cgroups v2 all controllers need to have zero hierarchy id
>  103         // and /proc/self/mountinfo needs to have at least one cgroup filesystem
>  104         // mounted.
> 
> 
> Should this comment be adjusted ?
> On the system above  we  have no 0  in  /proc/cgroups   however  it seems to me the system supports  cgroug v2 ?

The comment is still correct. There is a difference between a true
cgroup v2 only system and a hybrid system like the one you are seing.
We should only detect true cgroup v2 systems as such.

Does that make sense?

Thanks,
Severin

[1] https://lists.freedesktop.org/archives/systemd-devel/2017-November/039751.html

> 
> Best regards, Matthias
> 
> 
> 
> > > I still need a *R*eviewer. Matthias, would you be willing to?
> > 
> > Hi Severin,   I can look into it tomorrow  .
> > 
> > Best regards, Matthias
> > 
> > 
> > 
> > > On Mon, 2020-02-24 at 10:28 -0500, Bob Vandette wrote:
> > > > > > If you don’t have access to the information required to get
> > > > > > metrics, I
> > > just assumed that
> > > > > > you would return NULL in CgroupSubsystemFactory.create()
> > > > > > rather
> > > than making the
> > > > > > assumption that it works only to fail later.
> > > > > 
> > > > > You are right. It makes little sense to continue in that
> > > > > case. Updated
> > > > > webrev:
> > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-
> > > 8239559/02/webrev/
> > > > Looks good.
> > > 
> > > Thanks for the review.
> > > 
> > > I still need a *R*eviewer. Matthias, would you be willing to?
> > > 
> > > Thanks,
> > > Severin
> > > 



More information about the core-libs-dev mailing list