RFR: 8239559: Cgroups: Incorrect detection logic on some systems
Severin Gehwolf
sgehwolf at redhat.com
Mon Feb 24 15:42:44 UTC 2020
On Mon, 2020-02-24 at 10:28 -0500, Bob Vandette wrote:
> > > If you don’t have access to the information required to get metrics, I just assumed that
> > > you would return NULL in CgroupSubsystemFactory.create() rather than making the
> > > assumption that it works only to fail later.
> >
> > You are right. It makes little sense to continue in that case. Updated
> > webrev:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239559/02/webrev/
>
> Looks good.
Thanks for the review.
I still need a *R*eviewer. Matthias, would you be willing to?
Thanks,
Severin
> > > Alternatively, we could consider assuming the mount point is /sys/fs/cgroup for cgroupv1 in
> > > the case you are trying to support. This would involve using /proc/self/cgroup to get the list
> > > of controllers and then use that list to call createSubSystemController and
> > > setSubSystemControllerPath with the default path.
> > >
> > > I think we need to understand the extent of the problem on these older systems before
> > > deciding a course of action. Do we see the same empty mountinfo file in a docker container
> > > running on these older systems or is this just a host issue? If docker containers work fine, then
> > > I wouldn’t bother trying to make this work.
> >
> > That's the thing. I don't think any of those older systems support
> > docker in the first place.
>
> If that’s the case then you are doing the right thing.
>
> Bob.
>
>
> > Thanks,
> > Severin
> >
> > > Bob.
> > >
> > > > > On Thu, 2020-02-20 at 14:50 +0000, Baesken, Matthias wrote:
> > > > > > Hi Severin,
> > > > > >
> > > > > > grep cgroup /proc/self/mountinfo
> > > > > >
> > > > > > returns nothing.
> > > > > >
> > > > > > Best Regards, Matthias
> > > > > >
> > > > >
> > > > > Assuming your fix is correct, don’t we also need to apply the same change to the hotspot source cgroupSubsystem_linux.cpp?
> > > >
> > > > Yes, tracked with JDK-8239785.
> > > >
> > > > Thanks,
> > > > Severin
> > > >
> > > > > Bob.
> > > > >
> > > > >
> > > > > > On Feb 21, 2020, at 8:32 AM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Could I please get a review of this fix to the detection heuristic of
> > > > > > cgroup v1 vs cgroup v2? Matthias (in CC) discovered that on some old
> > > > > > systems the JDK Metrics code throws InternalError caused by wrong
> > > > > > detection logic when Metrics are being created on Linux.
> > > > > >
> > > > > > > The reason for this is that hierarchy IDs of 0 in /proc/cgroups are
> > > > > > > being used as a heuristic to detect cgroups v2 systems. Apparently some
> > > > > > old systems like RHEL 6 and SLES 11 have no cgroups controllers
> > > > > > mounted, thus, triggering a false positive.
> > > > > >
> > > > > > > The fix is to also look at /proc/self/mountinfo and correct the logic in
> > > > > > > this case.
> > > > > >
> > > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8239559
> > > > > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239559/01/webrev/
> > > > > >
> > > > > > Testing: docker/cgroups tests on hybrid (cgroups v1) and unified
> > > > > > hierarchy (cgroups v2). New regression test. Looks good here.
> > > > > >
> > > > > > Unfortunately, I wasn't able to reproduce this on an actual affected
> > > > > > system. I somewhat reproduced via the derived regression test based on
> > > > > > data from reporters. I'd appreciate any testing on systems where this
> > > > > > reproduces.
> > > > > >
> > > > > > Thanks,
> > > > > > Severin
> > > > > >