RFR: 8239559: Cgroups v2: Incorrect detection logic on some systems

Bob Vandette bob.vandette at oracle.com
Fri Feb 21 15:12:16 UTC 2020


> On Feb 21, 2020, at 9:30 AM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
> 
> Hi Bob,
> 
> On Fri, 2020-02-21 at 09:11 -0500, Bob Vandette wrote:
>> Severin,
>> 
>> Don’t we need the contents of /proc/self/mountinfo in order to construct the path to the cgroup controllers?
> 
> There is only one for unified (cgroups v2), but yes it's beeing used.
> See CgroupV2Subsystem.initSubsystem() and
> CgroupV1Subsystem.initSubsystem(). For affected systems, no controllers
> are mounted, so the effect will be null Metrics, as before JDK-8231111. 
> Maybe I didn't understand the question, sorry.

If you don’t have access to the information required to get metrics, I just assumed that
you would return NULL in CgroupSubsystemFactory.create() rather than making the
assumption that it works only to fail later.

Alternatively, we could consider assuming the mount point is /sys/fs/cgroup for cgroupv1 in
the case you are trying to support.  This would involve using /proc/self/cgroup to get the list
of controllers and then use that list to call createSubSystemController and
setSubSystemControllerPath with the default path.

I think we need to understand the extent of the problem on these older systems before 
deciding a course of action.  Do we see the same empty mountinfo file in a docker container
running on these older systems or is this just a host issue?  If docker containers work fine, then
I wouldn’t bother trying to make this work.

Bob.

> 
>> On Thu, 2020-02-20 at 14:50 +0000, Baesken, Matthias wrote:
>>> Hi  Severin,
>>> 
>>> grep cgroup /proc/self/mountinfo 
>>> 
>>> returns  nothing.
>>> 
>>> Best Regards, Matthias
>>> 
>> 
>> Assuming your fix is correct, don’t we also need to apply the same change to the hotspot source cgroupSubsystem_linux.cpp?
> 
> Yes, tracked with JDK-8239785.
> 
> Thanks,
> Severin
> 
>> Bob;
>> 
>> 
>>> On Feb 21, 2020, at 8:32 AM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Could I please get a review of this fix to the detection heuristic of
>>> cgroup v1 vs cgroup v2? Matthias (in CC) discovered that on some old
>>> systems the JDK Metrics code throws InternalError caused by wrong
>>> detection logic when Metrics are being created on Linux.
>>> 
>>> The reason for this is that hierarchy IDs of 0 in /proc/cgroups is
>>> being used as a heuristic to detect cgroups v2 systems. Apparently some
>>> old systems like RHEL 6 and SLES 11 have no cgroups controllers
>>> mounted, thus, triggering a false positive.
>>> 
>>> The fix is to also look at /proc/self/mountinfo and correct logic in
>>> this case.
>>> 
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8239559
>>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239559/01/webrev/
>>> 
>>> Testing: docker/cgroups tests on hybrid (cgroups v1) and unified
>>> hierarchy (cgroups v2). New regression test. Looks good here.
>>> 
>>> Unfortunately, I wasn't able to reproduce this on an actual affected
>>> system. I somewhat reproduced via the derived regression test based on
>>> data from reporters. I'd appreciate any testing on systems where this
>>> reproduces.
>>> 
>>> Thanks,
>>> Severin
>>> 
> 



More information about the core-libs-dev mailing list