RFR(S): 8239785: Cgroups: Incorrect detection logic on old systems in hotspot

Bob Vandette bob.vandette at oracle.com
Wed Mar 4 20:39:30 UTC 2020


> On Mar 4, 2020, at 1:59 PM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
> 
> Hi Bob,
> 
> Updated webrev:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/05/webrev/
> 
> Testing: hotspot docker tests on Linux fastdebug/release cgroup v1 and
> cgroup v2. New regression test. jdk submit. All pass.
> 
> On Fri, 2020-02-28 at 15:08 -0500, Bob Vandette wrote:
>> It still looks like the _mount_path can be corrupted if both cgroupv1 and cgroupv2 are mounted.
>> You’ll only initialize the _mount_path if cgroupv2 is detected in the cgroups file but if cgroupv1
>> is also mounted, you’ll still clobber the _mount_path.
> 
> How so? When we start processing /proc/self/mountinfo on line 170, then
> we have already looked at /proc/cgroups, thus is_cgroupsV2 is set to
> true (A) or false (B).
> 
> Cases A:
> 1. We are on an old system with *no* cgroup controllers (v1 or v2)
>    mounted (this bug). We are fine to continue as no cgroups entry
>    will be found in /proc/self/mountinfo. Thus, no clobbering.
> 2. We are on a true unified hierarchy only system. In that case
>    the sscanf for v1 would match (early match), but 'memory' et.al
>    controllers would not be found. Thus, no clobbering.
> 
> Cases B:
> 1. Hybrid hierarchy: controllers mounted via cgroup v1, but
>    cgroup v2 mounted in addition for system use. is_cgroupV2 will
>    prevent the cgroup v2 controller path lookup logic to be entered.
>    No clobbering.
> 2. cgroup v1 only hierarchy (legacy): Same logic applies. V2 setting
>    of mount paths won't be entered since is_cgroupV2 == false.
>    No clobbering.
> 
> What am I missing? FWIW, I've played around with the idea of keeping
> cgroup v1 and cgroup v2 mount paths separate[1]. If that's preferred,
> happy to go this route.

The case that I think is a problem is …

cgroupv2 is detected in /proc/cgroups so is_cgroupv2 is true
both cgroupv1 and cgroupv2 devices are mounted

At line cgroupSubsystem_linux.cpp:292 in your new webrev, you process the cgroupv1
devices and clobber _mount_path.

What keeps this from occurring?

> 
> While thinking about this issue here, I've decided to actually write a
> test (using WhiteBox) so that all these cases are covered and will
> continue to work. Running the test, CgroupSubsystemFactory, with
> fastdebug proves that there is no clobbering if that matters. It's also
> a way to reproduce the issue. So we have a regression test now too
> (testCgroupv1NoMounts). Line 325-332 in [2] is the actual fix. Without
> it, the test fails.
> 
>> Maybe you should just do the cgroupv2 processing if /proc/cgroups has
>> cgroupv2 enabled and if that fails continue looking for cgroup v1
>> mounts.
> 
> FWIW, I've amended the cgroup v1 logic to also check for the file
> system type. Which needs to be 'cgroup' (not 'cgroup2') for cgroup v1.
> This isn't strictly needed but more accurately reflects the intent IMO.

Sounds good.

> 
>> 
>> Do you really need to do this?  The BSS section should be zeroed.
>> 
>> 55   for (int i = 0; i < CG_INFO_LENGTH; i++) {
>>  56     cg_infos[i]._name = NULL;
>>  57     cg_infos[i]._cgroup_path = NULL;
>>  58     cg_infos[i]._root_mount_path = NULL;
>>  59     cg_infos[i]._mount_path = NULL;
>>  60   }
> 
> I believe so. CgroupInfo array is stack allocated and we should
> initialize values properly. On the other hand, we should do that in the
> constructor. Done now.

You are correct.  I was only looking at the diffs and not the full source.

> 
>> You don’t need to check for NULL in cleanup, os::free can handle NULL
>> pointers.
>> 
>> 339 void CgroupSubsystemFactory::cleanup(CgroupInfo* cg_infos) {
>> 340   assert(cg_infos != NULL, "Invariant");
>> 341   for (int i = 0; i < CG_INFO_LENGTH; i++) {
>> 342     if (cg_infos[i]._name != NULL) {
>> 343       os::free(cg_infos[i]._name);
>> 344     }
>> 345     if (cg_infos[i]._cgroup_path != NULL) {
>> 346       os::free(cg_infos[i]._cgroup_path);
>> 347     }
>> 348     if (cg_infos[i]._root_mount_path != NULL) {
>> 349       os::free(cg_infos[i]._root_mount_path);
>> 350     }
>> 351     if (cg_infos[i]._mount_path != NULL) {
>> 352       os::free(cg_infos[i]._mount_path);
>> 353     }
>> 354   }
>> 355 }
> 
> Fixed.
> 
> Let me know what you think about the updated webrev.
> 
> Thanks,
> Severin
> 
> [1] http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/02/webrev/
> [2] http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/05/webrev/src/hotspot/os/linux/cgroupSubsystem_linux.cpp.html
> 
>> Bob.
>> 
>> 
>> 
>>> On Feb 28, 2020, at 11:42 AM, Severin Gehwolf <sgehwolf at redhat.com>
>>> wrote:
>>> 
>>> Hi Bob,
>>> 
>>> Thanks for the review!
>>> 
>>> On Tue, 2020-02-25 at 15:08 -0500, Bob Vandette wrote:
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/01/webrev/src/hotspot/os/linux/cgroupSubsystem_linux.cpp.sdiff.html
>>>> 
>>>> If both cgroupv1 and v2 controllers are mounted, you might have
>>>> some problems.  You might want to only look for
>>>> cgroup2 mounts if is_cgroupV2 is true otherwise only look for
>>>> cgroup mounts.
>>> 
>>> OK.
>>> 
>>>> 194           cg_infos[i]._mount_path =
>>>> os::strdup(tmp_mount_point);
>>>> This will stomp the pointer:
>>>> 213           cg_infos[memory_idx]._mount_path =
>>>> os::strdup(tmpmount);
>>> 
>>> You are right. If we are on a hybrid hierarchy and for some reason
>>> the
>>> cgroup v2 mount shows up *after* cgroup mounts in
>>> /proc/self/mountinfo,
>>> we'd get the mount path wrong. So ordering matters! The reason this
>>> worked for me on hybrid is that cgroup2 mounts come *before* cgroup
>>> mounts on the system I've tested.
>>> 
>>>> You should free all strdup’d pointers in cg_infos in places where
>>>> you fail and return NULL.
>>>> 
>>>> 81       cg_infos[memory_idx]._name = os::strdup(name);
>>>> 213           cg_infos[memory_idx]._mount_path =
>>>> os::strdup(tmpmount);
>>>> 214           cg_infos[memory_idx]._root_mount_path =
>>>> os::strdup(tmproot);
>>> 
>>> Done now. More thoughts?
>>> 
>>> Updated webrev: 
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/03/webrev/
>>> 
>>> Thanks,
>>> Severin
>>> 
>>>>> On Feb 25, 2020, at 2:22 PM, Severin Gehwolf <
>>>>> sgehwolf at redhat.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Could I please get reviews of this cgroup-related patch? After
>>>>> JDK-
>>>>> 8230305 old systems with no mounted cgroup controllers would
>>>>> get
>>>>> detected as cgroups v2 (wrong). Then, when initializing the v2
>>>>> subsystem, it would fail. The trace about cgroupv2 mount point
>>>>> not
>>>>> found is misleading in this case. While the outcome is similar
>>>>> pre/post
>>>>> patch (NULL cgroup subsystem), I'd like to be explicit about
>>>>> this case.
>>>>> 
>>>>> The suggested fix is to look at /proc/self/mountinfo in order
>>>>> to
>>>>> correctly detect whether this is cgroup v2 or cgroup v1 with no
>>>>> mounted
>>>>> controllers. In the latter case we just stop initialization as
>>>>> we'd
>>>>> fail later in cgroupv1 code anyway. This aligns hotspot code
>>>>> with core-
>>>>> libs after JDK-8239559.
>>>>> 
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8239785
>>>>> webrev: 
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/01/webrev/
>>>>> 
>>>>> Testing: jdk-submit, hotspot docker tests on cgroup v1 and
>>>>> cgroup v2. All pass.
>>>>> 
>>>>> Thoughts?
>>>>> 
>>>>> Thanks,
>>>>> Severin
>>>>> 
>> 
>> 
> 
> 
> 
> 



More information about the hotspot-dev mailing list