RFR(S): 8239785: Cgroups: Incorrect detection logic on old systems in hotspot
Bob Vandette
bob.vandette at oracle.com
Wed Mar 4 20:39:30 UTC 2020
> On Mar 4, 2020, at 1:59 PM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
>
> Hi Bob,
>
> Updated webrev:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/05/webrev/
>
> Testing: hotspot docker tests on Linux fastdebug/release cgroup v1 and
> cgroup v2. New regression test. jdk submit. All pass.
>
> On Fri, 2020-02-28 at 15:08 -0500, Bob Vandette wrote:
>> It still looks like the _mount_path can be corrupted if both cgroupv1 and cgroupv2 are mounted.
>> You’ll only initialize the _mount_path if cgroupv2 is detected in the cgroups file but if cgroupv1
>> is also mounted, you’ll still clobber the _mount_path.
>
> How so? When we start processing /proc/self/mountinfo on line 170, then
> we have already looked at /proc/cgroups, thus is_cgroupsV2 is set to
> true (A) or false (B).
>
> Cases A:
> 1. We are on an old system with *no* cgroup controllers (v1 or v2)
> mounted (this bug). We are fine to continue as no cgroups entry
> will be found in /proc/self/mountinfo. Thus, no clobbering.
> 2. We are on a true unified hierarchy only system. In that case
> the sscanf for v1 would match (early match), but 'memory' et.al
> controllers would not be found. Thus, no clobbering.
>
> Cases B:
> 1. Hybrid hierarchy: controllers mounted via cgroup v1, but
> cgroup v2 mounted in addition for system use. is_cgroupV2 will
> prevent the cgroup v2 controller path lookup logic to be entered.
> No clobbering.
> 2. cgroup v1 only hierarchy (legacy): Same logic applies. V2 setting
> of mount paths won't be entered since is_cgroupV2 == false.
> No clobbering.
>
> What am I missing? FWIW, I've played around with the idea of keeping
> cgroup v1 and cgroup v2 mount paths separate[1]. If that's preferred,
> happy to go this route.
The case that I think is a problem is …
cgroupv2 is detected in /proc/cgroups so is_cgroupv2 is true
both cgroupv1 and cgroupv2 devices are mounted
At line cgroupSubsystem_linux.cpp:292 in your new webrev, you process the cgroupv1
devices and clobber _mount_path.
What keeps this from occurring?
>
> While thinking about this issue here, I've decided to actually write a
> test (using WhiteBox) so that all these cases are covered and will
> continue to work. Running the test, CgroupSubsystemFactory, with
> fastdebug proves that there is no clobbering if that matters. It's also
> a way to reproduce the issue. So we have a regression test now too
> (testCgroupv1NoMounts). Line 325-332 in [2] is the actual fix. Without
> it, the test fails.
>
>> Maybe you should just do the cgroupv2 processing if /proc/cgroups has
>> cgroupv2 enabled and if that fails continue looking for cgroup v1
>> mounts.
>
> FWIW, I've amended the cgroup v1 logic to also check for the file
> system type. Which needs to be 'cgroup' (not 'cgroup2') for cgroup v1.
> This isn't strictly needed but more accurately reflects the intent IMO.
Sounds good.
>
>>
>> Do you really need to do this? The BSS section should be zeroed.
>>
>> 55 for (int i = 0; i < CG_INFO_LENGTH; i++) {
>> 56 cg_infos[i]._name = NULL;
>> 57 cg_infos[i]._cgroup_path = NULL;
>> 58 cg_infos[i]._root_mount_path = NULL;
>> 59 cg_infos[i]._mount_path = NULL;
>> 60 }
>
> I believe so. CgroupInfo array is stack allocated and we should
> initialize values properly. On the other hand, we should do that in the
> constructor. Done now.
You are correct. I was only looking at the diffs and not the full source.
>
>> You don’t need to check for NULL in cleanup, os::free can handle NULL
>> pointers.
>>
>> 339 void CgroupSubsystemFactory::cleanup(CgroupInfo* cg_infos) {
>> 340 assert(cg_infos != NULL, "Invariant");
>> 341 for (int i = 0; i < CG_INFO_LENGTH; i++) {
>> 342 if (cg_infos[i]._name != NULL) {
>> 343 os::free(cg_infos[i]._name);
>> 344 }
>> 345 if (cg_infos[i]._cgroup_path != NULL) {
>> 346 os::free(cg_infos[i]._cgroup_path);
>> 347 }
>> 348 if (cg_infos[i]._root_mount_path != NULL) {
>> 349 os::free(cg_infos[i]._root_mount_path);
>> 350 }
>> 351 if (cg_infos[i]._mount_path != NULL) {
>> 352 os::free(cg_infos[i]._mount_path);
>> 353 }
>> 354 }
>> 355 }
>
> Fixed.
>
> Let me know what you think about the updated webrev.
>
> Thanks,
> Severin
>
> [1] http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/02/webrev/
> [2] http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/05/webrev/src/hotspot/os/linux/cgroupSubsystem_linux.cpp.html
>
>> Bob.
>>
>>
>>
>>> On Feb 28, 2020, at 11:42 AM, Severin Gehwolf <sgehwolf at redhat.com>
>>> wrote:
>>>
>>> Hi Bob,
>>>
>>> Thanks for the review!
>>>
>>> On Tue, 2020-02-25 at 15:08 -0500, Bob Vandette wrote:
>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/01/webrev/src/hotspot/os/linux/cgroupSubsystem_linux.cpp.sdiff.html
>>>>
>>>> If both cgroupv1 and v2 controllers are mounted, you might have
>>>> some problems. You might want to only look for
>>>> cgroup2 mounts if is_cgroupV2 is true otherwise only look for
>>>> cgroup mounts.
>>>
>>> OK.
>>>
>>>> 194 cg_infos[i]._mount_path =
>>>> os::strdup(tmp_mount_point);
>>>> This will stomp the pointer:
>>>> 213 cg_infos[memory_idx]._mount_path =
>>>> os::strdup(tmpmount);
>>>
>>> You are right. If we are on a hybrid hierarchy and for some reason
>>> the
>>> cgroup v2 mount shows up *after* cgroup mounts in
>>> /proc/self/mountinfo,
>>> we'd get the mount path wrong. So ordering matters! The reason this
>>> worked for me on hybrid is that cgroup2 mounts come *before* cgroup
>>> mounts on the system I've tested.
>>>
>>>> You should free all strdup’d pointers in cg_infos in places where
>>>> you fail and return NULL.
>>>>
>>>> 81 cg_infos[memory_idx]._name = os::strdup(name);
>>>> 213 cg_infos[memory_idx]._mount_path =
>>>> os::strdup(tmpmount);
>>>> 214 cg_infos[memory_idx]._root_mount_path =
>>>> os::strdup(tmproot);
>>>
>>> Done now. More thoughts?
>>>
>>> Updated webrev:
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/03/webrev/
>>>
>>> Thanks,
>>> Severin
>>>
>>>>> On Feb 25, 2020, at 2:22 PM, Severin Gehwolf <
>>>>> sgehwolf at redhat.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Could I please get reviews of this cgroup-related patch? After
>>>>> JDK-
>>>>> 8230305 old systems with no mounted cgroup controllers would
>>>>> get
>>>>> detected as cgroups v2 (wrong). Then, when initializing the v2
>>>>> subsystem, it would fail. The trace about cgroupv2 mount point
>>>>> not
>>>>> found is misleading in this case. While the outcome is similar
>>>>> pre/post
>>>>> patch (NULL cgroup subsystem), I'd like to be explicit about
>>>>> this case.
>>>>>
>>>>> The suggested fix is to look at /proc/self/mountinfo in order
>>>>> to
>>>>> correctly detect whether this is cgroup v2 or cgroup v1 with no
>>>>> mounted
>>>>> controllers. In the latter case we just stop initialization as
>>>>> we'd
>>>>> fail later in cgroupv1 code anyway. This aligns hotspot code
>>>>> with core-
>>>>> libs after JDK-8239559.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8239785
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8239785/01/webrev/
>>>>>
>>>>> Testing: jdk-submit, hotspot docker tests on cgroup v1 and
>>>>> cgroup v2. All pass.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Thanks,
>>>>> Severin
>>>>>
>>
>>
>
>
>
>
More information about the hotspot-dev
mailing list