RFR: 8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist [v2]

Volker Simonis simonis at openjdk.java.net
Mon Sep 28 08:58:08 UTC 2020


On Fri, 25 Sep 2020 14:45:16 GMT, Bob Vandette <bobv at openjdk.org> wrote:

>>> In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second one
>>> comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second
>>> non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which
>>> we should respect.  As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not
>>> better than assuming they are all mounted in the same subdirectory.  If you follow my suggestion, then we will only be
>>> subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset.
>> 
>> Like this perhaps?
>> https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066
>> 
>> I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a
>> sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be
>> worth considering.
>
>> > In my suggestion, it doesn't matter which entry is first. If we see the manual on first, record it. When the second one
>> > comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second
>> > non-manual one, I think we still want to record the manaul mount point since there could be a cpuset limit setup which
>> > we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not
>> > better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be
>> > subjected to a failure to mount if there are two cpuset mount entries AND neither are mounted on /sys/fs/cgroup/cpuset.
>> 
>> Like this perhaps?
>> [jerboaa at 45608cf](https://github.com/jerboaa/jdk/commit/45608cf9a13068f3724cc65d12d8dc819bb2d066)
>> 
>> I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a
>> sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be
>> worth considering.
> 
> Yes, that's exactly what I was thinking.  In the manual case from your example, we'd record "/" as the mount point if
> it showed up first and then overwrite it if /sys/fs/cgroup came along.

OK, I just want to get this done so I can finally use debug builds again.

I've picked your version now in the hope that you'll review that :)

I only changed the condition
strcmp(cg_infos[CPUSET_IDX]._mount_path, "/sys/fs/cgroup") < 0
to
strstr(cg_infos[CPUSET_IDX]._mount_path, "/sys/fs/cgroup") != cg_infos[CPUSET_IDX]._mount_path
otherwise you'd still choose the alternative cpuset if that was mounted on a mount point which is lexicographically
after `/sys/fs/cgroup` (e.g. `/tmp/cgroups`).

I've also adapted the logs to reflect the fact that the current solution will simply choose the cpusets from the second
mount in the (unusual?) case where the "*normal*" Cgroups are not mounted to `/sys/fs/cgroup`.

Hope that's still fine.

-------------

PR: https://git.openjdk.java.net/jdk/pull/295


More information about the hotspot-runtime-dev mailing list