RFR: 8286212: Cgroup v1 initialization causes NPE on some systems

Severin Gehwolf sgehwolf at openjdk.java.net
Thu May 12 10:07:31 UTC 2022


On Thu, 12 May 2022 06:11:03 GMT, Ioi Lam <iklam at openjdk.org> wrote:

>> Please review this change to the cgroup v1 subsystem which makes it more resilient on some of the stranger systems. Unfortunately, I wasn't able to re-create a similar system as the reporter. The idea of using the longest substring match for the cgroupv1 file paths was based on the fact that on systemd systems processes run in separate scopes and the maven forked test runner might exhibit this property. For that it makes sense to use the common ancestor path. Nothing changes in the common cases where the `cgroup_path` matches `_root` and when the `_root` is `/` (container the former, host system the latter).
>> 
>> In addition, the code paths which are susceptible to throw NPE have been hardened to catch those situations. Should it happen in the future it makes more sense (to me) to not have accurate container detection in favor of continuing to keep running.
>> 
>> Finally, with the added unit-tests a bug was uncovered on the "substring" match case of cgroup paths in hotspot. `p` returned from `strstr` can never point to `_root` as it's used as the "needle" to find in "haystack" `cgroup_path` (not the other way round).
>> 
>> Testing:
>> - [x] Added unit tests
>> - [x] GHA
>> - [x] Container tests on cgroups v1 Linux. Continue to pass
>
> src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1SubsystemController.java line 65:
> 
>> 63:                             path = mountPoint + cgroupSubstr;
>> 64:                         }
>> 65:                     } else {
> 
> Looks like `path` is still not set if the condition at line 61 `if (cgroupPath.length() > root.length()) {` is false.

Yes. There are more cases where the cgroup path might be null. If you know of cases that should be handled and aren't, please do let me know. Incidentally, that's why I've added the catch for NPE in `CgroupV1Subsystem.initSubSystem()`, because it's not clear what should be used as `path` in those corner cases. It certainly shouldn't throw a NPE :-). It then also more closely matches what hotspot does. Having said that, if `cgroupPath.length < root.length()` it's implied that `cgroupPath.startsWith(root)` is false. Then the only case where it would still be null is when the paths match in lenght, but aren't the same.

I'm not convinced the logic when then cgroup patch starts with the root path, then it should do what it does today. I.e. given `mountpath=/sys/fs/cgroup/memory`, `root=/foo/bar` and `cgroupPath=/foo/bar/baz` then `path=/sys/fs/cgroup/memory/baz`. I've personally not seen such a setup and the code predates me. Considering that the equivalent hotspot code had a bug in this logic since day 1, it doesn't seem very widely used...

The most common cases I know to date are:

1. `root=/`, `cgroupPath=/some`, `mount=/sys/fs/cgroup/controller` => `path=/sys/fs/cgroup/controller/some` (host systems)
2. `root=/foo/bar/baz`, `cgroupPath=/foo/bar/baz`, `mount=/sys/fs/cgroup/controller` => `path=/sys/fs/cgroup/controller` (most container engines)

-------------

PR: https://git.openjdk.java.net/jdk/pull/8629


More information about the core-libs-dev mailing list