RFR: 8343191: Cgroup v1 subsystem fails to set subsystem path [v3]

Sergey Chernyshev schernyshev at openjdk.org
Tue Nov 12 15:03:36 UTC 2024


On Mon, 11 Nov 2024 10:23:02 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

> The JBS issue doesn't mention `NullPointerException`. It would be good to list the observed NPE issue.

Example for NPE:


public class Test {
    public static void main(String[] args) {
        java.lang.management.ManagementFactory.getPlatformMBeanServer();
        System.out.println("PASSED.");
    }
}

Script (cg v1):

sudo docker run --tty=true --rm --volume=$JAVA_HOME:/jdk --volume=./classes:/classes:ro --memory 400m ubuntu:latest \
    sh -c "sleep 10 ; /jdk/bin/java -cp .:/classes Test" &
sleep 10;
HOSTPID=$(sudo ps -ef | awk '/jdk\/bin\/java/ && !/docker/ && !/awk/ { print $2 }')
echo $HOSTPID | sudo tee /sys/fs/cgroup/memory/test/cgroup.procs > /dev/null
sleep 10

Result (cg v1 before patch):

Exception in thread "main" java.lang.NullPointerException
	at java.base/java.util.Objects.requireNonNull(Objects.java:220)
	at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:296)
	at java.base/java.nio.file.Path.of(Path.java:148)
	at java.base/java.nio.file.Paths.get(Paths.java:69)
	at java.base/jdk.internal.platform.CgroupUtil.lambda$readStringValue$0(CgroupUtil.java:67)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:571)
	at java.base/jdk.internal.platform.CgroupUtil.readStringValue(CgroupUtil.java:69)
	at java.base/jdk.internal.platform.CgroupSubsystemController.getStringValue(CgroupSubsystemController.java:65)
	at java.base/jdk.internal.platform.CgroupSubsystemController.getLongValue(CgroupSubsystemController.java:124)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getLongValue(CgroupV1Subsystem.java:190)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getHierarchical(CgroupV1Subsystem.java:160)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.initSubSystem(CgroupV1Subsystem.java:85)
	at java.base/jdk.internal.platform.cgroupv1.CgroupV1Subsystem.getInstance(CgroupV1Subsystem.java:61)
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:119)
	at java.base/jdk.internal.platform.CgroupSubsystemFactory.create(CgroupSubsystemFactory.java:89)
	at java.base/jdk.internal.platform.CgroupMetrics.getInstance(CgroupMetrics.java:198)
	at java.base/jdk.internal.platform.SystemMetrics.instance(SystemMetrics.java:29)
	at java.base/jdk.internal.platform.Metrics.systemMetrics(Metrics.java:58)
	at java.base/jdk.internal.platform.Container.metrics(Container.java:43)
	at jdk.management/com.sun.management.internal.OperatingSystemImpl.<init>(OperatingSystemImpl.java:175)
	at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(PlatformMBeanProviderImpl.java:316)
	at jdk.management/com.sun.management.internal.PlatformMBeanProviderImpl$4.nameToMBeanMap(PlatformMBeanProviderImpl.java:235)
	at java.management/java.lang.management.ManagementFactory.lambda$getPlatformMBeanServer$0(ManagementFactory.java:489)
	at java.base/java.util.stream.ReferencePipeline$7$1FlatMap.accept(ReferencePipeline.java:289)
	at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1788)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:570)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:560)
	at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:153)
	at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:176)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
	at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:636)
	at java.management/java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:490)
	at Test.main(Test.java:3)
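
The top frames of the trace (`Paths.get` -> `Path.of` -> `UnixFileSystem.getPath` -> `Objects.requireNonNull`) are what one would expect when the first path component passed to `Paths.get` is null. A minimal sketch of that failure mode follows, assuming the controller path was never set; the class and method names are hypothetical and the file name is only illustrative, this is not the JDK code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration only; not the JDK's CgroupUtil.
class NullPathSketch {

    // Resolves an interface file under a controller path.
    // Throws NullPointerException when controllerPath is null,
    // because Paths.get requires a non-null first component.
    static Path resolve(String controllerPath, String file) {
        return Paths.get(controllerPath, file);
    }

    // Hypothetical guard: returns null instead of throwing
    // when the controller path is unknown.
    static Path resolveOrNull(String controllerPath, String file) {
        return controllerPath == null ? null : Paths.get(controllerPath, file);
    }

    public static void main(String[] args) {
        try {
            resolve(null, "memory.use_hierarchy");
        } catch (NullPointerException e) {
            System.out.println("NPE from Paths.get with a null controller path");
        }
        System.out.println(resolveOrNull(null, "memory.use_hierarchy"));
        System.out.println(resolveOrNull("/sys/fs/cgroup/memory", "memory.use_hierarchy"));
    }
}
```

The guard is just one possible way to avoid the exception; the actual patch instead warns about the mismatch, as the output below shows.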

After patch:

[0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) mounting root [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] doesn't match cgroup [/test]
[0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) mounting root [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] doesn't match cgroup [/]
[0.001s][warning][os,container] Cgroup v1 controller (/sys/fs/cgroup/memory) mounting root [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] doesn't match cgroup [/]
Nov 12, 2024 1:14:11 PM jdk.internal.platform.cgroupv1.CgroupV1SubsystemController setPath
WARNING: Cgroup v1 controller (/sys/fs/cgroup/memory) mounting root [/docker/e7ecd9685bcbbd3e7d3e81ad7c23cadf5d96db85c324f66d290f0d289ad867dd] doesn't match cgroup [/test].
PASSED.

In cg v2 the NPE is not observed.


> > Only when docker fails to mount the cgroup while moving process to an outer group or a sibling group. It's probably not the case with CloudFoundry.
> 
> The bug suggests it's a cg v1 only problem, but I'm able to reproduce in cg v2 too. We should handle both cases more gracefully.

What gets reproduced in cg v2 is in fact a different issue: the cgroup files are not mapped inside the container.
Nothing can be done in Java to overcome this, because there are no files in /sys/fs/cgroup.

The only workaround is `--cgroupns=host`; with it, the correct limits are displayed.
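
To tell the cg v2 case apart from the NPE above, one can check from inside the container whether /sys/fs/cgroup has any entries at all. A rough sketch, assuming the default mount point; the class and method names are made up for illustration:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic; not part of the JDK.
class CgroupFilesCheck {

    // Returns true if dir is a readable directory with at least one entry.
    static boolean hasEntries(Path dir) {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return entries.iterator().hasNext();
        } catch (IOException e) {
            return false; // treat an unreadable directory as "no files"
        }
    }

    public static void main(String[] args) {
        // Assumed default mount point; pass another path as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "/sys/fs/cgroup");
        System.out.println(root + (hasEntries(root)
                ? " has entries"
                : " is empty or unreadable"));
    }
}
```

If this reports an empty directory, the JVM has no cgroup files to read, so no limits can be detected.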

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21808#issuecomment-2470765864


More information about the core-libs-dev mailing list