[jdk11u-dev] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy

Severin Gehwolf sgehwolf at openjdk.java.net
Thu Mar 10 14:33:56 UTC 2022


On Tue, 8 Mar 2022 18:47:32 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

> This is a followup backport for Cgroup v2 support.
> 
> The original patch does not apply cleanly, due to context differences and some of out-of-order backports, conflicts were resolved manually.
> 
> In additional:
> 
> 1) Include partial backport of JDK-8250627.
>     - Renamed `Metrics.c` =>  `CgroupMetrics.c`
>     - Added native method `isUseContainerSupport()` to `CgroupMetrics.java`
>     - Return instance only if `isUseContainerSupport()` is enabled.
>    
> 2) Add `createTempDirectory()` method to `test/lib/jdk/test/lib/Utils.java` for `TestCgroupSubsystemController.java` test
> 
> 3) Add `import jdk.test.lib.process.OutputAnalyzer;` to `TestDockerMemoryMetrics.java`
> 
> Test:
> 
> - [x] jtreg containers/docker test on Ubuntu 20.04.4 LTS
> - [x] jdk/internal/platform/cgroup on Ubuntu 20.04.4 LTS
> - [x] jdk/internal/platform/docker on Ubuntu 20.04.4 LTS

Hi Zhengyu!

Here is the first round of review comments. This is a bit hairy as we are doing backports out of orderand we need to be careful not to drop an already backported patch (applied to the old location which is now moved with this). I'll run this through my own testing, too.

Thanks,
Severin

src/java.base/linux/classes/jdk/internal/platform/CgroupUtil.java line 48:

> 46:             unwrapIOExceptionAndRethrow(e);
> 47:             throw new InternalError(e.getCause());
> 48:         }

Missing hunks of JDK-8255908: See https://github.com/openjdk/jdk/commit/8d9cf48e#diff-96f6e99fbc0d093a8b423d1a3fc86c0408768b2bb10747d926832d9286c6b3bbR49

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 80:

> 78:         } catch (IOException e) {
> 79:             return null;
> 80:         }

I believe the original, `Metrics.java` had a catch for `UncheckedIOException` here that's now gone.
https://github.com/openjdk/jdk11u-dev/blob/194b66f6a72437345c458e747e24beb46352537a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java#L89..L91

Please add it back as it's part of the fix of JDK-8255908.

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 108:

> 106:                 CgroupUtil.readFilePrivileged(Paths.get("/proc/self/cgroup"))) {
> 107: 
> 108:             lines.map(line -> line.split(":"))

We are missing the patch of JDK-8272124 here. Old code was:
https://github.com/openjdk/jdk11u-dev/blob/194b66f6a72437345c458e747e24beb46352537a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java#L119..L122

Best to use the JDK 17u commit as a base for this:
See: https://github.com/openjdk/jdk17u/commit/f0028333d83c39b434c26f66a6a92e446e7047cd#diff-7d6f202c58b4205f19a9b6c657f4827e9ef132d7433a7bf906b5b640f9f7841fR197

Looking at this again, we should bring in the test changes of JDK-8272124 too (as we are now bringing in those tests with the cgroups v2 backport). It helps to verify this fix is present as well.

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 114:

> 112:         } catch (IOException e) {
> 113:             return null;
> 114:         }

Same here: https://github.com/openjdk/jdk11u-dev/blob/194b66f6a72437345c458e747e24beb46352537a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java#L128..L130

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 168:

> 166:         controllerName = entry[1];
> 167:         base = entry[2];
> 168:         if (controllerName != null && base != null) {

We are missing the fix of https://bugs.openjdk.java.net/browse/JDK-8254854 here. Old code was:
https://github.com/openjdk/jdk11u-dev/blob/194b66f6a72437345c458e747e24beb46352537a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java#L200..L202

See head commit at the time:
https://github.com/openjdk/jdk/commit/a0b687bf

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java line 216:

> 214:     }
> 215: 
> 216:     private void setActiveSubSystems() {

function `getSwapEnabled()` from JDK-8250984 is missing here. See:
https://github.com/openjdk/jdk11u-dev/blob/194b66f6a72437345c458e747e24beb46352537a/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/Metrics.java#L246..L249

src/java.base/share/classes/sun/launcher/LauncherHelper.java line 844:

> 842:             if (mainClass.getModule().isNamed()) {
> 843:                 abort(e, "java.launcher.module.error5",
> 844:                       mainClass.getName(), mainClass.getModule().getName(),

This seems an unrelated change, please remove. Seems to come from JDK-8221368, not in jdk11u.

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java line 215:

> 213:                 if (deltaLimit <= 0) {
> 214:                     return 0;
> 215:                 }

This seems to revert https://bugs.openjdk.java.net/browse/JDK-8242480. Please remove.

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java line 219:

> 217:                         // If this happens just retry the loop for a few iterations.
> 218:                         if ((memSwapUsage - memUsage) >= 0) {
> 219:                             return memSwapLimit - memLimit - (memSwapUsage - memUsage);

This seems to revert https://bugs.openjdk.java.net/browse/JDK-8242480. Please remove.

src/jdk.management/unix/classes/com/sun/management/internal/OperatingSystemImpl.java line 277:

> 275:     private boolean isCpuSetSameAsHostCpuSet() {
> 276:         if (containerMetrics != null && containerMetrics.getCpuSetCpus() != null) {
> 277:             return containerMetrics.getCpuSetCpus().length == getHostConfiguredCpuCount0();

Line 277 is a partial revert of https://bugs.openjdk.java.net/browse/JDK-8247469. Please remove. Only line 276 is needed.

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java line 30:

> 28: 
> 29:     private static final long UNLIMITED = -1;
> 30: 

Seems a revert of https://bugs.openjdk.java.net/browse/JDK-8275713. Please remove.

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java line 77:

> 75:                 // Break out as soon as we see an increase in failcount
> 76:                 // to avoid getting killed by the OOM killer.
> 77:                 if (Metrics.systemMetrics().getMemoryFailCount() > count) {

This seems a partial revert of https://bugs.openjdk.java.net/browse/JDK-8250984. Please double check.

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java line 93:

> 91:             throw new RuntimeException("Memory fail count : new : ["
> 92:                     + Metrics.systemMetrics().getMemoryFailCount() + "]"
> 93:                     + ", old : [" + count + "]");

Same here. Re revert of https://bugs.openjdk.java.net/browse/JDK-8250984. Please check.

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java line 118:

> 116:         // This feature is deprecated. Only perform the check if we get an actual
> 117:         // limit back.
> 118:         if (kmemlimit != UNLIMITED && limit != kmemlimit) {

Seems a revert of https://bugs.openjdk.java.net/browse/JDK-8275713. Please remove.

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java line 134:

> 132:         if (expectedMem != Metrics.systemMetrics().getMemoryLimit()
> 133:                 || expectedMemAndSwap != Metrics.systemMetrics().getMemoryAndSwapLimit()) {
> 134:             System.err.println("Memory and swap limit not equal, expected : ["

Revert of https://bugs.openjdk.java.net/browse/JDK-8250984 ? Please check.

test/lib/jdk/test/lib/Utils.java line 784:

> 782:         Path dir = Paths.get(System.getProperty("user.dir", "."));
> 783:         return Files.createTempDirectory(dir, prefix, attrs);
> 784:     }

This looks like a partial backport of https://bugs.openjdk.java.net/browse/JDK-8223396. It's a test-only change, cleanup. Perhaps worth bringing into jdk11u wholesale as a dep? Failing that, rework usages of `Utils.createTempDirectory()`?

test/lib/jdk/test/lib/containers/cgroup/MetricsTesterCgroupV1.java line 291:

> 289:         newVal = getLongValueFromFile(Controller.MEMORY, "memory.memsw.failcnt");
> 290:         if (!CgroupMetricsTester.compareWithErrorMargin(oldVal, newVal)) {
> 291:             fail(Controller.MEMORY, "memory.memsw.failcnt", oldVal, newVal);

This seems to be missing JDK-8250984 changes (reverts them). Please check. This file is basically a move from `MetricsTester.java`. So we should bring in all the changes it since collected before moving.

-------------

Changes requested by sgehwolf (Reviewer).

PR: https://git.openjdk.java.net/jdk11u-dev/pull/863


More information about the jdk-updates-dev mailing list