RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)

Sebastian Lövdahl duke at openjdk.org
Thu May 2 10:33:52 UTC 2024


On Thu, 2 May 2024 10:13:51 GMT, Sebastian Lövdahl <duke at openjdk.org> wrote:

> 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)

Ran the following tests locally:


$ make test TEST="jtreg:test/hotspot/jtreg/containers"

...

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/containers                  14    14     0     0   
==============================
TEST SUCCESS



$ make test TEST="jtreg:test/hotspot/jtreg/serviceability"

...

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg/serviceability             374   374     0     0   
==============================
TEST SUCCESS


And also manually tested it under various conditions. Basic environment information:


slovdahl at ubuntu2204:~/reproducer$ systemd --version
systemd 249 (249.11-0ubuntu3.12)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
slovdahl at ubuntu2204:~/reproducer$ sudo apt-get install openjdk-17-jdk-headless
slovdahl at ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/java -version
openjdk version "17.0.10" 2024-01-16
OpenJDK Runtime Environment (build 17.0.10+7-Ubuntu-122.04.1)
OpenJDK 64-Bit Server VM (build 17.0.10+7-Ubuntu-122.04.1, mixed mode, sharing)
slovdahl at ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version
openjdk version "23-internal" 2024-09-17
OpenJDK Runtime Environment (build 23-internal-adhoc.slovdahl.jdk)
OpenJDK 64-Bit Server VM (build 23-internal-adhoc.slovdahl.jdk, mixed mode, sharing)


Reproducer.java used for the target process:


import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class Reproducer {
  public static void main(String[] args) throws InterruptedException, IOException {
    int port;
    if (args.length > 0) {
      port = Integer.parseInt(args[0]);
    }
    else {
      port = 81;
    }

    System.out.println("Hello, World!");
    try (var server = new ServerSocket()) {
      server.bind(new InetSocketAddress("localhost", port));
      System.out.println("Bound to port " + port);
      while (true) {
        Thread.sleep(1_000L);
      }
    }
  }
}


systemd unit used for testing the fix for https://bugs.openjdk.org/browse/JDK-8226919:


slovdahl at ubuntu2204:~/reproducer$ cat reproducer.service
[Service]
Type=simple
ExecStart=/usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java

User=slovdahl
Group=slovdahl
ReadWritePaths=/tmp
AmbientCapabilities=CAP_NET_BIND_SERVICE

slovdahl at ubuntu2204:~/reproducer$ sudo cp -a reproducer.service /etc/systemd/system/
slovdahl at ubuntu2204:~/reproducer$ sudo systemctl daemon-reload
slovdahl at ubuntu2204:~/reproducer$ sudo systemctl start reproducer.service
slovdahl at ubuntu2204:~/reproducer$ sudo systemctl status reproducer.service 
● reproducer.service
     Loaded: loaded (/etc/systemd/system/reproducer.service; static)
     Active: active (running) since Wed 2024-05-01 20:22:01 EEST; 4s ago
   Main PID: 1576835 (java)
      Tasks: 26 (limit: 76968)
     Memory: 69.2M
        CPU: 832ms
     CGroup: /system.slice/reproducer.service
             └─1576835 /usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java

maj 01 20:22:01 ubuntu2204 systemd[1]: Started reproducer.service.
maj 01 20:22:01 ubuntu2204 java[1576835]: Hello, World!
maj 01 20:22:01 ubuntu2204 java[1576835]: Bound to port 81

slovdahl at ubuntu2204:~/reproducer$ ls -lh /proc/1576835/root
ls: cannot read symbolic link '/proc/1576835/root': Permission denied
lrwxrwxrwx 1 slovdahl slovdahl 0 maj  1 20:22 /proc/1576835/root
slovdahl at ubuntu2204:~/reproducer$ sudo ls -lh /proc/1576835/root
lrwxrwxrwx 1 slovdahl slovdahl 0 maj  1 20:22 /proc/1576835/root -> /


Fails with vanilla OpenJDK 17:


slovdahl at ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd 1576835 VM.version
1576835:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/1576835/root/tmp/.java_pid1576835: target process 1576835 doesn't respond within 10500ms or HotSpot VM not loaded
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:104)
	at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
	at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
	at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
	at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)


Works when attaching as root with vanilla OpenJDK 17:


slovdahl at ubuntu2204:~/reproducer$ sudo /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd 1576835 VM.version
1576835:
OpenJDK 64-Bit Server VM version 17.0.10+7-Ubuntu-122.04.1
JDK 17.0.10


Still works without root with a JDK built from this PR (fixed in https://bugs.openjdk.org/browse/JDK-8226919):


slovdahl at ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/jcmd 1576835 VM.version
1576835:
OpenJDK 64-Bit Server VM version 17.0.10+7-Ubuntu-122.04.1
JDK 17.0.10


Attaching to a JVM inside a Docker container from the host works as before with vanilla OpenJDK 17 and the JDK built from this PR (always requires root):


slovdahl at ubuntu2204:~/reproducer$ docker run --rm -v .:/app -w /app eclipse-temurin:17 java Reproducer.java
Hello, World!
Bound to port 81

slovdahl at ubuntu2204:~/reproducer$ sudo jcmd -l | grep Reproducer
1587278 jdk.compiler/com.sun.tools.javac.launcher.Main Reproducer.java

slovdahl at ubuntu2204:~/reproducer$ sudo /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd 1587278 VM.version
1587278:
OpenJDK 64-Bit Server VM version 17.0.11+9
JDK 17.0.11

slovdahl at ubuntu2204:~/reproducer$ sudo /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/jcmd 1587278 VM.version
1587278:
OpenJDK 64-Bit Server VM version 17.0.11+9
JDK 17.0.11


Attaching to the same Docker container JVM from a sidecar container mounted into the same process namespace:
- Attaching works with Temurin 17, but `jcmd -l` does not list the process
- Attaching and `jcmd -l` works with Temurin 21
- Attaching and listing fails with current mainline JDK
- Attaching and listing works with the fix in this PR


slovdahl at ubuntu2204:~/reproducer$ docker run --interactive --tty --rm --pid=container:great_curie eclipse-temurin:17.0.11_9-jdk-jammy /bin/bash
root at e8fa333f7f71:/# jcmd
326 jdk.jcmd/sun.tools.jcmd.JCmd
root at e8fa333f7f71:/# jcmd 1 VM.version
1:
OpenJDK 64-Bit Server VM version 17.0.11+9
JDK 17.0.11

slovdahl at ubuntu2204:~/reproducer$ docker run --interactive --tty --rm --pid=container:great_curie eclipse-temurin:21.0.3_9-jdk-jammy /bin/bash
root at d7c84d951f3d:/# jcmd
384 jdk.jcmd/sun.tools.jcmd.JCmd
1 jdk.compiler/com.sun.tools.javac.launcher.Main Reproducer.java
root at d7c84d951f3d:/# jcmd 1 VM.version
1:
OpenJDK 64-Bit Server VM version 17.0.11+9
JDK 17.0.11

# Mainline JDK
slovdahl at ubuntu2204:~/reproducer$ docker run --interactive --tty --rm --pid=container:great_curie --volume /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/:/jdk ubuntu:22.04 /bin/bash
root at 10595dab1b4d:/# /jdk/bin/jcmd
1 jdk.compiler/com.sun.tools.javac.launcher.Main Reproducer.java
443 jdk.jcmd/sun.tools.jcmd.JCmd
root at 10595dab1b4d:/# /jdk/bin/jcmd 1 VM.version
1:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /tmp/.java_pid1: target process 1 doesn't respond within 10500ms or HotSpot VM not loaded
	at jdk.attach/sun.tools.attach.VirtualMachineImpl.<init>(VirtualMachineImpl.java:99)
	at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58)
	at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207)
	at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113)
	at jdk.jcmd/sun.tools.jcmd.JCmd.main(JCmd.java:97)

# JDK built from this PR
slovdahl at ubuntu2204:~/reproducer$ docker run --interactive --tty --rm --pid=container:great_curie --volume /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/:/jdk ubuntu:22.04 /bin/bash
root at 7b7ef4ba0b35:/# /jdk/bin/jcmd
1 jdk.compiler/com.sun.tools.javac.launcher.Main Reproducer.java
50 jdk.jcmd/sun.tools.jcmd.JCmd
root at 7b7ef4ba0b35:/# /jdk/bin/jcmd 1 VM.version
1:
OpenJDK 64-Bit Server VM version 17.0.11+9
JDK 17.0.11

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2090136676


More information about the serviceability-dev mailing list