RFR: 8342449: reimplement: JDK-8327114 Attach in Linux may have wrong behavior when pid == ns_pid [v3]
Serguei Spitsyn
sspitsyn at openjdk.org
Thu Nov 7 04:51:45 UTC 2024
On Wed, 30 Oct 2024 17:20:49 GMT, Larry Cable <duke at openjdk.org> wrote:
>> the implementation I originally provided does not in fact solve the issue!
>>
>> the attach protocol initiation "handshake" requires that the "attacher" (the caller of this code) and the "attachee"(the target JVM to be "attached" to) *must* share a "/tmp" (or access to the attachee's `cwd`) in common in order to rendezvous on the "attach" socket (created in the /tmp or attachee `cwd` filesystem).
>>
>> "attacher" and "attachee" JVM processes are guaranteed to share a common directory/filesystem when thy occupy the same "mount namespace", this is the environment in which "peers" exist, and the attach
>> handshake (initiated by the attacher) can use "/tmp" to establish the socket connection with the attachee.
>>
>> with the advent of "containers" (implemented in Linux via various namespaces, e.g.: pid & mount) another "attacher" and "attachee" relationship exists, that of "attacher" (namespace ancestor) and "attachee" (namespace descendant).
>>
>> in this environment it is possible (and highly probable) that the "attacher" and the "attachee" do not share a directory in common.
>>
>> In this scenario the "attacher" must resort to handshaking with the attachee via the /proc filesystem in order to access the "attachee's" directory from the "attacher".
>>
>> In order to achieve this rendezvous, the "attachee" must occupy a descendant, or same, (pid) namespace of, or as, the "attacher".
>>
>> since (pid) namespaces are hierarchical, a descendant process (in its own descendent pid namespace) will also occupy all its ancestor (pid) namespaces (between it and the 'root' or 'host' pid namespace) with a unique pid in each of those "interstitial" (pid) namespace(s).
>>
>> thus the "attachee" directory is accessible, via an "ancestor's" (or peer's) /proc filesystem using the pid of the "attachee" that is associated with it in that (pid) namespace.
>>
>> thus an "ancestor" "attacher" can handshake with a descendant "attachee" in this fashion.
>>
>> therefore an "attacher" has two choices when attempting to attach:
>>
>> - use the /proc/<pid>/root/tmp path to the "attachee's" /tmp (or its cwd)
>> - this works with both peers and descendants
>>
>> - use /tmp
>> - this only works if the "attacher" and "attachee" share a /tmp in common
>>
>> the obvious choice is to default to /proc/<pid>/root/tmp (or cwd) however there is an issue with this; should the attachee have elevated privileges, the attacher may not have r/w permission on the attachee's /proc/<pid>/root (or cwd) path.
>>
>> I...
>
> Larry Cable has updated the pull request incrementally with one additional commit since the last revision:
>
> JDK-8342449: fixed missing param in throws msg and renamed local var
src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java line 268:
> 266: * however we can also check the target pid's signal masks to see if it catches SIGQUIT and only do so if in
> 267: * fact it does ... this reduces the risk of killing an innocent process in the current ns as opposed to
> 268: * attaching to the actual target JVM ... c.f: checkCatchesAndSendQuitTo() below.
This still looks pretty risky and dangerous. I'm thinking if it'd make sense to check that the target process is a VM process. At least, we know how VM processes can use the signal mask. Another concern is that the target VM process might be not exactly the same VM the client was expecting to attach. I don't know what can be done to ensure this correctness.
src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java line 327:
> 325: private static final String SIGNAL_MASK_PATTERN = "(?<" + FIELD + ">Sig\\p{Alpha}{3}):\\s+(?<" + MASK + ">\\p{XDigit}{16}).*";
> 326:
> 327: private static final long SIGQUIT = 1L << 2;
Q: why not just 3L ? :)
src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java line 340:
> 338: var readBlk = false;
> 339: var readIgn = false;
> 340: var readCgt = false;
Nit:
- Minor suggestion to order the lines 338-430 same as lines 330-332 and place them after them
- Never liked single-letter identifiers because they are not well searchable. I'd suggest to:
rename `p` to something like `pattern` or `ptrn` and `m` to something like `matcher` or `match`.
src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java line 364:
> 362: case "SigIgn": { quitIgn = sigquit; readCgt = true; break; }
> 363: case "SigCgt": { quitCgt = sigquit; readIgn = true; break; }
> 364: }
Q: The `readIgn` and `readCgt` are swapped at lines 362-363. Is it intentional or just a typo?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21688#discussion_r1832048612
PR Review Comment: https://git.openjdk.org/jdk/pull/21688#discussion_r1832037498
PR Review Comment: https://git.openjdk.org/jdk/pull/21688#discussion_r1832040824
PR Review Comment: https://git.openjdk.org/jdk/pull/21688#discussion_r1832042138
More information about the serviceability-dev
mailing list