RFR: 8307990: jspawnhelper must close its writing side of a pipe before reading from it

Volker Simonis simonis at openjdk.org
Mon May 15 16:14:51 UTC 2023


On Fri, 12 May 2023 15:24:19 GMT, Volker Simonis <simonis at openjdk.org> wrote:

> Since JDK13, executing commands in a sub-process defaults to the so called `POSIX_SPAWN` launching mechanism (i.e. `-Djdk.lang.Process.launchMechanism=POSIX_SPAWN`) on Linux. This works by using `posix_spawn(3)` to firstly start a tiny helper program called `jspawnhelper` in a subprocess. In a second step, `jspawnhelper` reads the command data from the parent Java process over a Unix pipe and finally executes (i.e. `execvp(3)`) the requested command.
> 
> In cases where the parent process terminates abnormally before `jspawnhelper` has read all the expected data from the pipe, `jspawnhelper` will block indefinitely on the reading end of the pipe. This is especially harmful if the parent process had open sockets, because in that case, the forked `jspawnhelper` process will inherit them and keep all the corresponding ports open effectively preventing other processes to bind to them. Notice that this is not an abstract scenario. We've observed this regularly in production with services which couldn't be restarted after a crash after migrating to JDK 17.
> 
> The fix of the issue is rather trivial. `jspawnhelper` has to close its writing end of the pipe which connects it with the parent Java process *before* starting to read from that pipe such that reads from the pipe can immediately return with EOF if the parent process terminates abnormally.
> 
> Also did some cleanup:
>  - in `jspawnhelper.c` correctly chek the result of `readFully()`
>  - in `ProcessImpl_md.c` use `restartableWrite()` instead of `write()` to write the command data from the parent process to `jspawnhelper`
>  - in `childproc.{c,h}` remove remaining traces of `MODE_CLONE` because that's long gone.

>     1. I think the child should probably close (but without error handling) the _read_ end of the fail pipe too _before_ sending out the liveness ping.
> 
> 
> https://github.com/openjdk/jdk/blob/020a60e6449749ca979c1ace3af7db57f4c53a5f/src/java.base/unix/native/libjava/childproc.c#L335
> 
> Because the same problem could arise if the parent were to terminate. Child writes to the pipe, and theoretically could block on a full pipe since it holds open an fd to the read end. I admit this is highly theoretical since the pipe would have to be full for this to happen, and nobody wrote to the pipe yet, and the aliveness ping is just an integer (4 bytes). But to be more correct, and since the fix is so trivial, the fail pipe read end should be closed in the child process before writing to fail pipe.

I thought about this, but as you said, I think it can not happen as we only write at most two times 4 bytes. Also, if we close the read end of the fail pipe in the child and the parent crashes (i.e. there's no read end at all), we will get a SIGPIPE which we would then have to handle in `jspawnhelper`. In the end, I don't think that's worth it for an event which should never happen (at least not in this PR).

> 
>     2. I think you don't actually have to hand in the in-pipe-read-end fd number via command line arg, just to have the child to close it. You could just, in the parent, set the fd to FD_CLOEXEC. Since posix_spawn() exec's the spawn helper, this would close the file descriptor for you. Extending this thought, you could do this with all pipe ends you want to cauterize in the child process.

I've deliberately not used `FD_CLOEXEC`  because the file descriptor closing code in `childProcess()` is shared by all launch mechanisms. So to make it work, I'd had to reset the corresponding file descriptors to `-1` in the child anyway.

I therefor felt the current fix is smaller, simpler and easier to understand.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13956#issuecomment-1548154907


More information about the core-libs-dev mailing list