RFR: 8307990: jspawnhelper must close its writing side of a pipe before reading from it [v7]

Thomas Stuefe stuefe at openjdk.org
Fri May 26 17:26:02 UTC 2023


On Thu, 25 May 2023 15:25:40 GMT, Volker Simonis <simonis at openjdk.org> wrote:

>> Since JDK13, executing commands in a sub-process defaults to the so called `POSIX_SPAWN` launching mechanism (i.e. `-Djdk.lang.Process.launchMechanism=POSIX_SPAWN`) on Linux. This works by using `posix_spawn(3)` to firstly start a tiny helper program called `jspawnhelper` in a subprocess. In a second step, `jspawnhelper` reads the command data from the parent Java process over a Unix pipe and finally executes (i.e. `execvp(3)`) the requested command.
>> 
>> In cases where the parent process terminates abnormally before `jspawnhelper` has read all the expected data from the pipe, `jspawnhelper` will block indefinitely on the reading end of the pipe. This is especially harmful if the parent process had open sockets, because in that case, the forked `jspawnhelper` process will inherit them and keep all the corresponding ports open effectively preventing other processes to bind to them. Notice that this is not an abstract scenario. We've observed this regularly in production with services which couldn't be restarted after a crash after migrating to JDK 17.
>> 
>> The fix of the issue is rather trivial. `jspawnhelper` has to close its writing end of the pipe which connects it with the parent Java process *before* starting to read from that pipe such that reads from the pipe can immediately return with EOF if the parent process terminates abnormally.
>> 
>> Also did some cleanup:
>>  - in `jspawnhelper.c` correctly chek the result of `readFully()`
>>  - in `ProcessImpl_md.c` use `restartableWrite()` instead of `write()` to write the command data from the parent process to `jspawnhelper`
>>  - in `childproc.{c,h}` remove remaining traces of `MODE_CLONE` because that's long gone.
>
> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix for failing CloseRace test

> Fixing the last bug :)
> 
> If we close the writing end of the pipe to jspawnhelper early in `spawnChild()` right after the last write (which I still think is a good idea), we have to use `c->childenv` rather than `childenv` when closing the pipe descriptors at the end of `Java_java_lang_ProcessImpl_forkAndExec()` in order to avoid double closing:
> 
> ```c++
> -    closeSafely(childenv[0]);
> -    closeSafely(childenv[1]);
> +    /* We use 'c->childenv' here rather than 'childenv' because 'spawnChild()' might have
> +     * already closed 'c->childenv[1]' and signaled this by setting 'c->childenv[1]' to '-1'.
> +     * Otherwise 'c->childenv' and 'childenv' are the same because we just copied 'childenv'
> +     * to 'c->childenv' (with 'copyPipe()') before calling 'startChild()'. */
> +    closeSafely(c->childenv[0]);
> +    closeSafely(c->childenv[1]);
> ```
> 
> This also fixes `test/jdk/java/lang/ProcessBuilder/CloseRace.java` which could failed sporadically the previous version of the change.

I missed that. Christ this stuff is complex :( 

Still think it would be cleaner and simpler to set the FD in the parent to CLOEXEC, before doing posix_spawn, and at the same time set the childStuff variable to -1 to prevent the child from attempting to re-close it. Reconsider?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13956#issuecomment-1564700036


More information about the core-libs-dev mailing list