RFR(s): 8210549: Runtime.exec: in closeDescriptors(), use FD_CLOEXEC instead of close()

Thomas Stüfe thomas.stuefe at gmail.com
Wed Sep 12 11:52:31 UTC 2018


Thank you Christoph :)

On Wed, Sep 12, 2018 at 1:47 PM, Langer, Christoph
<christoph.langer at sap.com> wrote:
> Hi Thomas,
>
> I like the full patch �� +1
>
> Best regards
> Christoph
>
>> -----Original Message-----
>> From: core-libs-dev <core-libs-dev-bounces at openjdk.java.net> On Behalf
>> Of Thomas Stüfe
>> Sent: Dienstag, 11. September 2018 19:27
>> To: Java Core Libs <core-libs-dev at openjdk.java.net>
>> Subject: RFR(s): 8210549: Runtime.exec: in closeDescriptors(), use
>> FD_CLOEXEC instead of close()
>>
>> Hi all,
>>
>> May I please have reviews for this small fix:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8210549
>>
>> patch (full): http://cr.openjdk.java.net/~stuefe/webrevs/8210549-Use-
>> FD_CLOEXEC-instead-of-close/webrev.00/webrev/
>> patch (minimal alternative):
>> http://cr.openjdk.java.net/~stuefe/webrevs/8210549-Use-FD_CLOEXEC-
>> instead-of-close/webrev.minimal/webrev/
>>
>> See also prior discussion:
>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-
>> September/055173.html
>>
>> Submit tests ran with full patch successfully through.
>>
>> ---
>>
>> Background:
>>
>> After fork()/vfork() and before exec(), the child process needs to
>> close all inherited file descriptors apart from the stdin/out/err pipe
>> ends. We do this by iterating thru all file descriptors in
>> /proc/<pid>/fd or whatever the equivalent is on that platform. This is
>> done using opendir(), readdir().
>>
>> We then close all these file descriptors using close().
>>
>> The problem with that technique is that we may accidentally close the
>> file descriptor opendir() is using internally to read the directory
>> content of /proc/<pid>/fd, thereby sawing off the branch we are
>> sitting on. The coding does some magic to attempt to prevent this:
>>
>> <quote>
>> 86    /* We're trying to close all file descriptors, but opendir() might
>> 87     * itself be implemented using a file descriptor, and we certainly
>> 88     * don't want to close that while it's in use.  We assume that if
>> 89     * opendir() is implemented using a file descriptor, then it uses
>> 90     * the lowest numbered file descriptor, just like open().  So we
>> 91     * close a couple explicitly.  */
>> 92
>> 93    close(from_fd);          /* for possible use by opendir() */
>> 94    close(from_fd + 1);      /* another one for good luck */
>>
>> ...
>> 108    while ((dirp = readdir64(dp)) != NULL) {
>> 109        int fd;
>> 110        if (isAsciiDigit(dirp->d_name[0]) &&
>> 111            (fd = strtol(dirp->d_name, NULL, 10)) >= from_fd +
>> 2)Improve "close all filedescriptors" coding.
>> 112            close(fd);
>> 113    }
>>
>> </quote>
>>
>> This workaround can be removed if, instead of outright closing the
>> file descriptor in the loop, we were to set the file descriptor to
>> FD_CLOEXEC. Closing all descriptors would be defered to the actual
>> exec() call some milliseconds later.
>>
>> ----
>>
>> The patch changes the close() call to fcntl(FD_CLOEXEC). This removes
>> the need for the workaround as described.
>>
>> In addition to that, the full version of the patch also adds error
>> handling to the readdir() loop.
>>
>> But note that this exposes us to a behavioral change should we run
>> into a readdir() error:
>>
>> - currently, we would leave the remaining file descriptors quietly
>> unclosed. This usually would have no effect, but in rare cases may
>> lead to difficult-to-analyze errors when child processes accidentally
>> disturb parent process IO.
>>
>> - With this patch, we will fail if readdir fails. However, we do not
>> just yet fail the Runtime.exec() call, but enter a fallback code
>> section here:
>>
>> 364     /* close everything */
>> 365     if (closeDescriptors() == 0) { /* failed,  close the old way */
>> 366         int max_fd = (int)sysconf(_SC_OPEN_MAX);
>> 367         int fd;
>> 368         for (fd = FAIL_FILENO + 1; fd < max_fd; fd++)
>> 369             if (close(fd) == -1 && errno != EBADF)
>> 370                 goto WhyCantJohnnyExec;
>> 371     }
>>
>> I am not convinced this is a good fallback. _SC_OPEN_MAX depends on
>> ulimit -Hn and may be high. On my Ubuntu 16.4 box, default value after
>> installation is 1048576. When we hit this fallback code, it takes
>> almost a full second to spawn a single child process. In fork
>> intensive scenarios this can get pretty bad.
>>
>> What do you think?
>>
>> Thanks & Best Regards,
>>
>> Thomas


More information about the core-libs-dev mailing list