Runtime.exec : vfork() concerns and a fix proposal

Thomas Stüfe thomas.stuefe at gmail.com
Wed Sep 12 07:04:04 UTC 2018


Hi David,

On Tue, Sep 11, 2018 at 8:29 PM, David Lloyd <david.lloyd at redhat.com> wrote:
> I think this is a cool idea.

Thanks. I did not come up with it, though; I think the technique was
already known.

> Do you have any performance numbers?

Sure:

A small program that just spawns /bin/true 1000 times, measured on
my t450s running Ubuntu 16.04:

Number of open files:   <none>    1000    100000

openjdk8:               305ms     1.5s    115s
sapjvm8:                721ms     2.3s    142s

factor:                 2.4       1.53    1.23

So it starts off with a factor of 2.4, but the penalty diminishes as
the number of open files grows. This comparison is imprecise, however,
since we are comparing different JVMs with completely different
Runtime.exec() implementations. We do more checks in our JVM, which
may mean more syscalls per fork().
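
A measurement of this kind could be sketched roughly as below. This is
an illustrative sketch, not the program behind the numbers above: the
class name SpawnBench, the method timeSpawns() and the trick of opening
one scratch file many times to inflate the descriptor count are all
assumptions made for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical micro-benchmark; all names here are illustrative only. */
final class SpawnBench {

    /**
     * Keeps openFiles extra descriptors open in this JVM, then runs
     * command the given number of times via Runtime.exec(), waiting for
     * each child to exit. Returns the elapsed wall-clock milliseconds.
     */
    static long timeSpawns(String[] command, int iterations, int openFiles) {
        List<FileInputStream> held = new ArrayList<>();
        try {
            // Inflate the parent's descriptor table by opening the same
            // scratch file repeatedly (assumed stand-in for "open files").
            File scratch = File.createTempFile("spawnbench", ".tmp");
            scratch.deleteOnExit();
            for (int i = 0; i < openFiles; i++) {
                held.add(new FileInputStream(scratch));
            }

            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                Process p = Runtime.getRuntime().exec(command);
                int rc = p.waitFor();
                if (rc != 0) {
                    throw new IllegalStateException("exit code " + rc);
                }
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Release the extra descriptors again.
            for (FileInputStream in : held) {
                try {
                    in.close();
                } catch (IOException ignored) {
                    // nothing useful to do for a read-only scratch stream
                }
            }
        }
    }

    public static void main(String[] args) {
        int openFiles = args.length > 0 ? Integer.parseInt(args[0]) : 0;
        long ms = timeSpawns(new String[] {"/bin/true"}, 1000, openFiles);
        System.out.println(openFiles + " open files: " + ms + " ms");
    }
}
```

Run it as "java SpawnBench 1000" to time 1000 spawns with 1000 extra
descriptors open. For the larger counts the open-file limit of the
process (ulimit -n) has to be high enough, otherwise opening the
scratch file fails before the timing starts.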

..Thomas

> On Tue, Sep 11, 2018 at 12:52 PM Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>>
>> Hi all,
>>
>> I wanted to gauge opinions on the following issue:
>>
>> Runtime.exec, on Linux, uses vfork(2) by default. It gives us better
>> performance compared with fork() and robustness in constrained memory
>> situations.
>>
>> But as we know vfork() can be dangerous if used incorrectly. In the
>> child process before exec'ing, we live in the memory of the parent
>> process. If we are not very careful we can influence or crash the
>> parent process.
>>
>> According to POSIX pretty much the only thing the child process is
>> allowed to do after vfork(2) is to exec(3) immediately; if that fails,
>> you must call _exit(2).
>>
>> http://pubs.opengroup.org/onlinepubs/009604599/functions/vfork.html
>>
>> However, in the openjdk we do a number of things beyond that:
>>
>> - stdin/stdout/stderr pipe handling
>> - closing all file descriptors
>> - we change the working directory
>> - we may actually modify errno manually
>> - in case exec fails, we communicate the error back to the parent using a pipe.
>>
>> This involves calling a number of libc functions beyond exec(), namely
>> read, close, dup2, opendir/readdir, write, chdir... It also needs a
>> bit of stack, since we assemble path names.
>>
>> --
>>
>> I was curious whether there were any real issues, so I tested (on
>> Ubuntu 16.04) and found:
>>
>> 1) A crash - any crash - in the child process before exec() will kill
>> the parent jvm dead. Weirdly enough, we do not even enter our error
>> handling, but seem to die instantly with the default "Segmentation
>> Fault".
>>
>> 2) Signals received by the child process before exec() influence the
>> parent process. For example:
>>  - SIGINT sent to the child ends both parent and child, immediately
>>  - SIGABRT aborts both child and parent
>>  - any error signal sent to the child leads to the behavior described in (1)
>>
>> 3) A stack overflow in the child before exec() also kills the parent.
>> Unsurprising, since guard page hit -> segfault -> see (1).
>>
>> 4) More amusingly, setting errno in the child before exec() changes
>> errno in the parent process as well.
>> But since errno is thread local and the thread in the parent process
>> is waiting in vfork() and will, upon return, not look at errno (after
>> all, vfork succeeded), this causes no trouble.
>>
>> There may be more issues, but these were the ones I tested.
>>
>> In all cases I counter-tested with fork() instead of vfork() and, as
>> expected, with fork() the parent process stays unaffected, as it
>> should be.
>>
>> -------------
>>
>> Whether you think these issues are worth solving is an open question.
>>
>> All these cases may happen in the wild (well, apart from
>> crash-by-programming-error if one assumes the program to be really bug
>> free) albeit with a very small probability. But once these bugs occur,
>> they can be very difficult to analyse. So fixing this may be
>> worthwhile.
>>
>> At SAP, we opted for robustness, so we changed the Runtime.exec()
>> implementation to deal with vfork() issues. Basically, we employ the
>> exec-twice technique:
>>
>> - in the child, after the vfork(), we immediately exec() into a little
>> bootstrap binary ("forkhelper").
>> - Now we are safe in the sense that we do not share memory with the
>> parent process anymore
>> - Then the parent process communicates with the child via pipes and
>> gives it all information needed to do the "real" exec: environ,
>> current dir, arguments...
>> - Now the child exec's a second time, this time into the real target binary.
>>
>> The point of this technique is that we minimize the window in the
>> child between vfork and the first exec. In fact, we are now fully
>> POSIX compliant. This solves the described pathological cases.
>>
>> It has some other advantages too, e.g. allowing for better error
>> handling and tracing in the Runtime.exec() area. Performance-wise it
>> depends: we exec twice, so we pay twice. However, since the parent
>> continues execution after the first exec, it spends less time waiting
>> on the child process, which can make a difference if there are many
>> file descriptors open.
>>
>> ---
>>
>> Checking opinions here. Do you think we are okay with our current
>> implementation or would a change as described above be welcome in the
>> OpenJDK too?
>>
>> Thanks, and Best Regards, Thomas
>
>
>
> --
> - DML

