Question about libjava/childproc.c

Sun Sep 9 00:34:03 UTC 2018

On Wed, Sep 5, 2018 at 11:43 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> On Wed, Sep 5, 2018 at 6:43 PM, Martin Buchholz <martinrb at google.com>
> wrote:
>
> > So before the readdir loop we should set errno to 0.  Then when readdir
> > returns NULL, we should check whether errno is non-zero.
> >
>
> Note that if we fix this, we may run into errors which were hidden
> before, namely if we really accidentally did close the file descriptor
> used by opendir(). Currently, even if some file descriptors remain
> unclosed, that may not necessarily lead to an error. But with your
> suggested error handling we would straight away fail.
>
> Therefore I would combine both fixes - add readdir error handling and
> also change to fcntl(FD_CLOEXEC).
>
>
I support this plan.  We expect most file descriptors to already have
the FD_CLOEXEC
flag set, so a useful optimization is to set the flag only when not already
set.

Probably the code should be conditional on defined(FD_CLOEXEC).

> > ---
> >
> > We do have a way of reporting errno to the parent.  See
> WhyCantJohnnyExec.
>
> Ah I see.
>
> Btw, do you know what happens when we modify errno in a vforked child
> process? Would that not modify errno in the parent process too? (errno
> is probably implemented as thread local variable, and since the parent
> thread sleeps until we exec this may still be okay, or?).
>
>
It's a good question.  vfork is typically underspecified, so it is
plausible that changing errno in the child affects the parent.  I think we
should simply make sure that the parent code is robust against this (as it
typically is - you check errno when a system API has already returned an
indication of an error)

> ---
>
> A bit more background, if you are interested:
>
> A long time ago, at SAP we swapped Oracle's implementation of
> Runtime.exec on Unix against a completely homegrown one. The reason
> was that we needed a really robust implementation for our customers,
> and with the then current implementation we kept running into
> limitations and pathological corner cases. Especially on outlier
> platforms like HPUX or AS/400.
>
> This was long before OpenJDK existed, so there was no way for us to
> just fix the coding upstream.

In 2005 I was the primary maintainer of the subprocess code at Sun, and
there is a good chance you could have gotten changes in if you had asked me!

> But now there is, so at regular
> intervals I keep eyeing our implementation to check if we could merge.
>
> The main difference between our version and the upstream one is: we
> switched to vfork() for the obvious performance reasons. But since
> vfork() is scary we implemented the exec-twice-trick with a little
> in-between binary. The main purpose was to minimize the window between
> vfork() and the first exec(), where we operate inside the parent
> process memory. Everything "dangerous" - calling libc functions,
> scanning the proc file system, error handling, tracing etc - is
> deferred to after the first exec. And since that window is minimal,
> that also reduces the chance of signals in the child process harming
> the parent.
>
>
See discussion in src/java.base/unix/native/libjava/ProcessImpl_md.c

> Later Oracle introduced something very similar with the jspawnhelper.
> But I see that jspawnhelper is not used at all for the vfork() case,
> just for posix_spawn(), yes?
>

Implemented by other folks.  I was mostly happy with vfork.  I forget the
motivation to introduce  jspawnhelper, but I'm sure it's in the fossil
record somewhere!