RFR(s): 8223777: In posix_spawn mode, failing to exec() jspawnhelper does not result in an error

Thomas Stüfe thomas.stuefe at gmail.com
Tue May 14 06:36:18 UTC 2019


Hi Roger, David,

second version:

Full:
http://cr.openjdk.java.net/~stuefe/webrevs/8223777-posix_spawn-no-exec-error/webrev.01/webrev/
Delta:
http://cr.openjdk.java.net/~stuefe/webrevs/8223777-posix_spawn-no-exec-error/webrev_delta.01/webrev/

Answering Rogers remarks here since that should cover Davids points too:

On Mon, May 13, 2019 at 8:44 PM Roger Riggs <Roger.Riggs at oracle.com> wrote:

> Hi Thomas,
>
> Some thoughts...
>
> Did you determine exactly how the failure of exec did not result in an
> IOException?
>

posix_spawn() succeeds. Child process is started successfully - so the
fork/vfork/clone/whatever posix_spawn() does internally goes thru
successfully. But child process dies immediately with exit code 127. This
is impossible to tell apart from cases where the target binary was launched
successfull but exited with 127. So there is no way to tell that exec()
failed. For the user it looks like Runtime.exec() was successful but his
program was very short lived.

See comments in code for more details.


> The change at 696 just prevents an exception from getting overwritten but
> not one that is not thrown or was cleared.
>

I changed that coding to return an adduitional static string in case of an
error, which gets appended to the IOException text. So, I restored the
original logic which only throws IOException at the topmost layer.


>
> An improved test for a successful spawnChild could be added to
> initialize resultPid to -1

and check that it has been overwritten with the pid.  If posix_spawn
> returned success,
> there would still be an indication that the process was or was not created.
>
>
 Done, but probably pointless. As I wrote, the child process does get
created, that is not the problem. The follow up exec fails.


> Can the function be named to indicate what is it checking?  For example,
> isHelperExecutable?
> It returns a boolean and it would be more conventional to stick to the C
> interpretation
> of 0 == true, anything else is not.
>

Yes, that is cleaner. All done.


> Requiring all three exec bits may be overkill and does not match the
> actual access test.
> It might be confusing if the permissions were set so Linux could exec
> it, but the Java runtime
> threw an exception because it had a stricter requirement (not documented).
>

David had the same concerns. Of course. Since we cannot safely predict
exec() success I rather have false negatives than false positives.

Both are bad but refusing to exec in situations where it could actually
work is worse.

Neither do I want to go down the rabbit hole and make the pre-exec test
more intricate, there was a reason Martin abandoned that approach in 2005.


> Since this is in the normal path and will very rarely fail, it is
> reasonable to ask
> about any potential performance impact.  Can the test for exec and more
> specific message
> be used only if the exec fails.
>

I dont know if exec fails, which is the whole problem :)

But I changed the coding to only do this test once. Since I find it very
unlikely that someone changes the permissions on the fly, this should be
okay.

I don't know, should it? Now I start to doubt. What do you guys think?


>
> The exception includes the literal "jspawnhelper" which would be
> misleading if it were
> changed and it might be clearer to mention 'exec' permission in the
> message.
>
>
Done. In the message I talk now about "spawn helper" and foresee many a
blog post explaining how to fix this :P


> Thanks, Roger
>

Thanks, Thomas


>
>
>
> On 05/13/2019 12:00 PM, Thomas Stüfe wrote:
> > Hi all,
> >
> > may I have please your opinions about the following change:
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8223777
> > webrev:
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/8223777-posix_spawn-no-exec-error/webrev.00/webrev/
> >
> > In short, since some weeks posix_spawn() is used on Linux to spawn
> childs.
> > But the glibc implementation of posix_spawn does not report errors back
> to
> > the caller if exec()ing the target binary fails.
> >
> > We only ever exec the jspawnhelper, which is part of the JDK and which
> then
> > exec()s the real target binary. The second exec() is under our control
> and
> > Martin did provide proper error reporting.
> >
> > However, the first exec(), performed inside posix_spawn(), may fail too.
> > Most posix_spawn implementations will tell us if that happens, not glibc
> > (it is cumbersome to implement, one needs a pipe and somesuch, and POSIX
> > leaves a bit of headroom for the implementor to cheap out, so they did.)
> >
> > So, if someone messes with the JDK installation and e.g. removes the
> > execute bits from jspawnhelper, that first exec() fails, but we get no
> > error. From the outside it looks like Runtime.exec() succeeded.
> >
> > This is impossible to fix completely - see Martin's comment about the
> > impossibility to foresee whether an exec() will succeed without actually
> > exec()ing. But at least we can test the execute permissions on the
> > jspawnhelper. Which this fix does. This fixes Ubuntu 16.4 (Now, I get an
> > IOException if jspawnhelper is not executable).
> >
> > Since I like to be super careful with runtime.exec coding, I will run
> this
> > through our nightlies and also would like feedback from the core libs
> > people. Also, I am not sure about the runtime costs of this hack. Seems a
> > bit excessive for something which is just a setup error (a correctly
> > installed JDK should never have this problem).
> >
> > As a side note, it may help if jspawnhelper would live in the bin folder
> of
> > the JDK (currently lives in lib), because everyone should be aware that
> > programs in bin need execute permissions...
> >
> > Cheers, Thomas
>
>


More information about the core-libs-dev mailing list