[BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be exited by java

David Holmes david.holmes at oracle.com
Thu Oct 26 02:12:34 UTC 2017


Filed:

https://bugs.openjdk.java.net/browse/JDK-8190187

"JVM options that cause the VM to "do X then exit" are incompatible with 
the JNI Invocation API"

David

On 26/10/2017 1:23 AM, Adam Farley8 wrote:
> Thanks David :)
> 
> Please let me know what the bug number is when you have time.
> 
> Everyone else: Any volunteers to champion this?
> 
> I'm on IRC 9-5, Monday to Friday, if anyone wants to discuss it.
> 
> Best Regards
> 
> Adam Farley
> 
> P.S. Here's the summary including David's addition, minus the chevrons:
> 
> 1) Exit(0) (during VM startup) is a big bug because it imitates 
> successful completion of external cpp code accessing JNI methods.
> 2) One solution for (1) is to specify a new return code for JNI.
> 3) The supplied code (diff) generates, facilitates, and handles that 
> return code for the exit(0) scenario: -agentlib:jdwp=help
> 4) The exit(0) problem (in general) is worth fixing, however we may 
> choose not to support the use of this new code in the jdwp example case.
> 5) The supplied test confirms that the supplied code works (run via 
> unzip, and then bash TestStart.sh <path to jdk home dir that contains 
> bin dir>).
> 6) To implement this new return code, plus the code that handles it, we 
> would need to follow the CSR process to modify the JNI spec.
> 7) To implement the jdwp scenario fix, if we choose to support it at
> all, we would also need to use the CSR process to modify the JVM TI spec.
> 8) To address all of the worst instances of exit(0), we would need to 
> search for exit(0) and raise a bug for each significant one (or group).
> 9) To solve (8) in one bug would be a lot of work, arguably too much 
> work for a single bug.
> 10) If the new return code is identified as the appropriate solution to 
> this problem, we will need to agree on the right name and return code 
> value.
> 
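Seen from the embedding C++ code, points 2, 3 and 10 might look roughly like the sketch below. This is not the supplied diff. The first two constants mirror JNI_OK and JNI_ERR from jni.h; JNI_SILENT_EXIT is only the placeholder name used in this thread, and the value -7 is invented here purely for illustration (name and value are both still to be agreed). create_vm_stub and classify are hypothetical helpers; the stub stands in for JNI_CreateJavaVM so the sketch compiles without a JDK.

```cpp
#include <string>
#include <vector>

// Mirrors of the existing jni.h return codes.
constexpr int kJniOk  = 0;   // JNI_OK
constexpr int kJniErr = -1;  // JNI_ERR

// HYPOTHETICAL: placeholder name from this thread; the value is invented
// for this sketch and is not part of any JNI specification.
constexpr int kJniSilentExit = -7;

enum class StartupResult { Running, CleanStop, Failed };

// HYPOTHETICAL stand-in for JNI_CreateJavaVM: instead of calling exit(0)
// for a help-style option, it reports the proposed code to the caller.
int create_vm_stub(const std::vector<std::string>& options) {
    for (const auto& opt : options) {
        if (opt == "-agentlib:jdwp=help") {
            return kJniSilentExit;
        }
    }
    return kJniOk;
}

// How a caller could tell "VM is up", "VM did its job and stopped cleanly"
// and "VM failed to start" apart, without its own process being terminated.
StartupResult classify(int rc) {
    if (rc == kJniOk) return StartupResult::Running;
    if (rc == kJniSilentExit) return StartupResult::CleanStop;
    return StartupResult::Failed;
}
```

The point of the third state is that the host application keeps control: it can skip its Java work and carry on, rather than vanishing with exit status 0.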
> Also, here's a shortlist of the main questions I recall being raised 
> here, plus answers people have given.
> 
> A) What are the potential solutions for the exit(0) problem?
> i) New JNI Return Code.
> ii) Remove the info-only options, at least via the JNI, and return a 
> standard error code if they are used.
> iii) Remove the info-only options, at least via the JNI, and filter them 
> out if they are used.
> iv) Retain existing behaviour, and document a need for the user to 
> filter out help options before starting the VM via the JNI.
> 
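Option (iv), filtering on the caller's side, could be sketched as follows. The option list is illustrative and certainly not exhaustive: -agentlib:jdwp=help, -Xshare:dump and -Xinternalversion are named in this thread, while -Xlog:help is an assumed spelling for the Xlog case. The function names are hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Illustrative, non-exhaustive list of "do X then exit" options.
// "-Xlog:help" is an assumption; the thread only mentions "Xlog".
bool is_exit_option(const std::string& opt) {
    static const std::vector<std::string> known = {
        "-agentlib:jdwp=help",
        "-Xshare:dump",
        "-Xinternalversion",
        "-Xlog:help",
    };
    return std::find(known.begin(), known.end(), opt) != known.end();
}

// Returns the options that are safe to hand to JNI_CreateJavaVM,
// preserving their original order.
std::vector<std::string> filter_exit_options(
        const std::vector<std::string>& options) {
    std::vector<std::string> kept;
    std::copy_if(options.begin(), options.end(), std::back_inserter(kept),
                 [](const std::string& o) { return !is_exit_option(o); });
    return kept;
}
```

A custom launcher would run its argument list through filter_exit_options before building the JavaVMOption array. The obvious weakness is that the list has to track every such option the VM and its agents ever add.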
> B) What are the criteria for the "Best" solution?
> i) It must prevent "exit(0)" calls.
> ii) It must be proven to work.
> iii) It should require minimal (none, ideally) behaviour change from the 
> java.exe user.
> iv) It should allow the external cpp code accessing the JNI to complete 
> normally, without being prematurely terminated.
> 
> C) Which solutions meet the (B) criteria?
> i) Both the "new return code" and the "remove info-only options" 
> solutions meet the (B) criteria.
> 
> D) Is it right to have any "Do X and then exit." arguments at all, and 
> (if no) what would be the alternatives?
> i) Given the VM is a loadable shared library the answer should be No. We
> should not have any "Do X and then exit" arguments. The alternatives
> would be a mix of:
>     - added Dcmds to "Do X"
>     - added launcher smarts to recognize options that need a Dcmd and
>       "then exit".
> 
> Note from David:
> I find the notion of "help" options problematic for a shared library.
> But I don't have a good answer for how else to document things.
> 
> That all said I think this ship has sailed and we're unlikely to want to
> invest the time and effort in trying to clean this up this way.
> 
> 
> 
> From: David Holmes <david.holmes at oracle.com>
> To: Adam Farley8 <adam.farley at uk.ibm.com>
> Cc: Alan Bateman <Alan.Bateman at oracle.com>, 
> core-libs-dev at openjdk.java.net, hotspot-runtime-dev at openjdk.java.net, 
> thomas.stuefe at gmail.com
> Date: 25/10/2017 13:53
> Subject: Re: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be 
> exited by java
> ------------------------------------------------------------------------
> 
> 
> 
> Hi Adam,
> 
> On 18/10/2017 9:17 PM, Adam Farley8 wrote:
>> Hi All,
>> 
>> Updated summary, based on the responses:
> 
> I think this is a good summary. Thanks.
> 
> I can file a bug for this, but it will take some work to see how to fit
> this into the existing specifications and file CSR requests for those
> changes. This won't make 18.3 (or whatever it gets called). You will
> need a "champion" to help flesh this out in full and work with you on
> those CSR requests. I can't volunteer to do that at this time.
> 
>> 1) Exit(0) (during VM startup) is a big bug because it imitates 
>> successful completion of external cpp code accessing JNI methods.
>> 2) One solution for (1) is to specify a new return code for JNI.
>> 3) The supplied code (diff) generates, facilitates, and handles that
>> return code for the exit(0) scenario: -agentlib:jdwp=help
>> 4) The exit(0) problem (in general) is worth fixing, however we may
>> choose not to support the use of this new code in the jdwp example case.
>> 5) The supplied test confirms that the supplied code works (run via
>> unzip, and then bash TestStart.sh <path to jdk home dir that contains
>> bin dir>).
>> 6) To implement this new return code, plus the code that handles it, we
>> would need to follow the CSR process to modify the JNI spec.
>> 7) To implement the jdwp scenario fix, if we choose to support it at
>> all, we would also need to use the CSR process to modify the JVM TI spec.
>> 8) To address all of the worst instances of exit(0), we would need to
>> search for exit(0) and raise a bug for each significant one (or group).
>> 9) To solve (8) in one bug would be a lot of work, arguably too much
>> work for a single bug.
>> 10) If the new return code is identified as the appropriate solution to
>> this problem, we will need to agree on the right name and return code
>> value.
>> 
>> Also, here's a shortlist of the main questions I recall being raised
>> here, plus answers people have given.
>> 
>> A) What are the potential solutions for the exit(0) problem?
>>      i) New JNI Return Code.
>>      ii) Remove the info-only options, at least via the JNI, and return
>> a standard error code if they are used.
>>      iii) Remove the info-only options, at least via the JNI, and filter
>> them out if they are used.
>>      iv) Retain existing behaviour, and document a need for the user to
>> filter out help options before starting the VM via the JNI.
>> 
>> B) What are the criteria for the "Best" solution?
>>      i) It must prevent "exit(0)" calls.
>>      ii) It must be proven to work.
>>      iii) It should require minimal (none, ideally) behaviour change
>> from the java.exe user.
>>      iv) It should allow the external cpp code accessing the JNI to
>> complete normally, without being prematurely terminated.
>> 
>> C) Which solutions meet the (B) criteria?
>>      i) Both the "new return code" and the "remove info-only options"
>> solutions meet the (B) criteria.
>> 
>> D) Is it right to have any "Do X and then exit." arguments at all, and
>> (if no) what would be the alternatives?
>>     i) ?
> 
> Given the VM is a loadable shared library the answer should be No. We
> should not have any "Do X and then exit" arguments. The alternatives
> would be a mix of:
> - added Dcmds to "Do X"
> - added launcher smarts to recognize options that need a Dcmd and "then
> exit".
> 
> I find the notion of "help" options problematic for a shared library.
> But I don't have a good answer for how else to document things.
> 
> That all said I think this ship has sailed and we're unlikely to want to
> invest the time and effort in trying to clean this up this way.
> 
> Thanks,
> David
> 
>> Best Regards
>> 
>> Adam Farley
>> 
>> P.S. As per Alan and David's emails, the exit(#) references have been
>> removed entirely, as discussing them alongside the original exit(0)
>> problem risks scope creep.
>> 
>> This bug, if raised, should only cover the exit(0) cases. I believe we
>> have consensus here.
>> 
>> 
>> 
>> From: David Holmes <david.holmes at oracle.com>
>> To: Adam Farley8 <adam.farley at uk.ibm.com>, Alan Bateman 
>> <Alan.Bateman at oracle.com>, core-libs-dev at openjdk.java.net, 
>> hotspot-runtime-dev at openjdk.java.net, thomas.stuefe at gmail.com
>> Date: 13/10/2017 14:31
>> Subject: Re: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be
>> exited by java
>> ------------------------------------------------------------------------
>> 
>> 
>> 
>> Hi Adam,
>> 
>> On 13/10/2017 10:16 PM, Adam Farley8 wrote:
>>  > Hi All,
>>  >
>>  > Here's a summary of the email below (which is intended, partly, as a
>>  > summary of the emails before it).
>>  >
>>  > Let me know if you agree/disagree with any of these points.
>>  >
>>  > 1) Exit(#) during vm startup is a bug because it should return a code
>>  > regardless of the state of the VM.
>> 
>> Yes it's a bug but not one that is likely to be addressed in any
>> foreseeable timeframe. There are simply too many "exit on error" paths.
>> If we were to start using C++ exceptions within the VM that might
>> provide a way to quickly get back to the CreateJavaVM routine where we
>> could return an error code - but that is itself a major project that has
>> barely even been discussed AFAIK. (Compiler folk have talked about it
>> because compiler paths are fairly self-contained - though that was
>> before Graal and AOT.)
>> 
>>  > 2) Exit(0) is an *especially* big bug because it imitates successful
>>  > completion of external cpp code accessing JNI methods.
>>  > 3) One solution is to specify a new return code for JNI.
>> 
>> A solution for 2) yes.
>> 
>>  > 4) The supplied code (diff) generates, facilitates, and handles that
>>  > return code for the exit(0) scenario: -agentlib:jdwp=help
>>  > 5) The supplied test confirms that the supplied code works (run via
>>  > unzip, and then bash TestStart.sh <path to jdk home dir that contains
>>  > bin dir>).
>>  > 6) To implement this new return code, plus the code that handles it, we
>>  > would need to follow the CSR process.
>>  > 7) To implement the fix for the scenario used as an example of the new
>>  > return code's use, we would need to modify the JVM TI spec.
>> 
>> Yes you have demonstrated a potential solution for the agent case. The
>> question is, is it the right solution? Is it a worthwhile solution? (As
>> I've said I'd prefer not to have any "do something then exit" VM
>> arguments.) And can we make it fit into the existing specs without
>> contorting things too much?
>> 
>> I still think it easier/preferable for whatever loads the VM to filter
>> out the VM args that trigger this behaviour. I mean if you pass
>> -Xshare:dump you don't have any right to expect a functioning VM after
>> JNI_CreateJavaVM returns - at least I don't think so. Just don't do it.
>> Yes it is an imperfection in the invocation API, but life isn't perfect.
>> 
>>  > 8) To address all of the worst instances of exit(#), we would need to
>>  > search for exit(#) and raise a bug for each significant one (or group).
>>  > 9) To solve (8) in one bug would be a lot of work, arguably too much
>>  > work for a single bug.
>> 
>> This is simply impractical. You may be able to pick off a few
>> low-hanging cases, but that won't really make any practical difference.
>> 
>>  > 10) If the new return code is chosen as the appropriate solution to this
>>  > problem, we may need to choose a better name for the return code.
>>  >
>>  > Is this a fair assessment of the current state of the debate?
>> 
>> It's a fair summary of your position and proposal.
>> 
>> Cheers,
>> David
>> 
>>  > I'm on IRC every weekday from 9am-5pm (4pm Fridays) BST (GMT+1) if
>>  > anyone wants to discuss this in real-time on the openjdk channel.
>>  >
>>  > Best Regards
>>  >
>>  > Adam Farley
>>  >
>>  >
>>  >
>>  > -- Previous Email --
>>  >
>>  > Hi David, Alan,
>>  >
>>  > You are right in that the changes to HotSpot would be nontrivial.
>>  > I see a number of places in (e.g.) arguments.cpp that seem to
>>  > exit in the same manner as Xlog (such as -Xinternalversion).
>>  >
>>  > I would advise ploughing through the CSR process to alter the
>>  > JNI spec, and simultaneously identify some key paths that can
>>  > be raised as bugs. That way, when people have time to address
>>  > these issues, the mechanism to handle a silent exit is already
>>  > in place.
>>  >
>>  > The JDWP fix can be raised separately as one of these bugs, if
>>  > it would make things simpler.
>>  >
>>  > As for the name, JNI_SILENT_EXIT is a placeholder, and can be
>>  > readily changed. Do you have any suggestions?
>>  >
>>  > Lastly, in an ideal world, the VM initialisation should never exit(#). It
>>  > should return a return code that tells the caller something, pass or
>>  > fail, messy or tidy. That way, if someone is using the JNI as part of
>>  > something bigger (like a database or a web server), one of these
>>  > scenarios is just a bug, rather than a world-ender like exit(#).
>>  >
>>  > And now for the individual messages. :)
>>  >
>>  > David: Having help data returned by the launcher seems like a
>>  > good way to avoid exit(0) calls, but I'm not sure how we'd prevent
>>  > a JNI-caller using those options. Ultimately, to be sure, we'd have
>>  > to remove the logic for those options, centralise the data to better
>>  > enable launcher access, and add some logic in there so it can find
>>  > any other help data (e.g. from the jdwp agent library). I feel this would
>>  > be a bigger task than adding the new return code and changing the
>>  > vm, plus it wouldn't provide for any non-help scenarios where the
>>  > vm wants to shut down without error during initialisation.
>>  >
>>  > Alan: I should mention that the silent exit solution is already in
>>  > use in the OpenJ9 VM. Not all of the exit paths have been
>>  > resolved, but many have.
>>  >
>>  > The code is open and can be found here:
>>  > https://github.com/eclipse/openj9
>>  >
>>  > And though the silent exit code is disabled for the time being, it
>>  > can be re-enabled by opening this file:
>>  >
>>  > runtime/vm/jvminit.c
>>  >
>>  > and altering line 2343 (Ctrl-F for exit(1) if it's not there).
>>  >
>>  > I won't paste the full code here in case people are concerned
>>  > about contamination, but I would assert that this code (and the
>>  > associated vm files) proves that the concept is possible.
>>  >
>>  > Note that that code should not be enabled until after we've
>>  > integrated the code that can handle a silent exit.
>>  >
>>  > Best Regards
>>  >
>>  > Adam Farley
>>  >
>>  > P.S. Thank you both for your efforts on this. :)
>>  >
>>  >
>>  >
>>  > From: David Holmes <david.holmes at oracle.com>
>>  > To: Alan Bateman <Alan.Bateman at oracle.com>, Adam Farley8
>>  > <adam.farley at uk.ibm.com>, core-libs-dev at openjdk.java.net,
>>  > hotspot-runtime-dev at openjdk.java.net, thomas.stuefe at gmail.com
>>  > Date: 15/09/2017 12:03
>>  > Subject: Re: [BUG PROPOSAL]: C++ code that calls JNI_CreateJavaVM can be
>>  > exited by java
>>  > ------------------------------------------------------------------------
>>  >
>>  >
>>  >
>>  >
>>  >
>>  > On 15/09/2017 8:17 PM, Alan Bateman wrote:
>>  >  > On 15/09/2017 02:47, David Holmes wrote:
>>  >  >> Hi Adam,
>>  >  >>
>>  >  >> I am still very much torn over this one. I think the idea of
>>  >  >> print-and-exit flags for a potentially hosted library like the JVM is
>>  >  >> just wrong - we should never have done that, but we did. Fixing that
>>  >  >> by moving the flags to the launcher is far from trivial**. Endorsing
>>  >  >> and encouraging these sorts of flag by adding JNI support seems to be
>>  >  >> sending the wrong message.
>>  >  >>
>>  >  >> ** I can envisage a "help xxx" Dcmd that can read back the info from
>>  >  >> the VM. The launcher can send the Dcmd, print the output and exit. The
>>  >  >> launcher would not need to know what the xxx values mean, but would
>>  >  >> have to intercept the existing ones.
>>  >  >>
>>  >  >> Another option is just to be aware of these flags (are there more than
>>  >  >> jdwp and Xlog?) and deal with them specially in your custom launcher -
>>  >  >> either filter them out and ignore them, or else launch the VM in its
>>  >  >> own process to respond to them.
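The "launch the VM in its own process" alternative can be pictured with a small POSIX-only sketch. Nothing here is JVM-specific: run_isolated is a hypothetical helper, and the callable passed to it stands in for "load the VM and call JNI_CreateJavaVM with the print-and-exit option". It only shows that an exit() inside the child leaves the parent running.

```cpp
#include <cstdlib>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `work` in a forked child process (POSIX only).
// Returns the child's exit status, or -1 if fork/waitpid fails or the
// child terminates abnormally (for example, killed by a signal).
int run_isolated(const std::function<void()>& work) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: `work` may call exit() itself, just as a print-and-exit
        // VM option would. If it returns, end the child without running
        // the parent's atexit handlers a second time.
        work();
        _exit(0);
    }
    // Parent: wait for the child and report how it ended.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For instance, run_isolated([] { std::exit(0); }) returns to the caller with the child's status while the calling process carries on. The cost is an extra process per such option, and forking a multi-threaded host application has its own hazards, so this is a sketch of the idea rather than a recommendation.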
>>  >  >>
>>  >  >> Any changes to the JNI specification need to go through the CSR process.
>>  >  > Yes, it would require an update to the JNI spec, also a change to the
>>  >  > JVM TI spec where Agent_OnLoad returning a non-0 value is specified to
>>  >  > terminate the VM. The name and value need discussion too, esp. as the
>>  >  > JNI spec uses negative values for failure.
>>  >  >
>>  >  > In any case, I'm also torn over this one as it's a corner case that is
>>  >  > only interesting for custom launchers that load agents with options that
>>  >  > print usage messages. It wouldn't be hard to have the Agent_OnLoad
>>  >  > specify a printf hook that the agent could use for output although there
>>  >  > are complications with agents such as JDWP that also announce their
>>  >  > transport end point. Beyond that there is still the issue of the custom
>>  >  > launcher that would need to know to destroy the VM without reporting an
>>  >  > error.
>>  >  >
>>  >  > So what happened to the more meaty part to this which is fixing the
>>  >  > various cases in HotSpot that terminate the process during
>>  >  > initialization? I would expect some progress could be made on those
>>  >  > cases while trying to decide whether to rev the JNI and JVM TI specs to
>>  >  > cover the help case.
>>  >
>>  > Trying to eliminate the vm_exit_during_initialization paths in hotspot
>>  > is a huge undertaking IMHO.
>>  >
>>  > David
>>  >
>>  >  >
>>  >  > -Alan
>>  >
>>  > Unless stated otherwise above:
>>  > IBM United Kingdom Limited - Registered in England and Wales with number
>>  > 741598.
>>  > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>> 
> 

