Why no hs-err file on CheckJNI?

Wed Sep 1 02:35:22 UTC 2021

On 8/25/21 9:32 PM, Thomas Stüfe wrote:
> On Wed, Aug 25, 2021 at 9:28 AM David Holmes <david.holmes at oracle.com>
> wrote:
>
>> On 25/08/2021 4:04 pm, Thomas Stüfe wrote:
>>> Hi David,
>>>
>>> thank you for looking at this. Answers below.
>>>
>>> On Tue, Aug 24, 2021 at 9:38 AM David Holmes <david.holmes at oracle.com>
>>> wrote:
>>>
>>>      Hi Thomas,
>>>
>>>      On 24/08/2021 12:27 am, Thomas Stüfe wrote:
>>>       > Hi,
>>>       >
>>>       > when we specify CheckJNI or CheckJNICalls and we catch an error (e.g. a
>>>       > memory overwriter), we write a short report, then abort. See:
>>>       >
>>>       >
>>>       > https://github.com/openjdk/jdk/blob/594e5161b48382d61509b4969bc8f52c3c076452/src/hotspot/share/prims/jniCheck.hpp#L36-L41
>>>       >
>>>       > This was introduced in 2008 with JDK-6739363 "Xcheck jni doesn't check
>>>       > native function arguments". I could find no discussion about this in
>>>       > the mailing list archives.
>>>
>>>      There have been a number of updates to Xcheck:jni since then, and in 17 I
>>>      documented the different kinds of checks and their behaviour in more
>>>      detail (JDK-8260194):
>>>
>>>
>>>      https://htmlpreview.github.io/?https://github.com/openjdk/jdk17/blob/4f336dd3985b654dc3fbacabdcfccf590ea918e5/java.html
>>>
>>>
>>> Nice and interesting. It does not mention buffer overruns, though.
>>
>> Do we detect buffer overruns? I looked at all the jniCheck functions to
>> see what things we checked for and thought I had found them all. :(
>>
>>>       > Does anyone know why we don't write a normal hs-err file in this case?
>>>
>>>      Because the intent is to mimic throwing an exception and exiting, and it
>>>      is not a "hotspot error"; it is an application error.
>>>
>>>       > Would anyone care if we did? We do so in similar cases, e.g. if os::free()
>>>       > catches an overwrite.
>>>
>>>      os::free() is capturing an internal hotspot programming error, not an
>>>      error in user code.
>>>
>>>
>>> Is this mainly a support issue for you? Meaning, the existence of an
>>> hs-err file would indicate a hotspot error, and third-party JNI errors
>>> would be erroneously assigned to the hotspot group's support queue? If so,
>>> I can understand that, though that separation has a lot of holes in
>>> practice (it's very easy to make hotspot crash from third-party code).
>>>
>>> Technically, an hs-err file would be useful even if most of the hotspot
>>> internals are irrelevant for a JNI bug. The file contains a lot of
>>> valuable context.
>>
>> I just don't think a "hotspot error file" is a reasonable or necessary
>> response to detecting a JNI error in application code. A stacktrace
>> should suffice for the vast majority of errors detected.
>>
>>>      You would need to rework the header error messages etc. and remove the
>>>      bug reporting stuff so that the user doesn't think it is an error in the
>>>      VM itself. Overall I don't see the need to do it, as the main thing is
>>>      the stacktrace to see where the bad JNI usage occurred - and as I said,
>>>      this isn't a VM error.
>>>
>>>      It might also introduce compatibility issues for anyone who runs testing
>>>      with -Xcheck:jni and doesn't expect to get the hs_err file - though if
>>>      you keep the current output but also produce a modified hs_err file, that
>>>      may be okay. But I still question why you would need this?
>>>
>>>
>>> I am currently investigating a buffer overrun at a client caused in
>>> ReleaseByteArrayElements. An hs-err file would definitely have been useful.
>>
>> I need more info on this case. If the overrun was detected when it
>> happened then I would hope a stacktrace would suffice to show the errant
>> code. And I'm not clear how a hs_err file would help. ??
>>
>>
> An hs-err file would give you context information beyond the stack, e.g. VM
> and program arguments, runtime, memory content at register/stack addresses
> (may contain the broken block) etc.
>
> An hs-err file is also a clear sign of a fatal error, while output to
> stderr may be accidentally ignored. It depends on how well versed your
> first level support and your customers are.
>
> But I don't like pressing this. I understand your reasons for not writing
> an hs-err file. We may change this downstream only; the patch would be
> really trivial.

What about a new option like -XX:+LogJNIChecks which writes a 
jni_check_log_<pid>.log file?

This way you get more information about the check, and it's clear that 
this is not a hotspot error.

Thanks

- Ioi
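
As a rough illustration of the kind of context discussed above (VM arguments,
runtime), here is a minimal Java sketch that gathers a few such facts from
inside the application and writes them to a per-pid file. It is not how
HotSpot produces hs-err files, and it is not the proposed -XX:+LogJNIChecks
option, which exists only as a suggestion in this thread. The class name
JniContextDump, the file name pattern jni_context_<pid>.log and the chosen
fields are all assumptions made for the example. It needs Java 9 or later
(ProcessHandle).

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK: collects a little process
// context that an application could log next to a -Xcheck:jni report.
class JniContextDump {

    // Builds five lines of context from standard management APIs.
    static List<String> collectContext() {
        RuntimeMXBean rt = ManagementFactory.getRuntimeMXBean();
        List<String> jvmArgs = rt.getInputArguments();

        // True if a -Xcheck:jni argument appears among the JVM input arguments.
        boolean checkJni = jvmArgs.stream().anyMatch(a -> a.startsWith("-Xcheck:jni"));

        List<String> lines = new ArrayList<>();
        lines.add("pid: " + ProcessHandle.current().pid());
        lines.add("vm: " + rt.getVmName() + " " + rt.getVmVersion());
        lines.add("vm args: " + jvmArgs);
        lines.add("uptime ms: " + rt.getUptime());
        lines.add("Xcheck:jni in vm args: " + checkJni);
        return lines;
    }

    // Writes the context to <dir>/jni_context_<pid>.log (name is an assumption).
    static Path writeContext(Path dir) throws IOException {
        Path file = dir.resolve("jni_context_" + ProcessHandle.current().pid() + ".log");
        return Files.write(file, collectContext());
    }

    public static void main(String[] args) throws IOException {
        System.out.println("wrote " + writeContext(Path.of(".")));
    }
}
```

Running it with `java -Xcheck:jni JniContextDump.java` should record the flag
in the "vm args" line. Unlike an hs-err file, this captures no native stack,
registers or memory contents, so it only covers the "VM and program
arguments, runtime" part of the context mentioned earlier.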