RFR 8027434: "-XX:OnOutOfMemoryError" uses fork instead of vfork

Thu Sep 27 19:31:28 UTC 2018

Hi Florian,

On Thu, Sep 27, 2018 at 9:08 PM, Florian Weimer <fw at deneb.enyo.de> wrote:
> * David Holmes:
>
>> I think you may be missing my point. We take a signal that will
>> terminate the VM, and from the signal handler context start the error
>> reporting and as part of that try to fork_and_exec the onError command
>> requested by the user. My recollection from reading all this stuff is
>> that vfork may not be safe to call from a signal context.
>
> On GNU/Linux, fork is not safe, either:
>
>   <https://access.redhat.com/articles/2921161>
>
> This is particularly visible if you've got a SIGSEGV handler that is
> called from malloc after heap corruption.  Then fork will most likely
> hang due to a self-deadlock.

Yikes, I never thought about fork handlers.

Incidentally, we have a kind-of-self-healing mechanism in error
reporting: reporting is done in steps, and if one step takes too
long it gets interrupted via a signal and we continue with the next
step. This mechanism does not yet cover onError, but it could be made
to do so. I wonder whether that would be useful.

(Is onError even used much?)

>
> vfork might actually be safer in this context because it does not run
> any fork handlers, but only if you take the required measures to use
> it safely.  On the other hand, the rest of the process will keep
> running, so you don't get the snapshot functionality that comes from
> fork.
>
> In my opinion (reflected in the cited article), crash handlers should
> live outside the process, at least on Linux.

I do not think anyone uses this mechanism to get cores, do they? The VM
will print an hs-err file, then abort with core. Usually that core is
still "close enough" to the crash cause to be useful - especially
since the error handler never unwinds the stack.

Core files are just not practical, at least not for us (SAP). We rely
heavily on hs-err files in our daily support life, and those are good
enough for us to analyse the majority of issues with a minimum of
fuss.

>
>> That's why I suggested we only switch this for the OnOutOfMemoryError
>> case as it's not (normally?) executed from a signal context.
>
> If it's a native OnOutOfMemoryError (is there such a thing?), then
> both vfork and fork are likely to fail themselves (but fork is more
> likely to do so).
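
For reference, a minimal sketch of what a vfork-based fork_and_exec
could look like. This is not the proposed patch, just the general
pattern under the usual vfork constraints: everything is prepared
before vfork, and the child only calls execve or _exit. The function
name run_cmd_vfork and the use of /bin/sh are assumptions made for
the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run cmd through /bin/sh using vfork + execve.
 * Returns the command's exit status, or -1 on failure. */
int run_cmd_vfork(const char *cmd) {
    /* Build argv before vfork: the child shares our address space. */
    char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
    int status;
    pid_t pid = vfork();

    if (pid < 0) {
        return -1;                  /* vfork itself failed */
    }
    if (pid == 0) {
        /* Child: only execve or _exit, nothing that touches parent state. */
        execve("/bin/sh", argv, environ);
        _exit(127);                 /* exec failed; never call exit() here */
    }

    /* Parent: resumes once the child has exec'ed or exited. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_cmd_vfork("exit 3") returns 3. No fork handlers run
on this path, which is the property Florian mentions above.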

..Thomas