RFR: JDK-8155004: CrashOnOutOfMemoryError doesn't work for OOM caused by inability to create threads

Yasumasa Suenaga ysuenaga at openjdk.java.net
Tue Apr 20 23:56:06 UTC 2021


On Tue, 20 Apr 2021 14:50:21 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Greetings,
>> 
>> This is an old issue and I'd like to fix it. If we fail to create a Java thread due to platform limitations, we throw an OOM, but we fail to honor the various xxxOnOutOfMemoryError switches.
>> 
>> The fix is very straightforward. 
>> 
>> It fixes
>> - CrashOnOutOfMemoryError
>> - ExitOnOutOfMemoryError
>> - HeapDumpOnOutOfMemoryError
>> - and, in theory, "OnOutOfMemoryError=<user command>".
>> 
>> The latter only in theory, because most of the time whatever prevented the thread from starting will also prevent the fork needed to run the user command.
>> 
>> One remaining question, maybe for a future RFE, is how we want to handle native thread creation errors. AFAICS, failing to create a native thread currently may result in a fatal shutdown, in log output, or in the error being ignored, depending on the thread.
>> 
>> If `...OnOutOfMemoryError` is specified, should native thread creation failure be handled the same way as a java thread?
>> 
>> - No, if I take the option name literally, since there is no OOM involved.
>> - Yes, if I take into account what these switches are actually used for: analysis or quick shutdown of a JVM inside a container in case of an OOM. After all, it is completely random which thread is hit by the limit.
>> 
>> Thanks, Thomas
>
> p.s. I do not understand your quotes around "bug", nor your hesitance to fix this. I have seen quite a number of desperate setups customers have around crashy or OOMy JVMs, to analyse and to restart quickly. Anything in that area we can do helps. 
> 
> One example, take a look at the way CloudFoundry handles OOMs in a JVM. They have a JVMTI agent hooked up to resource-exhausted, then attempt to run analysis code (written in Java too) to dump the heap. This is so fragile.
> 
> xxxOnOutOfMemoryError could really help in these scenarios, if it functioned reliably.

I mostly agree with @tstuefe, and I give +1 to this PR. I have also seen OOMs caused by native thread creation failures; in those cases I advised watching the log so that the system could be rebooted quickly.
(I've provided HeapStats to my customers for this. It is a JVMTI agent that works in a similar way to CloudFoundry's agent.)

However, I would like to know why the JVM ignores it now. Is it because the JVM cannot work correctly if the error is caused by ENOMEM?
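
For reference, here is a minimal sketch of how this failure appears to Java code: `Thread.start()` throws an `OutOfMemoryError` when the platform refuses to create the thread. The class and method names are hypothetical, and the message check assumes HotSpot's "unable to create ... native thread" wording, which is not a specified API and differs slightly between releases. It is not taken from the PR.

```java
// Hypothetical helper, for illustration only; not part of the PR.
class ThreadOomProbe {

    // Assumption: HotSpot reports thread creation failure with a message
    // containing "unable to create" and "native thread". The exact text is
    // an implementation detail and has changed between JDK releases.
    static boolean looksLikeThreadCreationOom(OutOfMemoryError e) {
        String msg = e.getMessage();
        return msg != null
                && msg.contains("unable to create")
                && msg.contains("native thread");
    }

    // Tries to start a thread. Returns false if the platform could not
    // create it; any other OutOfMemoryError is rethrown unchanged.
    static boolean tryStart(Runnable task) {
        Thread t = new Thread(task, "probe-worker");
        try {
            t.start();
            return true;
        } catch (OutOfMemoryError e) {
            if (looksLikeThreadCreationOom(e)) {
                System.err.println("Thread creation failed: " + e.getMessage());
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        boolean started = tryStart(() -> System.out.println("worker ran"));
        System.out.println("started = " + started);
    }
}
```

Application-level handling like this only helps code that happens to catch the error, which is why having the VM honor the `...OnOutOfMemoryError` switches for this case is attractive.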

-------------

PR: https://git.openjdk.java.net/jdk/pull/3586


More information about the hotspot-dev mailing list