RFR: JDK-8155004: CrashOnOutOfMemoryError doesn't work for OOM caused by inability to create threads
Thomas Stuefe
stuefe at openjdk.java.net
Wed Apr 21 04:19:03 UTC 2021
On Tue, 20 Apr 2021 14:50:21 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Greetings,
>>
>> this is an old issue and I'd like to fix it. If we fail to create a Java thread due to platform limitations, we throw an OOM, but we fail to honor the various xxxOnOutOfMemoryError switches.
>>
>> The fix is very straightforward.
>>
>> It fixes
>> - CrashOnOutOfMemoryError
>> - ExitOnOutOfMemoryError
>> - HeapDumpOnOutOfMemoryError
>> - and, in theory, "OnOutOfMemoryError=<user command>".
>>
>> The latter only in theory, because most of the time whatever prevented the thread from starting up will also prevent the fork needed to get the user command running.
>>
>> One remaining question, maybe for a future RFE, is how we want to handle native thread creation errors. AFAICS, failing to create a native thread currently may or may not result in a fatal shutdown, a log output, or just be ignored, depending on the thread.
>>
>> If `...OnOutOfMemoryError` is specified, should native thread creation failure be handled the same way as for a Java thread?
>>
>> - No, if I take the option name literally, since there is no OOM involved.
>> - Yes, if I take into account what these switches are actually used for: analysis or quick shutdown of a JVM inside a container in case of an OOM. After all, it is completely random which thread is hit by the limit.
>>
>> Thanks, Thomas
>
> p.s. I do not understand your quotes around "bug", nor your hesitance to fix this. I have seen quite a number of desperate setups customers have around crashy or OOMy JVMs, to analyse and to restart quickly. Anything in that area we can do helps.
>
> One example, take a look at the way CloudFoundry handles OOMs in a JVM. They have a JVMTI agent hooked up to resource-exhausted, then attempt to run analysis code (written in Java too) to dump the heap. This is so fragile.
>
> xxxOnOutOfMemoryError could really help in these scenarios, if it functioned reliably.
> Mostly agree with @tstuefe. I give +1 to this PR. I have also seen OOMs due to native thread creation, and in those cases I advised watching the log so that the system could be rebooted quickly.
> (I've provided HeapStats to my customers to do this; it is a JVMTI agent that works in a similar way to CloudFoundry's agent.)
>
> However, I want to know why the JVM ignores it now. Is it because the JVM cannot work correctly if the error is caused by ENOMEM?
It's simply not implemented. The ...OnOutOfMemoryError handling needs to be added to every place where an OOM is thrown, and it is currently missing from a couple of them.
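
For anyone who wants to observe the case discussed here from the Java side, the following is a minimal sketch, not the patch itself. The class `ThreadOomProbe` and its methods are hypothetical names made up for illustration. It assumes that the OOM thrown from `Thread.start()` carries a message containing "unable to create" and "native thread"; the exact wording differs between JDK versions, so treat the string match as an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper, for illustration only; not part of the JDK or of this PR.
final class ThreadOomProbe {

    // Assumption: the thread-creation OOM message contains both fragments below.
    // The exact wording varies across JDK versions.
    static boolean isThreadCreationOom(OutOfMemoryError e) {
        String msg = e.getMessage();
        return msg != null
                && msg.contains("unable to create")
                && msg.contains("native thread");
    }

    // Starts up to maxThreads parked daemon threads and returns how many were
    // started before Thread.start() failed or the cap was reached.
    static int startThreads(int maxThreads) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> started = new ArrayList<>();
        try {
            for (int i = 0; i < maxThreads; i++) {
                Thread t = new Thread(() -> {
                    try {
                        release.await(); // park until the probe is done
                    } catch (InterruptedException ignored) {
                        // nothing to clean up; just let the thread end
                    }
                });
                t.setDaemon(true);
                t.start(); // this is where the thread-creation OOM would surface
                started.add(t);
            }
        } catch (OutOfMemoryError e) {
            if (!isThreadCreationOom(e)) {
                throw e; // some other kind of OOM, e.g. heap exhaustion
            }
        } finally {
            release.countDown(); // let all parked threads finish
        }
        return started.size();
    }
}
```

Calling `ThreadOomProbe.startThreads(n)` with a large `n` under a low process or thread limit, with a switch such as `-XX:+CrashOnOutOfMemoryError` set, should show whether the switch is honored for this kind of OOM. Whether the limit is actually reached depends on the platform and its configuration.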
-------------
PR: https://git.openjdk.java.net/jdk/pull/3586