RFC: call report_java_out_of_memory_error() for -XX:AbortVMOnException=java.lang.OutOfMemoryError
Volker Simonis
volker.simonis at gmail.com
Wed Sep 8 17:35:49 UTC 2021
I'm not sure if running a jcmd process which attaches to the dying VM
as part of the OnError scripts is a use case we really want to
support?
There's a reason why the VM is crashing and attaching to this dying VM
will most probably only cause other follow-up errors.
On Wed, Sep 8, 2021 at 7:22 PM Liu, Xin <xxinliu at amazon.com> wrote:
>
> Hi, David,
>
> Thanks for the heads-up. Yes, it works for me.
>
> There's one more thing. One drawback is that the script provided to
> OnError can't attach back to HotSpot itself, or we end up with a deadlock.
>
>
> If we use 'jcmd %p Thread.print' or 'jcmd %p GC.heap_dump <file>' in
> OnError= (%p expands to the pid of the java process itself), the main
> java thread, which is waiting in os::fork_and_exec(cmd), prevents HotSpot
> from reaching a safepoint. It's a deadlock: jcmd needs the safepoint to
> finish, and fork_and_exec can't return until jcmd finishes.
>
> For example:
> $java -Xmx50m -XX:AbortVMOnException=java.lang.OutOfMemoryError
> -XX:OnError='jcmd %p Thread.print' -XX:+SafepointTimeout OomDumpExample
> direct
> # To suppress the following error report, specify this argument
> # after -XX: or in .hotspotrc: SuppressErrorAt=/exceptions.cpp:541
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error
> (/home/xxinliu/Devel/jdk/src/hotspot/share/utilities/exceptions.cpp:541), pid=107552,
> tid=107553
> # fatal error: Saw java.lang.OutOfMemoryError, aborting
> #
> # JRE version: OpenJDK Runtime Environment (18.0) (slowdebug build
> 18-internal+0-adhoc.xxinliu.jdk)
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug
> 18-internal+0-adhoc.xxinliu.jdk, mixed mode, tiered, compressed oops,
> compressed class ptrs, g1 gc, linux-amd64)
> # Problematic frame:
> # V [libjvm.so+0x924e8c] Exceptions::debug_check_abort(char const*,
> char const*)+0x8a
> #
> # No core dump will be written. Core dumps have been disabled. To enable
> core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /local/home/xxinliu/JDK-2085/hs_err_pid107552.log
> #
> # If you would like to submit a bug report, please visit:
> # https://bugreport.java.com/bugreport/crash.jsp
> #
> #
> # -XX:OnError="jcmd %p Thread.print"
> # Executing /bin/sh -c "jcmd 107552 Thread.print" ...
> 107552:
> [13.045s][warning][safepoint]
> [13.045s][warning][safepoint] # SafepointSynchronize::begin: Timeout
> detected:
> [13.045s][warning][safepoint] # SafepointSynchronize::begin: Timed out
> while spinning to reach a safepoint.
> [13.045s][warning][safepoint] # SafepointSynchronize::begin: Threads
> which did not reach the safepoint:
> [13.045s][warning][safepoint] # "main" #1 prio=5 os_prio=0 cpu=1552.12ms
> elapsed=13.04s tid=0x00007f43600278e0 nid=107553 runnable
> [0x00007f4369d9f000]
> [13.045s][warning][safepoint] java.lang.Thread.State: RUNNABLE
> [13.045s][warning][safepoint] Thread: 0x00007f43600278e0 [0x1a421]
> State: _running _at_poll_safepoint 0
> [13.045s][warning][safepoint] JavaThread state: _thread_in_vm
> [13.045s][warning][safepoint]
> [13.045s][warning][safepoint] # SafepointSynchronize::begin: (End of list)
>
>
> I haven't figured out how yet, but I think I can lift this constraint.
> Once I do, OnError will have more freedom to dump threads or the heap
> before dying. Can I file a bug for this?
>
> thanks,
> --lx
>
>
> On 8/30/21 9:26 PM, David Holmes wrote:
> >
> > Hi,
> >
> > On 28/08/2021 4:54 am, Liu, Xin wrote:
> >> Hi,
> >>
> >> Recently I revisited JDK-8155004/JDK-8257790 because a new team
> >> tripped over it. -XX:AbortVMOnException=java.lang.OutOfMemoryError
> >> works. I wonder whether it would be a good idea to call
> >> report_java_out_of_memory_error() when an OOME is trapped. That way,
> >> HotSpot would trigger the OnOutOfMemoryError callbacks.
> >
> > Why not just use AbortVMOnException together with OnError to get the
> > callbacks?
> >
> > Cheers,
> > David
> >
> >> I understand JDK-8257790 is not a bug, and I don't want to overturn that
> >> conclusion. I just wonder if we can handle it better in the presence of
> >> -XX:AbortVMOnException=java.lang.OutOfMemoryError.
> >>
> >> For Java web servers, an OOME may lead to a zombie process. We may have
> >> a bug in the code or really run out of memory. Either way, the OOME is
> >> swallowed or terminates only the thread that hit it, but it doesn't
> >> terminate the java process. For example:
> >>
> >> public class Main {
> >>     volatile static boolean done = false;
> >>
> >>     public static void main(String[] args) {
> >>         String msg = "a long long message.";
> >>         Runnable runnable = () -> {
> >>             int cnt = Integer.MAX_VALUE / msg.length() + 1;
> >>             // The result would exceed the maximum String length,
> >>             // so repeat() throws an OutOfMemoryError.
> >>             msg.repeat(cnt);
> >>             done = true; // never reached
> >>         };
> >>
> >>         Thread thread = new Thread(runnable);
> >>         thread.start();
> >>         while (!done) {
> >>         } // this simulates the main loop of event handling
> >>     }
> >> }
> >>
> >> Java developers can use
> >> -XX:AbortVMOnException=java.lang.OutOfMemoryError to apply the fail-fast
> >> principle. Java web applications that handle traffic are usually
> >> distributed across a cluster. The failure of a single host is usually
> >> not a big deal. As long as java exits, it's easy to restart and
> >> backfill it.
> >>
> >> My proposed change is very simple: just call
> >> report_java_out_of_memory() if value_string is OOME. It's a no-op if
> >> users never specify any of the related flags. If they do specify flags
> >> like Crash/ExitOnOutOfMemoryError, OnOutOfMemoryError or
> >> HeapDumpOnOutOfMemoryError, HotSpot will let report_java_out_of_memory()
> >> do the cleanup job. fatal() works but is too brutal. I think we should
> >> let java exit with an error code.
> >>
> >>
> >> diff --git a/src/hotspot/share/utilities/exceptions.cpp
> >> b/src/hotspot/share/utilities/exceptions.cpp
> >> index bd95b8306be..fd8a83deaf3 100644
> >> --- a/src/hotspot/share/utilities/exceptions.cpp
> >> +++ b/src/hotspot/share/utilities/exceptions.cpp
> >> @@ -538,6 +538,9 @@ void Exceptions::debug_check_abort(const char
> >> *value_string, const char* message
> >> strstr(value_string, AbortVMOnException)) {
> >> if (AbortVMOnExceptionMessage == NULL || (message != NULL &&
> >> strstr(message, AbortVMOnExceptionMessage))) {
> >> + if(!strcmp(value_string, "java.lang.OutOfMemoryError")) {
> >> + report_java_out_of_memory(message);
> >> + }
> >> fatal("Saw %s, aborting", value_string);
> >> }
> >> }
> >>
> >>
> >> thanks,
> >> --lx
> >>
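
For readers who want to try the fail-fast idea from the quoted message without any VM flag, here is a minimal application-level sketch. It is not what the proposed patch does: it only sees an OutOfMemoryError that escapes a thread uncaught, so it misses errors swallowed by a catch block, which AbortVMOnException traps at throw time. The class name FailFastOnOom, its helper methods and the exit status 3 are made up for illustration.

```java
public class FailFastOnOom {
    /** Returns true if t, or any throwable in its cause chain, is an OutOfMemoryError. */
    public static boolean isOutOfMemory(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    /** Installs a default handler that runs onFatal when an OOME escapes any thread. */
    public static void install(Runnable onFatal) {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            System.err.println("Uncaught in " + thread.getName() + ": " + error);
            if (isOutOfMemory(error)) {
                onFatal.run();
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        // Runtime.halt() terminates the VM without running shutdown hooks.
        // The status value 3 is an arbitrary choice for this sketch.
        install(() -> Runtime.getRuntime().halt(3));

        String msg = "a long long message.";
        Thread worker = new Thread(() -> {
            int cnt = Integer.MAX_VALUE / msg.length() + 1;
            msg.repeat(cnt); // result would exceed the maximum String length
        });
        worker.start();
        worker.join();
        System.out.println("not reached if the handler halted the VM");
    }
}
```

With the handler installed, the process should exit with a non-zero status instead of spinning in the main loop, so a supervisor can restart it. Because halt() skips shutdown hooks, any cleanup would have to run in the onFatal callback before it.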