Emergency JFR dump at OOME
Yasumasa Suenaga
yasuenag at gmail.com
Tue Oct 30 01:34:52 UTC 2018
Hi Markus,
I confirmed with GDB that Leak Profiler is called by shutdown hook.
I think it is very useful to obtain information when OOME occurs
because the user might not get heap dump.
So I want JFR to call Leak Profiler when OOME occurs before being
destroyed problematic thread.
Thanks,
Yasumasa
2018年10月29日(月) 23:41 Markus Gronlund <markus.gronlund at oracle.com>:
>
> I think it is called.
>
> Remember that main thread has already exited when the shutdown logic is called, the allocations in your test can already have been removed by the GC at this point (marked as dead in the Leak Profiler).
>
> In general, small tests like these are not representative for the Leak Profiler, because it works by acquiring samples over longer periods of time, and then there is the problem of the main thread could already have exited at the point of dump.
>
> Please take a look at the tests located in test/jdk/jdk/jfr/event/oldobject for some reference on how you can take more control to increase the chances of getting samples.
>
> Thanks
> Markus
>
>
>
> -----Original Message-----
> From: Yasumasa Suenaga <yasuenag at gmail.com>
> Sent: den 29 oktober 2018 15:13
> To: Markus Gronlund <markus.gronlund at oracle.com>
> Cc: hotspot-jfr-dev at openjdk.java.net
> Subject: Re: Emergency JFR dump at OOME
>
> Hi Markus,
>
> I tried to get flight record of OOME sample application as below:
> ---------------
> import java.util.*;
>
> public class OOME{
> public static void main(String[] args){
> var list = new ArrayList<byte[]>();
> while(true){
> list.add(new byte[1024]);
> }
> }
> }
> ---------------
>
> Command:
> $ /usr/local/jdk-11.0.1/bin/java -XX:StartFlightRecording=filename=oome.jfr,settings=profile -Xmx256m OOME
>
>
> I could get flight record into oome.jfr, but JMC did not show any objects on "Live Objects" window.
> OOME.java will finish immidiatery. It will not invoke any periodic tasks.
> So I guess LeakProfiler::emit_events() will not be called.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/10/29 22:47, Markus Gronlund wrote:
> > Rotate() and / or stop() is always called as part of shutdown, indirectly by the shutdown thread asking the recorder thread in the VM.
> >
> > LeakProfiler::emit_events() will be called on shutdown if the jdk.OldObjectSample event is enabled.
> >
> > Absence of EventDumpReason implies a normal shutdown.
> >
> > Markus
> >
> >
> > -----Original Message-----
> > From: Yasumasa Suenaga <yasuenag at gmail.com>
> > Sent: den 29 oktober 2018 11:58
> > To: Markus Gronlund <markus.gronlund at oracle.com>
> > Cc: hotspot-jfr-dev at openjdk.java.net
> > Subject: Re: Emergency JFR dump at OOME
> >
> > Hi Markus,
> >
> >> This would screw up the logic for the registered Shutdown hook that will run if the VM is also shutting down (which is not implied).
> >
> > Does it mean JfrRecorderService::rotate() will be called at JfrEmergencyDump::on_vm_shutdown() ?
> > I think EventDumpReason::commit() and LeakProfiler::emit_events() should be called when OOME occurs even if it would not be treated as emergency dump. So we should add them to Universe::gen_out_of_memory_error().
> >
> > What do you think?
> >
> >
> > Thanks,
> >
> > Yasumasa
> >
> >
> > On 2018/10/29 17:28, Markus Gronlund wrote:
> >> Hi Yasumasa,
> >>
> >> I don't think so.
> >>
> >> Handling OOMEs is very intricate, OOMEs are thread local and it is difficult to get it right.
> >>
> >> The suggested patch would do an emergency dump unconditionally on the first reported OOME. This would screw up the logic for the registered Shutdown hook that will run if the VM is also shutting down (which is not implied).
> >> But, it is only the first invocation of report_java_out_of_memory() that is happening here and user code can catch OOME's. Other threads might run fine for quite some time and might not even run into OOME's.
> >> The shutdown hook registered for dumping JFR recordings on VM Exit is set up to attempt to handle graceful shutdown if possible (no OOME), but if it gets an OOME, it will trigger the OOME emergency dump logic.
> >>
> >> Remember that you will need to state you would like a recording dumped out to disk on VM exit for the shutdown hook logic to complete.
> >>
> >> This you can do by:
> >>
> >> -XX:StartFlightRecording:dumponexit=true
> >>
> >> Or by:
> >>
> >> -XX:StartFlightRecording:filename=myrec.jfr
> >>
> >> If the Shutdown hook gets an OOME during the exit logic, it will take the emergency path to create a file called hs_oom_<pid>.jfr.
> >>
> >> There is a current known issue with the fact that OOME's are pre-allocated so they don't turn up in the recordings as Errors (because they are pre-allocated before JFR starts). We might want to add something to Universe::gen_out_of_memory_error() to report this in some way.
> >>
> >> Thanks
> >> Markus
> >>
> >>
> >>
> >> -----Original Message-----
> >> From: Yasumasa Suenaga <yasuenag at gmail.com>
> >> Sent: den 25 oktober 2018 12:58
> >> To: hotspot-jfr-dev at openjdk.java.net
> >> Subject: Emergency JFR dump at OOME
> >>
> >> Hi all,
> >>
> >> According to [1], I guess JFR dumps flight record to file.
> >> But current JFR don't do so.
> >>
> >> Should we fix it as below?
> >> --------------------
> >> diff -r 003c062e16ea src/hotspot/share/utilities/debug.cpp
> >> --- a/src/hotspot/share/utilities/debug.cpp Wed Oct 24 21:17:30 2018 -0700
> >> +++ b/src/hotspot/share/utilities/debug.cpp Thu Oct 25 19:56:54 2018 +0900
> >> @@ -58,6 +58,9 @@
> >> #include "utilities/globalDefinitions.hpp"
> >> #include "utilities/macros.hpp"
> >> #include "utilities/vmError.hpp"
> >> +#if INCLUDE_JFR
> >> +#include "jfr/jfr.hpp"
> >> +#endif
> >>
> >> #include <stdio.h>
> >>
> >> @@ -321,6 +324,8 @@
> >> fatal("OutOfMemory encountered: %s", message);
> >> }
> >>
> >> + JFR_ONLY(Jfr::on_vm_shutdown(false);)
> >> +
> >> if (ExitOnOutOfMemoryError) {
> >> tty->print_cr("Terminating due to java.lang.OutOfMemoryError: %s", message);
> >> os::exit(3);
> >> --------------------
> >>
> >> I will file it to JBS and will send review request if it is verified.
> >>
> >>
> >> Thanks,
> >>
> >> Yasumasa
> >>
> >>
> >> [1] https://hg.openjdk.java.net/jdk/jdk/file/003c062e16ea/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#l159
> >>
More information about the hotspot-jfr-dev
mailing list