Emergency JFR dump at OOME

Markus Gronlund markus.gronlund at oracle.com
Mon Oct 29 14:41:50 UTC 2018


I think LeakProfiler::emit_events() is called.

Remember that the main thread has already exited when the shutdown logic is called, so the allocations in your test may already have been reclaimed by the GC at this point (marked as dead in the Leak Profiler).

In general, small tests like these are not representative for the Leak Profiler, because it works by acquiring samples over longer periods of time. There is also the problem that the main thread may already have exited at the point of the dump.

Please take a look at the tests located in test/jdk/jdk/jfr/event/oldobject for some reference on how you can take more control to increase the chances of getting samples. 
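
As a rough illustration of that approach (a minimal sketch, not copied from those tests), the idea is to keep the allocated objects strongly reachable and to stop and dump the recording from the same thread while they are still live. The class name, the static LEAK list, and the iteration count are assumptions made for this example; the "cutoff" setting with value "infinity" is the one those tests use for jdk.OldObjectSample. Sampling is statistical, so a single run may still yield zero samples.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class OldObjectSampleSketch {
    // Strong reference (hypothetical name) so the arrays stay live until after the dump.
    static final List<byte[]> LEAK = new ArrayList<>();

    // Records jdk.OldObjectSample while allocating, dumps the recording while
    // the objects are still reachable, and returns the number of sample events.
    static long recordAndCount(Path file) throws Exception {
        try (Recording recording = new Recording()) {
            recording.enable("jdk.OldObjectSample").with("cutoff", "infinity");
            recording.start();
            for (int i = 0; i < 20_000; i++) {
                LEAK.add(new byte[1024]);
            }
            // Stopping the recording is the point at which old object samples are written.
            recording.stop();
            recording.dump(file);
        }
        long count = 0;
        for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
            if ("jdk.OldObjectSample".equals(event.getEventType().getName())) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("oldobject", ".jfr");
        // The count varies between runs and can be zero.
        System.out.println("OldObjectSample events: " + recordAndCount(file));
    }
}
```

Because the count can be zero on any given run, a test built on this would typically allocate and record in a retry loop until at least one sample shows up.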

Thanks
Markus



-----Original Message-----
From: Yasumasa Suenaga <yasuenag at gmail.com> 
Sent: den 29 oktober 2018 15:13
To: Markus Gronlund <markus.gronlund at oracle.com>
Cc: hotspot-jfr-dev at openjdk.java.net
Subject: Re: Emergency JFR dump at OOME

Hi Markus,

I tried to get a flight recording of the OOME sample application below:
---------------
import java.util.*;

public class OOME{
   public static void main(String[] args){
     var list = new ArrayList<byte[]>();
     while(true){
       list.add(new byte[1024]);
     }
   }
}
---------------

Command:
   $ /usr/local/jdk-11.0.1/bin/java -XX:StartFlightRecording=filename=oome.jfr,settings=profile -Xmx256m OOME


I could get a flight recording in oome.jfr, but JMC did not show any objects in the "Live Objects" window.
OOME.java finishes immediately. It does not invoke any periodic tasks.
So I guess LeakProfiler::emit_events() is not called.


Thanks,

Yasumasa


On 2018/10/29 22:47, Markus Gronlund wrote:
> Rotate() and / or stop() is always called as part of shutdown, indirectly by the shutdown thread asking the recorder thread in the VM.
> 
> LeakProfiler::emit_events() will be called on shutdown if the jdk.OldObjectSample event is enabled.
> 
> Absence of EventDumpReason implies a normal shutdown.
> 
> Markus
> 
> 
> -----Original Message-----
> From: Yasumasa Suenaga <yasuenag at gmail.com>
> Sent: den 29 oktober 2018 11:58
> To: Markus Gronlund <markus.gronlund at oracle.com>
> Cc: hotspot-jfr-dev at openjdk.java.net
> Subject: Re: Emergency JFR dump at OOME
> 
> Hi Markus,
> 
>> This would screw up the logic for the registered Shutdown hook that will run if the VM is also shutting down (which is not implied).
> 
> Does it mean JfrRecorderService::rotate() will be called in JfrEmergencyDump::on_vm_shutdown()?
> I think EventDumpReason::commit() and LeakProfiler::emit_events() should be called when an OOME occurs, even if it is not treated as an emergency dump. So we should add them to Universe::gen_out_of_memory_error().
> 
> What do you think?
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2018/10/29 17:28, Markus Gronlund wrote:
>> Hi Yasumasa,
>>
>> I don't think so.
>>
>> Handling OOMEs is very intricate: OOMEs are thread-local, and it is difficult to get right.
>>
>> The suggested patch would do an emergency dump unconditionally on the first reported OOME. This would screw up the logic for the registered Shutdown hook that will run if the VM is also shutting down (which is not implied).
>> But, it is only the first invocation of report_java_out_of_memory() that is happening here and user code can catch OOME's. Other threads might run fine for quite some time and might not even run into OOME's.
>> The shutdown hook registered for dumping JFR recordings on VM Exit is set up to attempt to handle graceful shutdown if possible (no OOME), but if it gets an OOME, it will trigger the OOME emergency dump logic.
>>
>> Remember that you will need to state you would like a recording dumped out to disk on VM exit for the shutdown hook logic to complete.
>>
>> This you can do by:
>>
>> -XX:StartFlightRecording:dumponexit=true
>>
>> Or by:
>>
>> -XX:StartFlightRecording:filename=myrec.jfr
>>
>> If the Shutdown hook gets an OOME during the exit logic, it will take the emergency path to create a file called hs_oom_<pid>.jfr.
>>
>> There is a known issue: OOMEs are pre-allocated before JFR starts, so they do not turn up in the recordings as Errors. We might want to add something to Universe::gen_out_of_memory_error() to report this in some way.
>>
>> Thanks
>> Markus
>>
>>
>>
>> -----Original Message-----
>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>> Sent: den 25 oktober 2018 12:58
>> To: hotspot-jfr-dev at openjdk.java.net
>> Subject: Emergency JFR dump at OOME
>>
>> Hi all,
>>
>> According to [1], I guess JFR is supposed to dump the flight recording to a file.
>> But the current JFR does not do so.
>>
>> Should we fix it as below?
>> --------------------
>> diff -r 003c062e16ea src/hotspot/share/utilities/debug.cpp
>> --- a/src/hotspot/share/utilities/debug.cpp     Wed Oct 24 21:17:30 2018 -0700
>> +++ b/src/hotspot/share/utilities/debug.cpp     Thu Oct 25 19:56:54 2018 +0900
>> @@ -58,6 +58,9 @@
>>     #include "utilities/globalDefinitions.hpp"
>>     #include "utilities/macros.hpp"
>>     #include "utilities/vmError.hpp"
>> +#if INCLUDE_JFR
>> +#include "jfr/jfr.hpp"
>> +#endif
>>
>>     #include <stdio.h>
>>
>> @@ -321,6 +324,8 @@
>>           fatal("OutOfMemory encountered: %s", message);
>>         }
>>
>> +    JFR_ONLY(Jfr::on_vm_shutdown(false);)
>> +
>>         if (ExitOnOutOfMemoryError) {
>>           tty->print_cr("Terminating due to java.lang.OutOfMemoryError: %s", message);
>>           os::exit(3);
>> --------------------
>>
>> I will file it in JBS and send a review request once it is verified.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> [1] https://hg.openjdk.java.net/jdk/jdk/file/003c062e16ea/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#l159
>>

