PING: Emergency JFR dump at OOME
Erik Gahlin
erik.gahlin at oracle.com
Thu Jan 10 13:08:18 UTC 2019
Hi Yasumasa,
I will look into this next week.
Thanks
Erik
> PING: Did you read my email?
>
> I believe this enhancement would help us resolve memory issues.
>
>
> Yasumasa
>
>
> On 2019/01/01 13:16, Yasumasa Suenaga wrote:
>> Hi,
>>
>> I want to discuss this enhancement (JDK-8213435) again.
>>
>> I uploaded my proposal:
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8213435/webrev.00/
>>
>> This change adds a new option, "emitOnOOME", to the *.jfc file.
>> If this option is set to true, old object sampling events will be
>> emitted when an OOME occurs.
>>
>> This option is set to false by default, so it does not affect Epsilon
>> users :-)
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2018/11/07 5:29, Erik Gahlin wrote:
>>> On 2018-11-06 15:39, Erik Gahlin wrote:
>>>> On 2018-11-01 14:26, Yasumasa Suenaga wrote:
>>>>> Hi Erik,
>>>>>
>>>>> On 2018/11/01 14:13, Erik Gahlin wrote:
>>>>>> Hi Yasumasa,
>>>>>>
>>>>>> Thanks for looking into this, but I don’t think emitting the Old
>>>>>> Object events is the right thing to do unless we produce a JFR
>>>>>> file at the same time.
>>>>>>
>>>>>> If anything should be added, it should be when the JVM exits.
>>>>>>
>>>>>> if (ExitOnOutOfMemoryError) {
>>>>>> tty->print_cr("Terminating due to java.lang.OutOfMemoryError: %s", message);
>>>>>> + perhaps call emergency dump jfr here?
>>>>>> os::exit(3);
>>>>>> }
>>>>>>
>>>>>> That said, thinking this over, I’m not even sure this is a good
>>>>>> idea.
>>>>>>
>>>>>> I don’t know how the ExitOnOutOfMemoryError flag is used in
>>>>>> production environments. Do people want a .jfr file at that
>>>>>> point? For all use cases? Will it irritate people if they need
>>>>>> to clean up jfr files? Will it make them turn off Flight Recorder?
>>>>>
>>>>> IMHO ExitOnOutOfMemoryError (and CrashOnOutOfMemoryError) should
>>>>> be used in production systems because any thread might catch an
>>>>> OOME that is caused by another thread.
>>>>> For example, when a request processor on Tomcat consumes a lot of
>>>>> memory, the Tomcat acceptor thread might catch the OOME. If so, Tomcat
>>>>> cannot process any requests even though the process is still running.
>>>> Yes, but do you always want a .jfr if this happens?
>>>>
>>>> Let's say you are using the Epsilon GC (which sets
>>>> ExitOnOutOfMemoryError) and have a script that restarts the JVM
>>>> when it exits. Then your hard disk may fill up with .jfr files.
>>>>
>>>> One idea would be to only do it for recordings that have specified
>>>> -XX:StartFlightRecording:dumponexit=true (which is implicitly set
>>>> by -XX:StartFlightRecording:filename=<filename>) That way, it would
>>>> be opt in behavior.
>>>>
>>>> This is non-trivial to implement since we can't call Java and
>>>> allocate objects when we are out of memory. One could perhaps make
>>>> an up call to Java and there take a previously prepared String
>>>> representation of the destination path and push that reference back
>>>> into native, which can then later be inspected. A filename must
>>>> also be generated if a user hasn't specified one, which is trickier
>>>> to do in native. Files in the disk repository should also be
>>>> removed when the JVM exits.
>>>>
>>> I filed an enhancement request where this could be discussed further.
>>> https://bugs.openjdk.java.net/browse/JDK-8213435
>>>
>>> Erik
>>>
>>>>>
>>>>>
>>>>>> Emergency (native) dumps should probably be reserved for cases
>>>>>> when the JVM crashes, similar to the hs_err file.
>>>>>
>>>>> I thought JfrEmergencyDump::on_vm_shutdown() would handle all
>>>>> OOMEs, but that seems not to be the case.
>>>>>
>>>>>
>>>>>> I’m reluctant to add a specific flag for the Old Object event for
>>>>>> the following reasons:
>>>>>>
>>>>>> 1) Very few people would find out about the flag, so it would add
>>>>>> little value in practise. Focus should be to build a product that
>>>>>> works well out of the box and not tie the implementation to a
>>>>>> flag that must be respected for the next decade or so.
>>>>>>
>>>>>> 2) The feature is new and there is not a good tool for
>>>>>> visualising old object samples. Once it exists, it’s easier to
>>>>>> see if emitting events and dumping a recording at this stage
>>>>>> would help users in real world scenarios.
>>>>>>
>>>>>> 3) The way configuration issues have been handled historically is
>>>>>> using a .jfc file. For instance, one could imagine something like
>>>>>> this:
>>>>>>
>>>>>> <event name="jdk.OldObjectSample">
>>>>>> <setting name="firstOOME">true</setting>
>>>>>> …
>>>>>> </event>
>>>>>>
>>>>>> or perhaps a dedicated event, for example
>>>>>> “jdk.FirstOOMEObjectSample” (if there really is a use case to
>>>>>> emit events at this particular time that is not covered by the
>>>>>> Old Object events emitted when the recording ends). There are
>>>>>> plans to allow events to be configured on command line, so you
>>>>>> would not need to create a new jfc file to make a slight change.
>>>>>> That feature would then automatically provide command line
>>>>>> capabilities for the Old Object event in the scenario you describe.
>>>>>>
>>>>>> To me it seems best to wait with the enhancement for now.
>>>>>
>>>>> I agree that "jdk.FirstOOMEObjectSample" should be added as an
>>>>> event setting.
>>>>> I don't mind whether it is set via *.jfc or a command-line option.
>>>>>
>>>>> Anyway, I think it would be very useful if we could get old object
>>>>> information in the flight recording when an OOME occurs.
>>>>> Of course we can get a heap dump with HeapDumpOnOutOfMemoryError,
>>>>> but it does not contain time-series data like a flight recording.
>>>>>
>>>>>
>>>>> I hope this proposal is accepted by the JFR team.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>>> Thanks
>>>>>> Erik
>>>>>>
>>>>>>
>>>>>>> On 1 Nov 2018, at 04:15, Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Erik,
>>>>>>>
>>>>>>>> If a user on the other hand has specified a flag that the JVM
>>>>>>>> should exit on OOME, then it makes sense to dump a recording
>>>>>>>> with the old object events and shortest path-to-gc-root.
>>>>>>>> That part seems to be missing.
>>>>>>>
>>>>>>> I tried to add a flag `EmitLeakProfilerEventsOnOOME` to control
>>>>>>> it as below.
>>>>>>> It works fine on my environment.
>>>>>>>
>>>>>>> ```
>>>>>>> diff -r 3a8208766f7b src/hotspot/share/jfr/jfr.cpp
>>>>>>> --- a/src/hotspot/share/jfr/jfr.cpp Thu Nov 01 02:12:13 2018
>>>>>>> +0100
>>>>>>> +++ b/src/hotspot/share/jfr/jfr.cpp Thu Nov 01 12:13:17 2018
>>>>>>> +0900
>>>>>>> @@ -91,6 +91,10 @@
>>>>>>> return
>>>>>>> JfrOptionSet::parse_start_flight_recording_option(option,
>>>>>>> delimiter);
>>>>>>> }
>>>>>>>
>>>>>>> +void Jfr::emit_leak_profiler_events(jlong cutoff_ticks, bool
>>>>>>> emit_all) {
>>>>>>> + LeakProfiler::emit_events(cutoff_ticks, emit_all);
>>>>>>> +}
>>>>>>> +
>>>>>>> Thread* Jfr::sampler_thread() {
>>>>>>> return JfrThreadSampling::sampler_thread();
>>>>>>> }
>>>>>>> diff -r 3a8208766f7b src/hotspot/share/jfr/jfr.hpp
>>>>>>> --- a/src/hotspot/share/jfr/jfr.hpp Thu Nov 01 02:12:13 2018
>>>>>>> +0100
>>>>>>> +++ b/src/hotspot/share/jfr/jfr.hpp Thu Nov 01 12:13:17 2018
>>>>>>> +0900
>>>>>>> @@ -52,6 +52,7 @@
>>>>>>> static bool on_flight_recorder_option(const JavaVMOption**
>>>>>>> option, char* delimiter);
>>>>>>> static bool on_start_flight_recording_option(const
>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>> static void weak_oops_do(BoolObjectClosure* is_alive,
>>>>>>> OopClosure* f);
>>>>>>> + static void emit_leak_profiler_events(jlong cutoff_ticks,
>>>>>>> bool emit_all);
>>>>>>> static Thread* sampler_thread();
>>>>>>> };
>>>>>>>
>>>>>>> diff -r 3a8208766f7b src/hotspot/share/runtime/globals.hpp
>>>>>>> --- a/src/hotspot/share/runtime/globals.hpp Thu Nov 01
>>>>>>> 02:12:13 2018 +0100
>>>>>>> +++ b/src/hotspot/share/runtime/globals.hpp Thu Nov 01
>>>>>>> 12:13:17 2018 +0900
>>>>>>> @@ -2596,6 +2596,9 @@
>>>>>>> JFR_ONLY(product(ccstr, StartFlightRecording,
>>>>>>> NULL, \
>>>>>>> "Start flight recording with
>>>>>>> options")) \
>>>>>>> \
>>>>>>> + JFR_ONLY(product(bool, EmitLeakProfilerEventsOnOOME,
>>>>>>> false, \
>>>>>>> + "Emit LeakProfiler events when OutOfMemoryError
>>>>>>> occurs")) \
>>>>>>> + \
>>>>>>> experimental(bool, UseFastUnorderedTimeStamps,
>>>>>>> false, \
>>>>>>> "Use platform unstable time where supported for
>>>>>>> timestamps only")
>>>>>>>
>>>>>>> diff -r 3a8208766f7b src/hotspot/share/utilities/debug.cpp
>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp Thu Nov 01
>>>>>>> 02:12:13 2018 +0100
>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp Thu Nov 01
>>>>>>> 12:13:17 2018 +0900
>>>>>>> @@ -58,6 +58,9 @@
>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>> #include "utilities/macros.hpp"
>>>>>>> #include "utilities/vmError.hpp"
>>>>>>> +#if INCLUDE_JFR
>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>> +#endif
>>>>>>>
>>>>>>> #include <stdio.h>
>>>>>>>
>>>>>>> @@ -306,6 +309,13 @@
>>>>>>> // commands multiple times we just do it once when the first
>>>>>>> threads reports
>>>>>>> // the error.
>>>>>>> if (Atomic::cmpxchg(1, &out_of_memory_reported, 0) == 0) {
>>>>>>> +
>>>>>>> +#if INCLUDE_JFR
>>>>>>> + if (EmitLeakProfilerEventsOnOOME) {
>>>>>>> + Jfr::emit_leak_profiler_events(max_jlong, false);
>>>>>>> + }
>>>>>>> +#endif
>>>>>>> +
>>>>>>> // create heap dump before OnOutOfMemoryError commands are
>>>>>>> executed
>>>>>>> if (HeapDumpOnOutOfMemoryError) {
>>>>>>> tty->print_cr("java.lang.OutOfMemoryError: %s", message);
>>>>>>> ```
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> On 2018/11/01 8:47, Erik Gahlin wrote:
>>>>>>>>> On 31 Oct 2018, at 03:58, Yasumasa Suenaga
>>>>>>>>> <yasuenag at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> On Wed, Oct 31, 2018 at 0:35, Markus Gronlund
>>>>>>>>> <markus.gronlund at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I think I provided you with the wrong settings.
>>>>>>>>>>
>>>>>>>>>> Please change:
>>>>>>>>>>
>>>>>>>>>> JFR_ONLY(Jfr::emit_leak_profiler_events(0, true);)
>>>>>>>>>>
>>>>>>>>>> To
>>>>>>>>>>
>>>>>>>>>> JFR_ONLY(Jfr::emit_leak_profiler_events(max_jlong, false);)
>>>>>>>>>>
>>>>>>>>>> I think this will get you the GC roots as well.
>>>>>>>>>
>>>>>>>>> Thanks! It works fine.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> We need to think about if / how this should be integrated. If
>>>>>>>>>> so, it might be that it needs to be guarded behind some flag
>>>>>>>>>> to not always issue a full safepoint, root scanning and edge
>>>>>>>>>> traversals on every OOME.
>>>>>>>>>
>>>>>>>>> Do you mean a VM operation should be added for
>>>>>>>>> Jfr::emit_leak_profiler_events()?
>>>>>>>>> I think that is possible, because HeapDumper is called from this
>>>>>>>>> function.
>>>>>>>> When a recording ends, old object samples are written. They
>>>>>>>> have the most up to date information about what is leaking. I
>>>>>>>> don’t think we should emit old object events before that
>>>>>>>> happens. It will make recordings harder to analyze and
>>>>>>>> introduce an intrusive safepoint.
>>>>>>>> If a user on the other hand has specified a flag that the JVM
>>>>>>>> should exit on OOME, then it makes sense to dump a recording
>>>>>>>> with the old object events and shortest path-to-gc-root.
>>>>>>>> That part seems to be missing.
>>>>>>>> Erik
>>>>>>>>>
>>>>>>>>> Anyway, I want you to merge this change to JFR. :-)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yasumasa
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Markus
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>> Sent: den 30 oktober 2018 14:54
>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>
>>>>>>>>>> Thanks Markus!
>>>>>>>>>> It works fine on my environment.
>>>>>>>>>>
>>>>>>>>>> Could you apply this change to JFR?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> P.S.
>>>>>>>>>> I got a flight recording with path-to-gc-roots=true, but
>>>>>>>>>> "GC Root" in JMC is null. Is that expected?
>>>>>>>>>> (The application is OOME.java, which I shared before.)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2018/10/30 21:01, Markus Gronlund wrote:
>>>>>>>>>>> Hi again,
>>>>>>>>>>>
>>>>>>>>>>> Maybe you can try something like this:
>>>>>>>>>>>
>>>>>>>>>>> # HG changeset patch
>>>>>>>>>>> # User mgronlun
>>>>>>>>>>> # Date 1540900357 -3600
>>>>>>>>>>> # Tue Oct 30 12:52:37 2018 +0100
>>>>>>>>>>> # Node ID 32a48c323970c5fc4d0d1ffff5860a3c55c4a4dc
>>>>>>>>>>> # Parent 80d104390dd2821fd95d56981bf9d37f1cc2e363
>>>>>>>>>>> [mq]: yasumasa
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>> b/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>> --- a/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>> @@ -91,6 +91,10 @@
>>>>>>>>>>> return
>>>>>>>>>>> JfrOptionSet::parse_start_flight_recording_option(option,
>>>>>>>>>>> delimiter);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> +void Jfr::emit_leak_profiler_events(jlong cutoff_ticks, bool emit_all) {
>>>>>>>>>>> + LeakProfiler::emit_events(cutoff_ticks, emit_all);
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> Thread* Jfr::sampler_thread() {
>>>>>>>>>>> return JfrThreadSampling::sampler_thread();
>>>>>>>>>>> }
>>>>>>>>>>> diff --git a/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>> b/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>> --- a/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>> @@ -52,6 +52,7 @@
>>>>>>>>>>> static bool on_flight_recorder_option(const
>>>>>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>>>>>> static bool on_start_flight_recording_option(const
>>>>>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>>>>>> static void weak_oops_do(BoolObjectClosure* is_alive,
>>>>>>>>>>> OopClosure*
>>>>>>>>>>> f);
>>>>>>>>>>> + static void emit_leak_profiler_events(jlong cutoff_ticks, bool emit_all);
>>>>>>>>>>> static Thread* sampler_thread();
>>>>>>>>>>> };
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>> b/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>> @@ -58,6 +58,9 @@
>>>>>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>>>>>> #include "utilities/macros.hpp"
>>>>>>>>>>> #include "utilities/vmError.hpp"
>>>>>>>>>>> +#if INCLUDE_JFR
>>>>>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>>>>>> +#endif
>>>>>>>>>>>
>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>
>>>>>>>>>>> @@ -306,6 +309,8 @@
>>>>>>>>>>> // commands multiple times we just do it once when the
>>>>>>>>>>> first threads reports
>>>>>>>>>>> // the error.
>>>>>>>>>>> if (Atomic::cmpxchg(1, &out_of_memory_reported, 0) == 0) {
>>>>>>>>>>> + JFR_ONLY(Jfr::emit_leak_profiler_events(0, true);)
>>>>>>>>>>> +
>>>>>>>>>>> // create heap dump before OnOutOfMemoryError commands
>>>>>>>>>>> are executed
>>>>>>>>>>> if (HeapDumpOnOutOfMemoryError) {
>>>>>>>>>>> tty->print_cr("java.lang.OutOfMemoryError: %s", message);
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This should write the contents of the leak profiler at the
>>>>>>>>>>> first reported OOME; it will go through the regular chunk
>>>>>>>>>>> writing mechanism so that it does not break the existing
>>>>>>>>>>> dump logic.
>>>>>>>>>>>
>>>>>>>>>>> Let me know if it works for you.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Markus
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>> Sent: den 30 oktober 2018 02:35
>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>
>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>
>>>>>>>>>>> I confirmed with GDB that the Leak Profiler is called by the
>>>>>>>>>>> shutdown hook.
>>>>>>>>>>> I think it is very useful to obtain this information when an OOME
>>>>>>>>>>> occurs, because the user might not get a heap dump.
>>>>>>>>>>>
>>>>>>>>>>> So I want JFR to call the Leak Profiler when an OOME occurs, before
>>>>>>>>>>> the problematic thread is destroyed.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 29, 2018 at 23:41, Markus Gronlund
>>>>>>>>>>> <markus.gronlund at oracle.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I think it is called.
>>>>>>>>>>>>
>>>>>>>>>>>> Remember that main thread has already exited when the
>>>>>>>>>>>> shutdown logic is called, the allocations in your test can
>>>>>>>>>>>> already have been removed by the GC at this point (marked
>>>>>>>>>>>> as dead in the Leak Profiler).
>>>>>>>>>>>>
>>>>>>>>>>>> In general, small tests like these are not representative
>>>>>>>>>>>> for the Leak Profiler, because it works by acquiring
>>>>>>>>>>>> samples over longer periods of time, and then there is the
>>>>>>>>>>>> problem that the main thread could already have exited at
>>>>>>>>>>>> the point of the dump.
>>>>>>>>>>>>
>>>>>>>>>>>> Please take a look at the tests located in
>>>>>>>>>>>> test/jdk/jdk/jfr/event/oldobject for some reference on how
>>>>>>>>>>>> you can take more control to increase the chances of
>>>>>>>>>>>> getting samples.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Markus
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>> Sent: den 29 oktober 2018 15:13
>>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>>
>>>>>>>>>>>> I tried to get a flight recording of the OOME sample
>>>>>>>>>>>> application below:
>>>>>>>>>>>> ---------------
>>>>>>>>>>>> import java.util.*;
>>>>>>>>>>>>
>>>>>>>>>>>> public class OOME{
>>>>>>>>>>>> public static void main(String[] args){
>>>>>>>>>>>> var list = new ArrayList<byte[]>();
>>>>>>>>>>>> while(true){
>>>>>>>>>>>> list.add(new byte[1024]);
>>>>>>>>>>>> }
>>>>>>>>>>>> }
>>>>>>>>>>>> }
>>>>>>>>>>>> ---------------
>>>>>>>>>>>>
>>>>>>>>>>>> Command:
>>>>>>>>>>>> $ /usr/local/jdk-11.0.1/bin/java
>>>>>>>>>>>> -XX:StartFlightRecording=filename=oome.jfr,settings=profile
>>>>>>>>>>>> -Xmx256m
>>>>>>>>>>>> OOME
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I could get a flight recording in oome.jfr, but JMC did not
>>>>>>>>>>>> show any objects in the "Live Objects" window.
>>>>>>>>>>>> OOME.java finishes immediately, so it does not invoke any
>>>>>>>>>>>> periodic tasks.
>>>>>>>>>>>> So I guess LeakProfiler::emit_events() is not called.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 2018/10/29 22:47, Markus Gronlund wrote:
>>>>>>>>>>>>> Rotate() and / or stop() is always called as part of
>>>>>>>>>>>>> shutdown, indirectly by the shutdown thread asking the
>>>>>>>>>>>>> recorder thread in the VM.
>>>>>>>>>>>>>
>>>>>>>>>>>>> LeakProfiler::emit_events() will be called on shutdown if
>>>>>>>>>>>>> the jdk.OldObjectSample event is enabled.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Absence of EventDumpReason implies a normal shutdown.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>>> Sent: den 29 oktober 2018 11:58
>>>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This would screw up the logic for the registered Shutdown
>>>>>>>>>>>>>> hook that will run if the VM is also shutting down (which
>>>>>>>>>>>>>> is not implied).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does it mean JfrRecorderService::rotate() will be called
>>>>>>>>>>>>> at JfrEmergencyDump::on_vm_shutdown() ?
>>>>>>>>>>>>> I think EventDumpReason::commit() and
>>>>>>>>>>>>> LeakProfiler::emit_events() should be called when OOME
>>>>>>>>>>>>> occurs even if it would not be treated as emergency dump.
>>>>>>>>>>>>> So we should add them to Universe::gen_out_of_memory_error().
>>>>>>>>>>>>>
>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2018/10/29 17:28, Markus Gronlund wrote:
>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't think so.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Handling OOMEs is very intricate, OOMEs are thread local
>>>>>>>>>>>>>> and it is difficult to get it right.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The suggested patch would do an emergency dump
>>>>>>>>>>>>>> unconditionally on the first reported OOME. This would
>>>>>>>>>>>>>> screw up the logic for the registered Shutdown hook that
>>>>>>>>>>>>>> will run if the VM is also shutting down (which is not
>>>>>>>>>>>>>> implied).
>>>>>>>>>>>>>> But, it is only the first invocation of
>>>>>>>>>>>>>> report_java_out_of_memory() that is happening here and
>>>>>>>>>>>>>> user code can catch OOME's. Other threads might run fine
>>>>>>>>>>>>>> for quite some time and might not even run into OOME's.
>>>>>>>>>>>>>> The shutdown hook registered for dumping JFR recordings
>>>>>>>>>>>>>> on VM Exit is set up to attempt to handle graceful
>>>>>>>>>>>>>> shutdown if possible (no OOME), but if it gets an OOME,
>>>>>>>>>>>>>> it will trigger the OOME emergency dump logic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Remember that you will need to state you would like a
>>>>>>>>>>>>>> recording dumped out to disk on VM exit for the shutdown
>>>>>>>>>>>>>> hook logic to complete.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This you can do by:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -XX:StartFlightRecording:dumponexit=true
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Or by:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -XX:StartFlightRecording:filename=myrec.jfr
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the Shutdown hook gets an OOME during the exit logic,
>>>>>>>>>>>>>> it will take the emergency path to create a file called
>>>>>>>>>>>>>> hs_oom_<pid>.jfr.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There is a current known issue with the fact that OOME's
>>>>>>>>>>>>>> are pre-allocated so they don't turn up in the recordings
>>>>>>>>>>>>>> as Errors (because they are pre-allocated before JFR
>>>>>>>>>>>>>> starts). We might want to add something to
>>>>>>>>>>>>>> Universe::gen_out_of_memory_error() to report this in
>>>>>>>>>>>>>> some way.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>>>> Sent: den 25 oktober 2018 12:58
>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>> Subject: Emergency JFR dump at OOME
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> According to [1], I would expect JFR to dump the flight
>>>>>>>>>>>>>> recording to a file. But the current JFR doesn't do so.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we fix it as below?
>>>>>>>>>>>>>> --------------------
>>>>>>>>>>>>>> diff -r 003c062e16ea src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp Wed Oct 24
>>>>>>>>>>>>>> 21:17:30 2018 -0700
>>>>>>>>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp Thu Oct 25
>>>>>>>>>>>>>> 19:56:54 2018 +0900
>>>>>>>>>>>>>> @@ -58,6 +58,9 @@
>>>>>>>>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>>>>>>>>> #include "utilities/macros.hpp"
>>>>>>>>>>>>>> #include "utilities/vmError.hpp"
>>>>>>>>>>>>>> +#if INCLUDE_JFR
>>>>>>>>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> @@ -321,6 +324,8 @@
>>>>>>>>>>>>>> fatal("OutOfMemory encountered: %s", message);
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> + JFR_ONLY(Jfr::on_vm_shutdown(false);)
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> if (ExitOnOutOfMemoryError) {
>>>>>>>>>>>>>> tty->print_cr("Terminating due to
>>>>>>>>>>>>>> java.lang.OutOfMemoryError: %s", message);
>>>>>>>>>>>>>> os::exit(3);
>>>>>>>>>>>>>> --------------------
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I will file it to JBS and will send review request if it
>>>>>>>>>>>>>> is verified.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://hg.openjdk.java.net/jdk/jdk/file/003c062e16ea/src/hotspot/share/jfr/recorder/repository/jfrEmergencyDump.cpp#l159
>>>>>>>>>>>>>>
>>>>>>
>>>>
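The thread above turns on old object samples being written when a recording ends, and on a recording being dumped to a file. A minimal Java sketch of that flow is given here, using the public jdk.jfr API instead of the HotSpot internals touched by the patches. The class name, the temporary output location and the toy workload are assumptions made for illustration; as noted in the thread, a short run like this is unlikely to yield representative samples.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Recording;

// Hypothetical class name; not part of the JDK or of the proposed patch.
public class OldObjectRecordingSketch {

    /** Records briefly with jdk.OldObjectSample enabled, then dumps to a file. */
    static Path recordAndDump() {
        try (Recording recording = new Recording()) {
            // Assumed output location: a fresh temporary directory.
            Path out = Files.createTempDirectory("jfr-sketch").resolve("sketch.jfr");

            // Enable the event discussed in the thread, with stack traces.
            recording.enable("jdk.OldObjectSample").withStackTrace();
            recording.start();

            // Toy workload: retain some small arrays while the recording runs.
            List<byte[]> retained = new ArrayList<>();
            for (int i = 0; i < 1_000; i++) {
                retained.add(new byte[1024]);
            }

            // Stop the recording, then copy its data to the file.
            recording.stop();
            recording.dump(out);
            System.out.println("retained " + retained.size() + " arrays");
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = recordAndDump();
        System.out.println("wrote " + file + " (" + file.toFile().length() + " bytes)");
    }
}
```

The resulting file can be opened in JMC in the same way as the oome.jfr file mentioned in the thread.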
>>>
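Markus's remark that OOMEs are thread local and can be caught by user code, while other threads keep running, can be illustrated with a small Java sketch. It is not taken from the thread. The class name is hypothetical, and the sketch assumes that HotSpot rejects a long[Integer.MAX_VALUE] allocation immediately because of its array size limit, so the heap is never actually exhausted.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical class name, for illustration only.
public class OomeIsolationDemo {

    /**
     * Starts a worker thread that provokes an OutOfMemoryError and catches it.
     * Returns true if the worker observed the error.
     */
    static boolean workerHitsOome() {
        AtomicBoolean caught = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try {
                // Assumption: this length exceeds HotSpot's array size limit,
                // so the allocation fails at once without filling the heap.
                long[] huge = new long[Integer.MAX_VALUE];
                System.out.println("unexpected: allocated " + huge.length);
            } catch (OutOfMemoryError e) {
                // The error is delivered to the allocating thread only.
                caught.set(true);
            }
        }, "worker");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return caught.get();
    }

    public static void main(String[] args) {
        boolean caught = workerHitsOome();
        // The main thread is unaffected and can still allocate.
        byte[] small = new byte[1024];
        System.out.println("worker caught OOME: " + caught);
        System.out.println("main still allocates: " + small.length + " bytes");
    }
}
```

Because the process survives a caught OOME like this, a dump triggered unconditionally on the first reported OOME would fire even when the application goes on running, which is the concern raised in the thread.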