PING: Emergency JFR dump at OOME
Erik Gahlin
erik.gahlin at oracle.com
Tue Jan 22 13:43:25 UTC 2019
There are several problems with the patch, it will not combine settings
if there are multiple recordings running at the same time and the
configuration will not show up in JMC. It is also unlikely that someone
will discover the setting in the .jfc file so it will add little benefit
for the community.
I think it is better to add support through the existing mechanism
(dumponexit=true), which can be done when we have proper native support
for dumping recordings [1].
From a user perspective, it is irrelevant if the recording dumps/exits
in Java (in the shutdown hook), or in native (due to OOME).
[1] https://bugs.openjdk.java.net/browse/JDK-8196050
Erik
> Hi Yasumasa,
>
> I will look into this next week.
>
> Thanks
> Erik
>
>> PING: Did you read my email?
>>
>> I believe this enhancement helps us to resolve memory issue.
>>
>>
>> Yasumasa
>>
>>
>> On 2019/01/01 13:16, Yasumasa Suenaga wrote:
>>> Hi,
>>>
>>> I want to discuss about this enhancement (JDK-8213435) again.
>>>
>>> I uploaded my proposal:
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8213435/webrev.00/
>>>
>>> This change provides new option "emitOnOOME" to *.JFC file.
>>> If this options set to true, old object sampling events will be
>>> emitted when OOME occurred.
>>>
>>> This option set to false by default. So it is not affect to Epsilon
>>> user :-)
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/11/07 5:29, Erik Gahlin wrote:
>>>> On 2018-11-06 15:39, Erik Gahlin wrote:
>>>>> On 2018-11-01 14:26, Yasumasa Suenaga wrote:
>>>>>> Hi Erik,
>>>>>>
>>>>>> On 2018/11/01 14:13, Erik Gahlin wrote:
>>>>>>> Hi Yasumasa,
>>>>>>>
>>>>>>> Thanks for looking into this, but I don’t think emitting the Old
>>>>>>> Object events is the right thing to do unless we produce a JFR
>>>>>>> file at the same time.
>>>>>>>
>>>>>>> If anything should be added, it should be when the JVM exits.
>>>>>>>
>>>>>>> if (ExitOnOutOfMemoryError) { "
>>>>>>> tty->print_cr("Terminating due to
>>>>>>> java.lang.OutOfMemoryError: %s", message);
>>>>>>> + perhaps call emergency dump jfr here?
>>>>>>> os::exit(3);
>>>>>>> }
>>>>>>>
>>>>>>> That said, thinking this over, I’m not even sure this is a good
>>>>>>> idea.
>>>>>>>
>>>>>>> I don’t know how the ExitOnOutOfMemoryError flag is used in
>>>>>>> production environments. Do people want a .jfr file at that
>>>>>>> point? For all uses cases? Will it irritate people if they need
>>>>>>> to clean up jfr files? Will it make them turn off Flight Recorder?
>>>>>>
>>>>>> IMHO ExitOnOutOfMemoryError (and CrashOnOutOfMemoryError) should
>>>>>> be used in production system because any threads might caught
>>>>>> OOME which causes by another thread.
>>>>>> For example, when request processor on Tomcat consumes a lot of
>>>>>> memory, Tomcat acceptor thread might caught OOME. If so, Tomcat
>>>>>> cannot process any requests in spite of the process is running.
>>>>> Yes, but do you always want a .jfr if this happens?
>>>>>
>>>>> Let's say you are using the Epsilon GC (which sets
>>>>> ExitOnOutOfMemoryError) and have a script that restarts the JVM
>>>>> when it exits. Then your hard disk may fill up with .jfr files.
>>>>>
>>>>> One idea would be to only do it for recordings that have specified
>>>>> -XX:StartFlightRecording:dumponexit=true (which is implicitly set
>>>>> by -XX:StartFlightRecording:filename=<filename>) That way, it
>>>>> would be opt in behavior.
>>>>>
>>>>> This is non-trivial to implement since we can't call Java and
>>>>> allocate objects when we are out of memory. One could perhaps make
>>>>> an up call to Java and there take a previously prepared String
>>>>> representation of the destination path and push that reference
>>>>> back into native, which can then later be inspected. A filename
>>>>> must also be generated if a user hasn't specified one, which is
>>>>> trickier do to in native. Files in the disk repository should also
>>>>> be removed when the JVM exits.
>>>>>
>>>> I filed an enhancement request where this could discussed further.
>>>> https://bugs.openjdk.java.net/browse/JDK-8213435
>>>>
>>>> Erik
>>>>
>>>>>>
>>>>>>
>>>>>>> Emergency (native) dumps should probably be reserved for cases
>>>>>>> when the JVM crashes, similar to the hs_err file.
>>>>>>
>>>>>> I thought JfrEmergencyDump::on_vm_shutdown() should handle all of
>>>>>> OOME, but It seems not to be the case.
>>>>>>
>>>>>>
>>>>>>> I’m reluctant to add a specific flag for the Old Object event
>>>>>>> for the following reasons:
>>>>>>>
>>>>>>> 1) Very few people would find out about the flag, so it would
>>>>>>> add little value in practise. Focus should be to build a product
>>>>>>> that works well out of the box and not tie the implementation to
>>>>>>> a flag that must be respected for the next decade or so.
>>>>>>>
>>>>>>> 2) The feature is new and there is not a good tool for
>>>>>>> visualising old object samples. Once it exists, it’e easier to
>>>>>>> see if emitting events and dumping a recording at this stage
>>>>>>> would help users in real world scenarios.
>>>>>>>
>>>>>>> 3) The way configuration issues have been handled historically
>>>>>>> is using a .jfc file. For instance, one could imagine something
>>>>>>> like this:
>>>>>>>
>>>>>>> <event name=“jdk.OldObjectSample”>
>>>>>>> <setting name=“firstOOME”>true</setting>
>>>>>>> …
>>>>>>> </event>
>>>>>>>
>>>>>>> or perhaps a dedicated event, for example
>>>>>>> “jdk.FirstOOMEObjectSample” (if there really is a use case to
>>>>>>> emit events at this particular time that is not covered by the
>>>>>>> Old Object events emitted when the recording ends). There are
>>>>>>> plans to allow events to be configured on command line, so you
>>>>>>> would not need to create a new jfc file to make a slight change.
>>>>>>> That feature would then automatically provide command line
>>>>>>> capabilities for the Old Object event in the scenario you describe.
>>>>>>>
>>>>>>> To me it seems best to wait with the enhancement for now.
>>>>>>
>>>>>> I agree with you "jdk.FirstOOMEObjectSample" should be added as
>>>>>> event setting.
>>>>>> I don't care it can set whether *.jfc and commandline option.
>>>>>>
>>>>>> Anyway, I think it is very useful if we can get old object
>>>>>> information in flight record when OOME occurs.
>>>>>> Of course we can get heap dump with HeapDumpOnOutOfMemoryError,
>>>>>> but it is not contain time-series data like a flight record.
>>>>>>
>>>>>>
>>>>>> I hope this proposal is accepted JFR team.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>>> Thanks
>>>>>>> Erik
>>>>>>>
>>>>>>>
>>>>>>>> On 1 Nov 2018, at 04:15, Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Erik,
>>>>>>>>
>>>>>>>>> If a user on the other hand has specified a flag that the JVM
>>>>>>>>> should exit on OOME, then it makes sense to dump a recording
>>>>>>>>> with the old object events and shortest path-to-gc-root.
>>>>>>>>> That part seems to be missing.
>>>>>>>>
>>>>>>>> I tried to add a flag `EmitLeakProfilerEventsOnOOME` to control
>>>>>>>> it as below.
>>>>>>>> It works fine on my environment.
>>>>>>>>
>>>>>>>> ```
>>>>>>>> diff -r 3a8208766f7b src/hotspot/share/jfr/jfr.cpp
>>>>>>>> --- a/src/hotspot/share/jfr/jfr.cpp Thu Nov 01 02:12:13
>>>>>>>> 2018 +0100
>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.cpp Thu Nov 01 12:13:17
>>>>>>>> 2018 +0900
>>>>>>>> @@ -91,6 +91,10 @@
>>>>>>>> return
>>>>>>>> JfrOptionSet::parse_start_flight_recording_option(option,
>>>>>>>> delimiter);
>>>>>>>> }
>>>>>>>>
>>>>>>>> +void Jfr::emit_leak_profiler_events(jlong cutoff_ticks, bool
>>>>>>>> emit_all) {
>>>>>>>> + LeakProfiler::emit_events(cutoff_ticks, emit_all);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>> Thread* Jfr::sampler_thread() {
>>>>>>>> return JfrThreadSampling::sampler_thread();
>>>>>>>> }
>>>>>>>> diff -r 3a8208766f7b src/hotspot/share/jfr/jfr.hpp
>>>>>>>> --- a/src/hotspot/share/jfr/jfr.hpp Thu Nov 01 02:12:13
>>>>>>>> 2018 +0100
>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.hpp Thu Nov 01 12:13:17
>>>>>>>> 2018 +0900
>>>>>>>> @@ -52,6 +52,7 @@
>>>>>>>> static bool on_flight_recorder_option(const JavaVMOption**
>>>>>>>> option, char* delimiter);
>>>>>>>> static bool on_start_flight_recording_option(const
>>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>>> static void weak_oops_do(BoolObjectClosure* is_alive,
>>>>>>>> OopClosure* f);
>>>>>>>> + static void emit_leak_profiler_events(jlong cutoff_ticks,
>>>>>>>> bool emit_all);
>>>>>>>> static Thread* sampler_thread();
>>>>>>>> };
>>>>>>>>
>>>>>>>> diff -r 3a8208766f7b src/hotspot/share/runtime/globals.hpp
>>>>>>>> --- a/src/hotspot/share/runtime/globals.hpp Thu Nov 01
>>>>>>>> 02:12:13 2018 +0100
>>>>>>>> +++ b/src/hotspot/share/runtime/globals.hpp Thu Nov 01
>>>>>>>> 12:13:17 2018 +0900
>>>>>>>> @@ -2596,6 +2596,9 @@
>>>>>>>> JFR_ONLY(product(ccstr, StartFlightRecording,
>>>>>>>> NULL, \
>>>>>>>> "Start flight recording with
>>>>>>>> options")) \
>>>>>>>> \
>>>>>>>> + JFR_ONLY(product(bool, EmitLeakProfilerEventsOnOOME,
>>>>>>>> false, \
>>>>>>>> + "Emit LeakProfiler events when OutOfMemoryError
>>>>>>>> occurs")) \
>>>>>>>> + \
>>>>>>>> experimental(bool, UseFastUnorderedTimeStamps,
>>>>>>>> false, \
>>>>>>>> "Use platform unstable time where supported for
>>>>>>>> timestamps only")
>>>>>>>>
>>>>>>>> diff -r 3a8208766f7b src/hotspot/share/utilities/debug.cpp
>>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp Thu Nov 01
>>>>>>>> 02:12:13 2018 +0100
>>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp Thu Nov 01
>>>>>>>> 12:13:17 2018 +0900
>>>>>>>> @@ -58,6 +58,9 @@
>>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>>> #include "utilities/macros.hpp"
>>>>>>>> #include "utilities/vmError.hpp"
>>>>>>>> +#if INCLUDE_JFR
>>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>>> +#endif
>>>>>>>>
>>>>>>>> #include <stdio.h>
>>>>>>>>
>>>>>>>> @@ -306,6 +309,13 @@
>>>>>>>> // commands multiple times we just do it once when the first
>>>>>>>> threads reports
>>>>>>>> // the error.
>>>>>>>> if (Atomic::cmpxchg(1, &out_of_memory_reported, 0) == 0) {
>>>>>>>> +
>>>>>>>> +#if INCLUDE_JFR
>>>>>>>> + if (EmitLeakProfilerEventsOnOOME) {
>>>>>>>> + Jfr::emit_leak_profiler_events(max_jlong, false);
>>>>>>>> + }
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> // create heap dump before OnOutOfMemoryError commands are
>>>>>>>> executed
>>>>>>>> if (HeapDumpOnOutOfMemoryError) {
>>>>>>>> tty->print_cr("java.lang.OutOfMemoryError: %s", message);
>>>>>>>> ```
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Yasumasa
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2018/11/01 8:47, Erik Gahlin wrote:
>>>>>>>>>> On 31 Oct 2018, at 03:58, Yasumasa Suenaga
>>>>>>>>>> <yasuenag at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> 2018年10月31日(水) 0:35 Markus Gronlund
>>>>>>>>>> <markus.gronlund at oracle.com>:
>>>>>>>>>>>
>>>>>>>>>>> I think I provided you with the wrong settings.
>>>>>>>>>>>
>>>>>>>>>>> Please change:
>>>>>>>>>>>
>>>>>>>>>>> JFR_ONLY(Jfr::emit_leak_profiler_events(0, true);)
>>>>>>>>>>>
>>>>>>>>>>> To
>>>>>>>>>>>
>>>>>>>>>>> JFR_ONLY(Jfr::emit_leak_profiler_events(max_jlong, false);)
>>>>>>>>>>>
>>>>>>>>>>> I think this will get you the GC roots as well.
>>>>>>>>>>
>>>>>>>>>> Thanks! It works fine.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> We need to think about if / how this should be integrated.
>>>>>>>>>>> If so, it might be that it needs to be guarded behind some
>>>>>>>>>>> flag to not always issue a full safepoint, root scanning and
>>>>>>>>>>> edge traversals on every OOME.
>>>>>>>>>>
>>>>>>>>>> Do you mean VM operation should be added for
>>>>>>>>>> Jfr::emit_leak_profiler_events()?
>>>>>>>>>> I think it can do so because HeapDumper is called from this
>>>>>>>>>> function.
>>>>>>>>> When a recording ends, old object samples are written. They
>>>>>>>>> have the most up to date information about what is leaking. I
>>>>>>>>> don’t think we should emit old object events before that
>>>>>>>>> happens. It will make recordings harder to analyze and
>>>>>>>>> introduce an intrusive safepoint.
>>>>>>>>> If a user on the other hand has specified a flag that the JVM
>>>>>>>>> should exit on OOME, then it makes sense to dump a recording
>>>>>>>>> with the old object events and shortest path-to-gc-root.
>>>>>>>>> That part seems to be missing.
>>>>>>>>> Erik
>>>>>>>>>>
>>>>>>>>>> Anyway, I want you to merge this change to JFR. :-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yasumasa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Markus
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>> Sent: den 30 oktober 2018 14:54
>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>
>>>>>>>>>>> Thanks Markus!
>>>>>>>>>>> It works fine on my environment.
>>>>>>>>>>>
>>>>>>>>>>> Could you apply this change to JFR?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Yasumasa
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> P.S.
>>>>>>>>>>> I got flight record with path-to-gc-roots=true, however
>>>>>>>>>>> "GC Root" in JMC
>>>>>>>>>>> is null. Is it correct?
>>>>>>>>>>> (Application is OOME.java which I shared before)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2018/10/30 21:01, Markus Gronlund wrote:
>>>>>>>>>>>> Hi again,
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe you can try something like this:
>>>>>>>>>>>>
>>>>>>>>>>>> # HG changeset patch
>>>>>>>>>>>> # User mgronlun
>>>>>>>>>>>> # Date 1540900357 -3600
>>>>>>>>>>>> # Tue Oct 30 12:52:37 2018 +0100
>>>>>>>>>>>> # Node ID 32a48c323970c5fc4d0d1ffff5860a3c55c4a4dc
>>>>>>>>>>>> # Parent 80d104390dd2821fd95d56981bf9d37f1cc2e363
>>>>>>>>>>>> [mq]: yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>>> b/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>>> --- a/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.cpp
>>>>>>>>>>>> @@ -91,6 +91,10 @@
>>>>>>>>>>>> return
>>>>>>>>>>>> JfrOptionSet::parse_start_flight_recording_option(option,
>>>>>>>>>>>> delimiter);
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> +void Jfr::emit_leak_profiler_events(jlong cutoff_ticks, bool
>>>>>>>>>>>> +emit_all) {
>>>>>>>>>>>> + LeakProfiler::emit_events(cutoff_ticks, emit_all); }
>>>>>>>>>>>> +
>>>>>>>>>>>> Thread* Jfr::sampler_thread() {
>>>>>>>>>>>> return JfrThreadSampling::sampler_thread();
>>>>>>>>>>>> }
>>>>>>>>>>>> diff --git a/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>>> b/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>>> --- a/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>>> +++ b/src/hotspot/share/jfr/jfr.hpp
>>>>>>>>>>>> @@ -52,6 +52,7 @@
>>>>>>>>>>>> static bool on_flight_recorder_option(const
>>>>>>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>>>>>>> static bool on_start_flight_recording_option(const
>>>>>>>>>>>> JavaVMOption** option, char* delimiter);
>>>>>>>>>>>> static void weak_oops_do(BoolObjectClosure* is_alive,
>>>>>>>>>>>> OopClosure*
>>>>>>>>>>>> f);
>>>>>>>>>>>> + static void emit_leak_profiler_events(jlong
>>>>>>>>>>>> cutoff_ticks, bool
>>>>>>>>>>>> + emit_all);
>>>>>>>>>>>> static Thread* sampler_thread();
>>>>>>>>>>>> };
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>> b/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>> @@ -58,6 +58,9 @@
>>>>>>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>>>>>>> #include "utilities/macros.hpp"
>>>>>>>>>>>> #include "utilities/vmError.hpp"
>>>>>>>>>>>> +#if INCLUDE_JFR
>>>>>>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>>>>>>> +#endif
>>>>>>>>>>>>
>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>>
>>>>>>>>>>>> @@ -306,6 +309,8 @@
>>>>>>>>>>>> // commands multiple times we just do it once when the
>>>>>>>>>>>> first threads reports
>>>>>>>>>>>> // the error.
>>>>>>>>>>>> if (Atomic::cmpxchg(1, &out_of_memory_reported, 0) == 0) {
>>>>>>>>>>>> + JFR_ONLY(Jfr::emit_leak_profiler_events(0, true);)
>>>>>>>>>>>> +
>>>>>>>>>>>> // create heap dump before OnOutOfMemoryError
>>>>>>>>>>>> commands are executed
>>>>>>>>>>>> if (HeapDumpOnOutOfMemoryError) {
>>>>>>>>>>>> tty->print_cr("java.lang.OutOfMemoryError: %s", message);
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This should write the contents of the leak profiler at the
>>>>>>>>>>>> first reported OOME; it will go through the regular chunk
>>>>>>>>>>>> writing mechanism in order that it do not destroy existing
>>>>>>>>>>>> dump logic.
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know if it works for you.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> Markus
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>> Sent: den 30 oktober 2018 02:35
>>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>>
>>>>>>>>>>>> I confirmed with GDB that Leak Profiler is called by
>>>>>>>>>>>> shutdown hook.
>>>>>>>>>>>> I think it is very useful to obtain information when OOME
>>>>>>>>>>>> occurs because the user might not get heap dump.
>>>>>>>>>>>>
>>>>>>>>>>>> So I want JFR to call Leak Profiler when OOME occurs before
>>>>>>>>>>>> being destroyed problematic thread.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>
>>>>>>>>>>>> 2018年10月29日(月) 23:41 Markus Gronlund
>>>>>>>>>>>> <markus.gronlund at oracle.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think it is called.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Remember that main thread has already exited when the
>>>>>>>>>>>>> shutdown logic is called, the allocations in your test can
>>>>>>>>>>>>> already have been removed by the GC at this point (marked
>>>>>>>>>>>>> as dead in the Leak Profiler).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In general, small tests like these are not representative
>>>>>>>>>>>>> for the Leak Profiler, because it works by acquiring
>>>>>>>>>>>>> samples over longer periods of time, and then there is the
>>>>>>>>>>>>> problem of the main thread could already have exited at
>>>>>>>>>>>>> the point of dump.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please take a look at the tests located in
>>>>>>>>>>>>> test/jdk/jdk/jfr/event/oldobject for some reference on how
>>>>>>>>>>>>> you can take more control to increase the chances of
>>>>>>>>>>>>> getting samples.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>>> Sent: den 29 oktober 2018 15:13
>>>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried to get flight record of OOME sample application as
>>>>>>>>>>>>> below:
>>>>>>>>>>>>> ---------------
>>>>>>>>>>>>> import java.util.*;
>>>>>>>>>>>>>
>>>>>>>>>>>>> public class OOME{
>>>>>>>>>>>>> public static void main(String[] args){
>>>>>>>>>>>>> var list = new ArrayList<byte[]>();
>>>>>>>>>>>>> while(true){
>>>>>>>>>>>>> list.add(new byte[1024]);
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> ---------------
>>>>>>>>>>>>>
>>>>>>>>>>>>> Command:
>>>>>>>>>>>>> $ /usr/local/jdk-11.0.1/bin/java
>>>>>>>>>>>>> -XX:StartFlightRecording=filename=oome.jfr,settings=profile -Xmx256m
>>>>>>>>>>>>>
>>>>>>>>>>>>> OOME
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I could get flight record into oome.jfr, but JMC did not
>>>>>>>>>>>>> show any objects on "Live Objects" window.
>>>>>>>>>>>>> OOME.java will finish immidiatery. It will not invoke any
>>>>>>>>>>>>> periodic tasks.
>>>>>>>>>>>>> So I guess LeakProfiler::emit_events() will not be called.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2018/10/29 22:47, Markus Gronlund wrote:
>>>>>>>>>>>>>> Rotate() and / or stop() is always called as part of
>>>>>>>>>>>>>> shutdown, indirectly by the shutdown thread asking the
>>>>>>>>>>>>>> recorder thread in the VM.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> LeakProfiler::emit_events() will be called on shutdown if
>>>>>>>>>>>>>> the jdk.OldObjectSample event is enabled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Absence of EventDumpReason implies a normal shutdown.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>>>> Sent: den 29 oktober 2018 11:58
>>>>>>>>>>>>>> To: Markus Gronlund <markus.gronlund at oracle.com>
>>>>>>>>>>>>>> Cc: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>> Subject: Re: Emergency JFR dump at OOME
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Markus,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This would screw up the logic for the registered
>>>>>>>>>>>>>>> Shutdown hook that will run if the VM is also shutting
>>>>>>>>>>>>>>> down (which is not implied).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does it mean JfrRecorderService::rotate() will be called
>>>>>>>>>>>>>> at JfrEmergencyDump::on_vm_shutdown() ?
>>>>>>>>>>>>>> I think EventDumpReason::commit() and
>>>>>>>>>>>>>> LeakProfiler::emit_events() should be called when OOME
>>>>>>>>>>>>>> occurs even if it would not be treated as emergency dump.
>>>>>>>>>>>>>> So we should add them to
>>>>>>>>>>>>>> Universe::gen_out_of_memory_error().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What do you think?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2018/10/29 17:28, Markus Gronlund wrote:
>>>>>>>>>>>>>>> Hi Yasumasa,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't think so.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Handling OOMEs is very intricate, OOMEs are thread local
>>>>>>>>>>>>>>> and it is difficult to get it right.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The suggested patch would do an emergency dump
>>>>>>>>>>>>>>> unconditionally on the first reported OOME. This would
>>>>>>>>>>>>>>> screw up the logic for the registered Shutdown hook that
>>>>>>>>>>>>>>> will run if the VM is also shutting down (which is not
>>>>>>>>>>>>>>> implied).
>>>>>>>>>>>>>>> But, it is only the first invocation of
>>>>>>>>>>>>>>> report_java_out_of_memory() that is happening here and
>>>>>>>>>>>>>>> user code can catch OOME's. Other threads might run fine
>>>>>>>>>>>>>>> for quite some time and might not even run into OOME's.
>>>>>>>>>>>>>>> The shutdown hook registered for dumping JFR recordings
>>>>>>>>>>>>>>> on VM Exit is set up to attempt to handle graceful
>>>>>>>>>>>>>>> shutdown if possible (no OOME), but if it gets an OOME,
>>>>>>>>>>>>>>> it will trigger the OOME emergency dump logic.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Remember that you will need to state you would like a
>>>>>>>>>>>>>>> recording dumped out to disk on VM exit for the shutdown
>>>>>>>>>>>>>>> hook logic to complete.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This you can do by:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -XX:StartFlightRecording:dumponexit=true
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Or by:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -XX:StartFlightRecording:filename=myrec.jfr
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If the Shutdown hook gets an OOME during the exit logic,
>>>>>>>>>>>>>>> it will take the emergency path to create a file called
>>>>>>>>>>>>>>> hs_oom_<pid>.jfr.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There is a current known issue with the fact that OOME's
>>>>>>>>>>>>>>> are pre-allocated so they don't turn up in the
>>>>>>>>>>>>>>> recordings as Errors (because they are pre-allocated
>>>>>>>>>>>>>>> before JFR starts). We might want to add something to
>>>>>>>>>>>>>>> Universe::gen_out_of_memory_error() to report this in
>>>>>>>>>>>>>>> some way.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> Markus
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>> From: Yasumasa Suenaga <yasuenag at gmail.com>
>>>>>>>>>>>>>>> Sent: den 25 oktober 2018 12:58
>>>>>>>>>>>>>>> To: hotspot-jfr-dev at openjdk.java.net
>>>>>>>>>>>>>>> Subject: Emergency JFR dump at OOME
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> According to [1], I guess JFR dumps flight record to file.
>>>>>>>>>>>>>>> But current JFR don't do so.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Should we fix it as below?
>>>>>>>>>>>>>>> --------------------
>>>>>>>>>>>>>>> diff -r 003c062e16ea src/hotspot/share/utilities/debug.cpp
>>>>>>>>>>>>>>> --- a/src/hotspot/share/utilities/debug.cpp Wed Oct 24
>>>>>>>>>>>>>>> 21:17:30 2018 -0700
>>>>>>>>>>>>>>> +++ b/src/hotspot/share/utilities/debug.cpp Thu Oct 25
>>>>>>>>>>>>>>> 19:56:54 2018 +0900
>>>>>>>>>>>>>>> @@ -58,6 +58,9 @@
>>>>>>>>>>>>>>> #include "utilities/globalDefinitions.hpp"
>>>>>>>>>>>>>>> #include "utilities/macros.hpp"
>>>>>>>>>>>>>>> #include "utilities/vmError.hpp"
>>>>>>>>>>>>>>> +#if INCLUDE_JFR
>>>>>>>>>>>>>>> +#include "jfr/jfr.hpp"
>>>>>>>>>>>>>>> +#endif
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @@ -321,6 +324,8 @@
>>>>>>>>>>>>>>> fatal("OutOfMemory encountered: %s", message);
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> + JFR_ONLY(Jfr::on_vm_shutdown(false);)
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> if (ExitOnOutOfMemoryError) {
>>>>>>>>>>>>>>> tty->print_cr("Terminating due to
>>>>>>>>>>>>>>> java.lang.OutOfMemoryError: %s", message);
>>>>>>>>>>>>>>> os::exit(3);
>>>>>>>>>>>>>>> --------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I will file it to JBS and will send review request if it
>>>>>>>>>>>>>>> is verified.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yasumasa
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>> https://hg.openjdk.java.net/jdk/jdk/file/003c062e16ea/src/hotspot/s
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hare/jfr/recorder/repository/jfrEmergencyDump.cpp#l159
>>>>>>>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>
More information about the hotspot-jfr-dev
mailing list