RFR(xs): 8227031: Print NMT statistics on fatal errors

Doerr, Martin martin.doerr at sap.com
Wed Jul 10 12:40:01 UTC 2019


Hi Thomas,

looks good to me.

vmError.cpp:
// Print out NMT statistics if this was desired.

What is the value of this comment?
I suggest to remove it if we don't have anything important to explain.

Best regards,
Martin


> -----Original Message-----
> From: hotspot-runtime-dev <hotspot-runtime-dev-
> bounces at openjdk.java.net> On Behalf Of Thomas Stüfe
> Sent: Dienstag, 9. Juli 2019 09:32
> To: Baesken, Matthias <matthias.baesken at sap.com>
> Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
> Subject: Re: RFR(xs): 8227031: Print NMT statistics on fatal errors
> 
> Hi Matthias,
> 
> 
> On Tue, Jul 9, 2019 at 9:15 AM Baesken, Matthias
> <matthias.baesken at sap.com>
> wrote:
> 
> > Hi Thomas,   In wonder  about  the following :
> >
> > MemTracker::final_report   is called also from  print_statistics()  :
> >
> > hotspot/share/runtime/java.cpp
> > -----------------------------------------------------
> > void print_statistics() {
> > ...
> > 353  // Native memory tracking data
> > 354  if (PrintNMTStatistics) {
> > 355    MemTracker::final_report(tty);
> > 356  }
> >
> >
> > Would this mean  that when  called  before  from  print_statistics() , we
> > would not call it again  from vmError because of the
> > g_final_report_did_run  check ?
> >
> > src/hotspot/share/services/memTracker.cpp
> > -----------------------------------------------
> >
> > 179 static volatile bool g_final_report_did_run = false;
> > 180 void MemTracker::final_report(outputStream* output) {
> > 181   // This function is called during both error reporting and normal VM
> > exit.
> > 182   // However, it should only ever run once.  E.g. if the VM crashes
> > after
> > 183   // printing the final report during normal VM exit, it should not
> > print
> > 184   // the final report again. In addition, it should be guarded from
> > 185   // recursive calls in case NMT reporting itself crashes.
> > 186   if (Atomic::cmpxchg(true, &g_final_report_did_run, false) == false) {
> > 187     NMT_TrackingLevel level = tracking_level();
> > 188     if (level >= NMT_summary) {
> > 189       report(level == NMT_summary, output);
> > 190     }
> > 191   }
> > 192 }
> >
> > Is  this really what we want ?   Of course we want to avoid printing it
> > twice (or more than that ) from error reporting.
> > But I think we would miss it from error reporting in some situations when
> > we want it there .
> >
> >
> This is exactly what I wanted:
> 
> MemTracker::final_report(tty) is supposed to print the final report. It is
> called in two places, during normal VM shutdown (A) and during error
> handling (B). By only allowing the code to run once I get the behaviour I
> wanted:
> 
> Case 1: normal shutdown: we execute (A) from before_exit(), all is well.
> Case 2: we crash before normal shutdown: we execute (B) from within the
> error handler.
> Case 3: we crash during normal shutdown: we executed already (A) and
> hence
> (B) is a noop
> Case 4: crash within (!) MemTracker::final_report():
>     Case 4.1: MemTracker::final_report() was called during normal shutdown
> and crashed (A):  - we enter error handling, but we will not re-enter NMT
> reporting at (B) which is good since NMT reporting is not reentrant.
>     Case 4.2: MemTracker::final_report() was called during error handling
> and crashed (B):  - we enter the secondary signal handler, restart
> VMError::report_and_die(), but will not attempt to print NMT report again
> 
> Especially Case 4 is important, since it can lead to hanging error
> reporting since NMT is not reentrant and will suffocate on its own lock.
> 
> Arguably, Case 2 and 3 are "just" aesthetics and prevent seeing the same
> report twice.
> 
> 
> 
> > Otherwise looks okay to me .
> >
> >
> Thanks!
> 
> 
> > Best regards, Matthias
> >
> >
> Cheers, Thomas
> 
> >
> >
> > >Hi all,
> > >
> > >We have -XX:+-PrintNMTStatistics, a very useful switch which will cause
> > the
> > >VM to print out the NMT statistics if the VM exits normally.
> > >
> > >Currently it does not work if the VM exits due to a fatal error. But
> > >especially in fatal exits due to native OOM a NMT report would be very
> > >helpful.
> > >
> > >JBS: https://bugs.openjdk.java.net/browse/JDK-8227031
> > >
> > >cr:
> > >
> > http://cr.openjdk.java.net/~stuefe/webrevs/8227031-optionally-print-
> nmt-report-on-oom/webrev.00/webrev/index.html
> > >
> > >Changes in this patch:
> > >- handle PrintNMTStatistics on fatal error
> > >- make sure the final report is not called twice accidentally and it is
> > not
> > >called recursively due to secondary error handling
> > >- change the Metaspace report portion of the NMT report to only include
> > the
> > >brief metaspace report - that one can be called at any time, it does not
> > >lock nor require any resources.
> > >
> > >Please note: this will not work when we are in an OOM situation and
> > request
> > >a detailed NMT report; that scenario needs more work since NMT
> detailed
> > >reports need memory as well. That is a separate issue.
> >
> >
> >
> >


More information about the hotspot-runtime-dev mailing list