RFR (S): 8237777 "Dumping core ..." is shown despite claiming that "# No core dump will be written."

David Holmes david.holmes at oracle.com
Fri Apr 24 00:55:19 UTC 2020


Hi Gerard,

<trimming>

On 24/04/2020 1:31 am, gerard ziemski wrote:
> Thank you David for more feedback.
> 
> The current algorithm here is:
> 
> #1 check core path
> 
> #2   if no core path, then print "no core might exist", returns TRUE
> 
> #3   if LINUX && special core path, print "core my be processed", 
> returns TRUE (!!!!)

Yes apologies for getting this part wrong previously. If there is a 
special core path we assume a core dump will be produced. Yet if the 
special core path is invalid (e.g. refers to uninstalled program) then 
it seems we may be subject to the rlimit and so there won't actually be 
a core file. And conversely as Yasumasa points out if the special core 
path is valid we may get a core file regardless of what rlimit is set to!

So ...

> 
> #4   else check rlimit

perhaps this should not be an "else"? I think the checking of the core 
pattern could be used to print the current message regarding "may be 
processed" but it should not control whether the method returns 
true/false. Instead we then move on to the rlimit checks, which can 
print an additional message, and also return true/false. That then 
determine whether we pass true/false to os::abort.

But the problem is only with what gets printed not whether we call 
::abort versus ::exit(1)! We shouldn't be using the rlimit result to 
change that!

This seems like we just swap one bug for another. :(

Given that the "Dumping core ..." is only shown in non-product builds I 
think we are way over-thinking/engineering this. We have:

   if (dump_core) {
   ...
#ifndef PRODUCT
     ...
     out.print_raw_cr("Dumping core ...");
#endif
     ::abort(); // dump core
   }

so lets just change one line:

     out.print_raw_cr("Requesting a core dump if possible ...");

otherwise you need one boolean to control whether we call abort vs exit, 
and a second boolean to control what we print.

Or change that one line to:

     out.print_raw_cr("Calling ::abort() ...");

Thanks,
David
-----

> #5     if rlimit returned error, print "core might not exist", returns TRUE
> 
> #6     if rlimit==infinite, print "core file location", returns TRUE
> 
> #7     if rlimit==0, print "no core file", returns FALSE
> 
> #8     if rlimit<infinite, warn about "ensuring full core dump", returns 
> TRUE
> 
> We can either get #2, #3, #5, #6, #7 or #8, but no combinations of any 
> of them.
> 
> The issue here is that as long as #3 succeeds we don't even get to #7, 
> which is the key here - the fix is in fact incomplete unless we test for 
> rlimit.
> 
> 
> I think the algorithm we need should be something like:
> 
> #1 check rlimit
> 
> #2   if rlimit returned error, print "core might not exist", returns TRUE
> 
> #3   if rlimit==0, print "no core & set rlimit" warning, returns FALSE
> 
> #4   if rlimit<infinite add warning about "ensuring full core dump" 
> (does NOT return yet)
> 
> #5 check core path
> 
> #6   if no core path, print "no core might exist", returns TRUE
> 
> #7   if NOT LINUX, print "core file location", returns TRUE
> 
> #8   if LINUX, print print "core my be processed", returns TRUE
> 
> 
> The main point here is that rlimit==0 check must be handled early 
> enough, so that we don't end up saying we will dump the core, when we 
> might be unable to do so, regardless of what other extra mechanism say 
> about the pattern of the core file name.
> 
> Before I go off and code this, however, I'd like a consensus.
> 
> 
> cheers
> 


More information about the hotspot-runtime-dev mailing list