RFR (S): 8007779: os::die() on solaris should generate core file

Mikael Vidstedt mikael.vidstedt at oracle.com
Fri Feb 8 08:04:54 PST 2013


On 2013-02-08 07:12, Mikael Gerdin wrote:
> On 2013-02-08 15:05, Staffan Larsen wrote:
>> Based on an experiment, the return code from the process seems to be 
>> 134. This is the same code we get after hs_err is printed 
>> successfully and we do manage to create a core dump.
>
> When a POSIX process is terminated by an uncaught fatal signal, the 
> exit code is usually 128 + the signal number.
> Since SIGABRT == 6, you got 134.

I believe the 128+n convention may be specific to bash rather than to 
POSIX processes in general, but the conclusion still holds.
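
To make the distinction concrete, here is a minimal sketch of what a parent process sees for the two variants of os::die() discussed in this thread. The helper names (wait_status_of, die_with_exit, die_with_abort, shell_style_code) are made up for illustration and are not HotSpot code. It assumes a POSIX system where fork() and waitpid() are available. The wait status only records "killed by signal n"; adding 128 is what a shell such as bash does when it reports that status.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run `die` in a forked child and return the raw
// wait status that the parent observes, or -1 if fork/waitpid fails.
static int wait_status_of(void (*die)()) {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    die();      // expected not to return
    _exit(0);   // only reached if `die` returns
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) return -1;
  return status;
}

// The old Solaris behaviour from the patch: exit immediately, no core file.
static void die_with_exit() { _exit(-1); }

// The proposed behaviour: raise SIGABRT, which may produce a core file
// if the system's core limits allow it.
static void die_with_abort() { ::abort(); }

// Translate a wait status the way a shell such as bash reports it:
// 128 + signal number for a signal death, otherwise the exit status.
static int shell_style_code(int status) {
  if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
  if (WIFEXITED(status)) return WEXITSTATUS(status);
  return -1;
}
```

Calling shell_style_code(wait_status_of(die_with_abort)) gives 128 + SIGABRT, which is 134 on systems where SIGABRT is 6, while the _exit(-1) variant is seen as a normal exit with status 255 (only the low 8 bits of the status are kept). Note that the abort() child may leave a core file in the current directory.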

/Another Mikael

>
> /Mikael
>
>>
>> /Staffan
>>
>> On 8 feb 2013, at 14:54, David Holmes <david.holmes at oracle.com> wrote:
>>
>>> My other email hasn't turned up yet but I was confusing this with 
>>> the change that added the dump_core flag to os::abort.
>>>
>>> It's only by "accident" that we use ::abort on linux - _exit didn't 
>>> work back in the old days of LinuxThreads :)
>>>
>>> This seems like a simple and potentially useful change, but I have a 
>>> feeling it may have some unexpected consequences somewhere. :)
>>>
>>> Actually one possible consequence - what return code will the 
>>> process issue if it now hits this? Could this impact testing and 
>>> failure matching ?
>>>
>>> David
>>>
>>> On 8/02/2013 10:24 PM, Staffan Larsen wrote:
>>>> This is a request for review of a small change to the crash 
>>>> reporting on solaris.
>>>>
>>>> When hotspot crashes during the writing of the hs_err file, we call 
>>>> os::die(). On linux and bsd this causes a core file to be written 
>>>> (by calling ::abort()). This is good since we then have some record 
>>>> of what went wrong. On solaris, we call _exit() and no core file is 
>>>> created.
>>>>
>>>> There are two cases during the hs_err writing where we call 
>>>> os::die(). First, if the writing hangs, the WatcherThread will call 
>>>> os::die(). Second, if we get too many errors during the writing we 
>>>> will call os::die(). In both these cases it would be very helpful 
>>>> to have a core file. Otherwise all you have to go on is something 
>>>> like this:
>>>>
>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>> #
>>>> # SIGSEGV (0xb) at pc=0xffffffff653848c0, pid=11823, tid=240
>>>> #
>>>> # JRE version: Java(TM) SE Runtime Environment (7.0_12-b11)
>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.0-b24 mixed mode solaris-sparc compressed oops)
>>>> # Problematic frame:
>>>> # C [libc.so.1+0x848c0]# [ timer expired, abort... ]
>>>>
>>>> Below is the change I would like to make.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>>
>>>> diff --git a/src/os/solaris/vm/os_solaris.cpp b/src/os/solaris/vm/os_solaris.cpp
>>>> --- a/src/os/solaris/vm/os_solaris.cpp
>>>> +++ b/src/os/solaris/vm/os_solaris.cpp
>>>> @@ -1865,7 +1865,7 @@
>>>>
>>>>   // Die immediately, no exit hook, no abort hook, no cleanup.
>>>>   void os::die() {
>>>> -  _exit(-1);
>>>> +  ::abort(); // dump core (for debugging)
>>>>   }
>>>>
>>>>
>>


