RFR(s): 8065895: Synchronous signals during error reporting may terminate or hang VM process
Thomas Stüfe
thomas.stuefe at gmail.com
Thu Jan 29 13:47:34 UTC 2015
Hi David,
thank you. Here is the new patch:
http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.06/webrev/
Just corrected the typos and removed trailing whitespaces.
Kind Regards, Thomas
On Thu, Jan 29, 2015 at 8:47 AM, David Holmes <david.holmes at oracle.com>
wrote:
> Hi Thomas,
>
> Sorry for the delay in getting back to this. Looks good - only a few typos:
>
> debug.hpp:
>
> + // 15 - SIGILL
>
> should be SIGFPE.
>
> vmError_<os>.cpp:
>
> + // Note that in the hotspot
>
> Delete 'the'.
>
> + // hang or be ended immediately if secondary errors happend during error
>
> s/terminated/ended/
>
> s/happen/happend/
>
> Thanks,
> David
>
> On 26/01/2015 8:23 PM, Thomas Stüfe wrote:
>
>> Anyone? I would really like to close this issue.
>>
>> Kind Regards,
>>
>> Thomas Stüfe
>>
>> On Wed, Jan 21, 2015 at 10:39 AM, Thomas Stüfe <thomas.stuefe at gmail.com
>> <mailto:thomas.stuefe at gmail.com>> wrote:
>>
>> Hi all,
>>
>> I would like to take up discussion about this issue again.
>>
>> As a reminder, here the bug report:
>> https://bugs.openjdk.java.net/browse/JDK-8065895
>>
>> And here a new version of my patch:
>> http://cr.openjdk.java.net/~stuefe/webrevs/8065895/webrev.05/webrev/
>>
>> As far as I remember the discussion, the issue itself was understood
>> and the fix itself met with approval (I think), but the point of
>> contention was the code I added to reproduce the error for
>> regression tests.
>>
>> In order to reproduce the error, I need two different synchronous
>> signals to happen, one in normal code, one in the error handler
>> which writes the hs-err file. Originally I choose SIGSEGV and
>> SIGILL. I added functions to generate (true, real) SIGSEGVs and
>> SIGILLs. But nobody liked my generate-sigill function, therefore I
>> changed the code to generate a SIGFPE instead. For the test, it does
>> not matter which signals are generated as long as they are
>> synchronous (so, one of SEGV,ILL,BUS,FPE).
>>
>> In order to check that the fix works, one can do:
>>
>> java -XX:ErrorHandlerTest=15 -XX:TestCrashInErrorHandler=14
>>
>> The VM will first crash with a SIGSEGV, enter error handling, crash
>> again with a different synchronous signal (FPE, in this case). An
>> unfixed VM will die immediately and we get a torn hs-err file. A
>> fixed VM will show the "error occurred during error handling" string
>> and continue with the error reporting.
>>
>> Kind Regards,
>>
>> Thomas Stüfe
>>
>>
>>
>> On Mon, Dec 8, 2014 at 11:37 AM, Thomas Stüfe
>> <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>> wrote:
>>
>> Hi David, Dean,
>>
>> On Fri, Dec 5, 2014 at 8:05 AM, David Holmes
>> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>> wrote:
>>
>> On 3/12/2014 8:47 PM, Thomas Stüfe wrote:
>>
>> Hi Dean,
>>
>> I dont understand. Such a function does not exist, does
>> it? So I would
>> have to write it:
>>
>> Do you mean generating and using a StubRoutine which
>> would SIGILL? I did
>> not do this because I wanted to be able to generate
>> SIGILL also in
>> initialization code, where StubRoutines may not yet be
>> generated. This
>> point may may be arguable, but as this function is used
>> to test error
>> handling, it may be interesting to test it for
>> half-initialized VMs too.
>>
>> Otherwise I would implement the CPU specific
>> generate_illegal_instruction_____sequence() probably the
>>
>> same way as I do
>> now the crash_with_sigill() function. That would mean a
>> bit of more code
>> duplication because:
>> - Either I use the method I use now (reserve_memory and
>> copy the
>> instructions to the reserved page)
>> - Or I use inline assembly - which probably does not
>> work across
>> multiple OSs, so for CPUs which span various OSs I would
>> have to add one
>> function per os_cpu combination, not just per cpu.
>>
>>
>> I don't think there is any OS dependency with inline
>> assembly - only compiler. And I am also concerned that
>> writing code to an executable page will also enter the realm
>> of "self-modifying code" and all the jumping through hoops
>> that entails. That aspect hadn't occurred to me till Dean
>> raised it. I'm forming the view that triggering a SIGILL is
>> more effort than it is worth for a secondary testing function.
>>
>>
>> Well, the code is used and works in our VM since some years on a
>> number of CPUs, so the problem with the flushing do not occur at
>> least in our cases. But I agree with you, and this seems to be a
>> point of contention and it is really too unimportant to stop the
>> whole patch.
>>
>> The whole point of using SIGILL was to have another unblockable
>> signal besides SIGSEGV to occur naturally (without raising) to
>> be able to demonstrate the bug before fixing it. I will now
>> attempt to change the patch to use either SIGFPE or SIGBUS as a
>> secondary signal. Maybe generating those signals with pure C/C++
>> is easier. If that does not work out, I will see what I can do
>> with raise().
>>
>> Thanks and Kind regards, Thomas
>>
>>
>>
>>
>>
More information about the hotspot-runtime-dev
mailing list