(jdk10) RFR(xxs): 8176872: [s390] wrong pc shown in error logs
Thomas Stüfe
thomas.stuefe at gmail.com
Tue Apr 4 13:06:46 UTC 2017
On Tue, Apr 4, 2017 at 2:54 PM, David Holmes <david.holmes at oracle.com>
wrote:
> Sorry Thomas but I'm seeing a test failure on linux-x64 in the
> runtime/ErrorHandling/SafeFetchInErrorHandlingTest
>
> Seems safefetch is not working:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> Will test SafeFetch...
>
> [error occurred during error reporting (test safefetch in error handler),
> id 0xb]
>
> #
> # SIGSEGV (0xb) at pc=0x00007f078dd58582, pid=603, tid=604
>
>
> I don't see this with a JDK9 build. Can you test it?
>
>
Oops. Yes, sure!
.. Thomas
> I have to call it a night now.
>
> David
>
>
> On 4/04/2017 10:20 PM, David Holmes wrote:
>
>> Hi Thomas,
>>
>> Yes I can sponsor this.
>>
>> Cheers,
>> David
>>
>> On 4/04/2017 10:16 PM, Thomas Stüfe wrote:
>>
>>>
>>>
>>> On Tue, Apr 4, 2017 at 12:04 PM, David Holmes <david.holmes at oracle.com
>>> <mailto:david.holmes at oracle.com>> wrote:
>>>
>>> On 4/04/2017 5:32 PM, Thomas Stüfe wrote:
>>>
>>> Hi David,
>>>
>>> On Tue, Apr 4, 2017 at 9:25 AM, David Holmes
>>> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>> <mailto:david.holmes at oracle.com
>>> <mailto:david.holmes at oracle.com>>> wrote:
>>>
>>> Hi Thomas,
>>>
>>> On 4/04/2017 3:54 AM, Thomas Stüfe wrote:
>>>
>>> Hi Dmitry, David,
>>>
>>> sorry for dropping the ball on this one. Here the latest
>>> version - I
>>> only fixed the cosmetics Dmitry requested:
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wron
>>> g-pc-in-errorlogs/jdk10-webrev.02/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.02/webrev/>
>>>
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.02/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.02/webrev/>>
>>>
>>>
>>>
>>> Looks okay. I'm a little unsure about only fixing this in
>>> the shared
>>> code for non-s390. But I guess we can deal with other
>>> platforms later.
>>>
>>>
>>> Not sure what you mean, but a common solution for all posix
>>> platforms
>>> would be even easier. Instead of fixing the s390 signal handler
>>> and the
>>> secondary signal handler for all posix platforms, one could
>>> adjust the
>>> pc at the entry of error reporting, in
>>> VMError::report_and_die(). But
>>> that would require an #ifndef WIN32 and you guys frown upon
>>> ifdefs, so
>>> that is why I avoided this solution.
>>>
>>>
>>> Well there are frowns and there are frowns :) I don't like that the
>>> other platforms don't use si_addr too - seems the "right" way to get
>>> the pc and what would be done if writing the signal handler today.
>>> But then I also don't understand why handle_linux_signal is
>>> duplicated across CPU specific files! Maybe there are some subtle
>>> differences but it should be pretty OS specific rather than CPU
>>> specific.
>>>
>>>
>>> I agree. Would be a good subject for cleanup.
>>>
>>>
>>>
>>> Anyway ...
>>>
>>> But lets commit this solution, and if other platforms should
>>> ever have
>>> the same problem, we can modify the code.
>>>
>>>
>>> ... lets proceed as is.
>>>
>>>
>>> Thanks! Could you please sponsor this?
>>>
>>> Thanks, Thomas
>>>
>>>
>>>
>>> Thanks,
>>> David
>>>
>>> ..Thomas
>>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>> Thanks for reviewing!
>>>
>>> Kind Regards, Thomas
>>>
>>> On Sun, Mar 26, 2017 at 12:22 PM, Dmitry Samersoff
>>> <dmitry.samersoff at oracle.com
>>> <mailto:dmitry.samersoff at oracle.com>
>>> <mailto:dmitry.samersoff at oracle.com
>>> <mailto:dmitry.samersoff at oracle.com>>
>>> <mailto:dmitry.samersoff at oracle.com
>>> <mailto:dmitry.samersoff at oracle.com>
>>>
>>> <mailto:dmitry.samersoff at oracle.com
>>> <mailto:dmitry.samersoff at oracle.com>>>> wrote:
>>>
>>> Thomas,
>>>
>>> Looks good to me,
>>>
>>> We may consider to always use info->si_addr.
>>>
>>> Nits:
>>>
>>> vmError_posix.cpp
>>>
>>> 118:
>>> Please change uc ? ... to (uc == NULL) ? ...
>>>
>>> 122: (and os_linux_s390.cpp:513)
>>>
>>> Space missed after (address) ...
>>>
>>>
>>> -Dmitry
>>>
>>>
>>> On 2017-03-21 16:40, Thomas Stüfe wrote:
>>> > Hi all,
>>> >
>>> > please take a look at this tiny fix. It fixes the
>>> pc shown
>>> as faulting
>>> > address for SIGILL and SIGFPE in hs_err files.
>>> >
>>> > https://bugs.openjdk.java.net/browse/JDK-8176872
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872>
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872>>
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872>
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872
>>> <https://bugs.openjdk.java.net/browse/JDK-8176872>>>
>>> >
>>>
>>>
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wron
>>> g-pc-in-errorlogs/jdk10-webrev.00/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/>
>>>
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/>>
>>>
>>>
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/>
>>>
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/
>>>
>>>
>>> <http://cr.openjdk.java.net/~stuefe/webrevs/8176872-s390-wro
>>> ng-pc-in-errorlogs/jdk10-webrev.00/webrev/>>>
>>>
>>> >
>>> > When determining the crash pc, in all posix
>>> platform signal
>>> handlers pc is
>>> > taken from the context. However, context.pc on
>>> zlinux
>>> points to the
>>> > instruction *after* the faulting op. The correct
>>> way,
>>> according to
>>> POSIX,
>>> > would be to take the address from
>>> siginfo_t->si_addr for
>>> signals
>>> SIGILL,
>>> > SIGFPE.
>>> >
>>> > (actually, this patch would make sense for all
>>> POSIX
>>> platforms,
>>> but only
>>> > s390 seems to show this error, so I leave the
>>> patch local
>>> to s390.)
>>> >
>>> > Kind Regards, Thomas
>>> >
>>>
>>>
>>> --
>>> Dmitry Samersoff
>>> Oracle Java development team, Saint Petersburg,
>>> Russia
>>> * I would love to change the world, but they won't
>>> give me
>>> the sources.
>>>
>>>
>>>
>>>
>>>
More information about the hotspot-runtime-dev
mailing list