RFR[13, xs]: 8227275: Within native OOM error handling, assertions may hang the process

Thomas Stüfe thomas.stuefe at gmail.com
Wed Jul 10 09:59:23 UTC 2019


Thanks Martin!


On Wed, Jul 10, 2019 at 11:45 AM Doerr, Martin <martin.doerr at sap.com> wrote:

> Hi Thomas,
>
> thanks for the explanations. Your fix looks good to me (except the missing
> include in vmError.cpp we discussed offline).
>
>
The include was not missing; instead, the call to disarm_poison_page() must
be enclosed in #ifdef CAN_SHOW_REGISTERS_ON_ASSERT. I fixed the patch in
place.
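
For readers without the webrev at hand, a minimal sketch of what such a guard looks like. Everything except the macro name is a stand-in: the macro is defined by hand here so the snippet builds on its own, `disarm_poison_page()` is a dummy using the name from the paragraph above, and `report_error_sketch()` is a hypothetical placeholder for the call site in vmError.cpp, not the actual patch.

```cpp
// Assumption: in HotSpot this macro is only defined on platforms that
// support the assertion poison page. It is defined by hand here so the
// sketch compiles standalone.
#define CAN_SHOW_REGISTERS_ON_ASSERT

#ifdef CAN_SHOW_REGISTERS_ON_ASSERT
// Hypothetical stand-ins: a flag instead of a real protected page.
static bool g_poison_armed = true;
static void disarm_poison_page() { g_poison_armed = false; }
#endif

// Hypothetical placeholder for the error-reporting entry point.
// Returns true if no armed poison page is left afterwards.
static bool report_error_sketch() {
#ifdef CAN_SHOW_REGISTERS_ON_ASSERT
  // Only reference the function where it exists; without the guard,
  // builds lacking the macro would fail to compile this call.
  disarm_poison_page();
  return !g_poison_armed;
#else
  return true;  // no poison page on this platform, nothing to disarm
#endif
}
```

The point of the guard is purely build-related: the call compiles away on platforms where the poison page mechanism does not exist.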

Cheers Thomas



> Best regards,
> Martin
>
>
> > -----Original Message-----
> > From: hotspot-runtime-dev <hotspot-runtime-dev-
> > bounces at openjdk.java.net> On Behalf Of Thomas Stüfe
> > Sent: Wednesday, July 10, 2019 07:38
> > To: Coleen Phillimore <coleen.phillimore at oracle.com>
> > Cc: Hotspot dev runtime <hotspot-runtime-dev at openjdk.java.net>
> > Subject: Re: RFR[13, xs]: 8227275: Within native OOM error handling,
> > assertions may hang the process
> >
> > Hi Coleen,
> >
> > thanks for looking at it! Remarks below.
> >
> > On Tue, Jul 9, 2019 at 11:12 PM <coleen.phillimore at oracle.com> wrote:
> >
> > >
> > >
> > > http://cr.openjdk.java.net/~stuefe/webrevs/8227275-native-oom-hanging-assertions/webrev.00/webrev/src/hotspot/share/utilities/debug.cpp.udiff.html
> > >
> > > I don't understand why you don't just leave the poison page PROT_NONE
> > > and call this from handle_assert_poison_fault?
> > >
> > > +void disarm_assert_poison() {
> > > + g_assert_poison = &g_dummy;
> > > +}
> > > +
> > >
> > > Then you don't have to check that it succeeded.   At this point, it
> > > doesn't matter.
> > >
> > >
> > Because unfortunately that does not work. At that point it is too late.
> >
> > handle_assert_poison_fault() is called from the signal handler to handle a
> > poison touch SIGSEGV. Once the fault is handled, the signal handler returns
> > to the caller, i.e. execution jumps back to the instruction that triggered
> > the SEGV. There, the poison page address is already loaded into a register
> > and cannot be changed, so redirecting g_assert_poison has no effect on the
> > retried access.
> >
> > The only other choice we have, besides removing the write protection from
> > the poison page, is not to return to the caller. That is what happens when
> > I return false from handle_assert_poison_fault(). In that case the signal
> > handler proceeds as if this were a real SEGV.
> >
> > Cheers, Thomas
> >
> >
> > Coleen
> > >
> > > On 7/9/19 5:23 AM, Thomas Stüfe wrote:
> > > > Dear all,
> > > >
> > > > may I please have reviews for the following issue:
> > > >
> > > > JBS: https://bugs.openjdk.java.net/browse/JDK-8227275
> > > > cr: http://cr.openjdk.java.net/~stuefe/webrevs/8227275-native-oom-hanging-assertions/webrev.00/webrev/
> > > >
> > > > Summary: on OOM, we may fail to disarm the assertion poison page; this
> > > > may lead to endless loops during error handling if assertions happen in
> > > > native OOM scenarios.
> > > >
> > > > For more details, pls see the JBS issue.
> > > >
> > > > Thanks, Thomas
> > >
> > >
>

