RFR[13, xs]: 8227275: Within native OOM error handling, assertions may hang the process

Thomas Stüfe thomas.stuefe at gmail.com
Wed Jul 10 05:38:11 UTC 2019


Hi Coleen,

thanks for looking at it! Remarks below.

On Tue, Jul 9, 2019 at 11:12 PM <coleen.phillimore at oracle.com> wrote:

>
>
> http://cr.openjdk.java.net/~stuefe/webrevs/8227275-native-oom-hanging-assertions/webrev.00/webrev/src/hotspot/share/utilities/debug.cpp.udiff.html
>
> I don't understand why you don't just leave the poison page PROT_NONE
> and call this from handle_assert_poison_fault?
>
> +void disarm_assert_poison() {
> + g_assert_poison = &g_dummy;
> +}
> +
>
> Then you don't have to check that it succeeded.   At this point, it
> doesn't matter.
>
>
Because unfortunately that does not work. At that point it is too late.

handle_assert_poison_fault() is called from the signal handler to handle a
poison touch SIGSEGV. Once the fault is handled, the signal handler returns
to the caller, i.e. execution jumps back to the instruction that triggered
the SEGV. There, the poison page address is already loaded into a register
and cannot be changed.

The only other choice we have, besides removing the write protection from
the poison page, is not to return to the caller. That is what happens when
I return false from handle_assert_poison_fault(). In that case the signal
handler proceeds as if this were a real SEGV.
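
To illustrate the mechanism described above, here is a minimal sketch, not the
actual HotSpot code. It assumes Linux/POSIX; all names (g_poison_page,
handle_poison_fault, segv_handler, run_poison_demo) are hypothetical. The
handler can only let the faulting store succeed by making the page itself
accessible, because the store is restarted with the old address:

```cpp
#include <cassert>
#include <cstddef>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-ins for the poison page state; not the HotSpot globals.
static char* g_poison_page = nullptr;
static size_t g_page_size = 0;
static volatile sig_atomic_t g_faults_handled = 0;

// Returns true if the fault hit the poison page and the page was unprotected.
static bool handle_poison_fault(void* fault_addr) {
  char* addr = static_cast<char*>(fault_addr);
  if (addr < g_poison_page || addr >= g_poison_page + g_page_size) {
    return false;  // not a poison touch
  }
  // Redirecting a global pointer here would be too late: the faulting
  // instruction is restarted with the old address already in a register.
  // Note: mprotect is not on the POSIX async-signal-safe list; calling it
  // from a handler is an assumption that holds in practice on Linux.
  if (mprotect(g_poison_page, g_page_size, PROT_READ | PROT_WRITE) != 0) {
    return false;  // e.g. out of memory: caller must not resume
  }
  g_faults_handled = g_faults_handled + 1;
  return true;
}

static void segv_handler(int sig, siginfo_t* info, void*) {
  if (handle_poison_fault(info->si_addr)) {
    return;  // resume: the faulting store is re-executed and now succeeds
  }
  // Treat as a real crash: with the default action restored, the restarted
  // instruction faults again and terminates the process.
  signal(sig, SIG_DFL);
}

// Arms a poison page, touches it, and returns the number of handled faults
// (or -1 if setup failed).
int run_poison_demo() {
  g_page_size = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, g_page_size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) return -1;
  g_poison_page = static_cast<char*>(p);

  struct sigaction sa = {};
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGSEGV, &sa, nullptr) != 0) return -1;
  if (sigaction(SIGBUS, &sa, nullptr) != 0) return -1;

  volatile char* touch = g_poison_page;
  *touch = 'X';  // faults; handler unprotects the page; store is restarted
  *touch = 'Y';  // page is writable now, no further fault
  assert(*touch == 'Y');
  return g_faults_handled;
}
```

Calling run_poison_demo() from main() exercises one poison touch. If the
mprotect call in the handler failed, resuming would fault on the same
instruction again and again, which is why the sketch falls back to the
default signal action in that case.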

Cheers, Thomas


> Coleen
>
> On 7/9/19 5:23 AM, Thomas Stüfe wrote:
> > Dear all,
> >
> > may I please have reviews for the following issue:
> >
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8227275
> > cr:
> > http://cr.openjdk.java.net/~stuefe/webrevs/8227275-native-oom-hanging-assertions/webrev.00/webrev/
> >
> > Summary: on OOM, we may fail to disarm assertion poison page; this may
> > lead to endless loops during error handling if assertions happen in
> > native OOM scenarios.
> >
> > For more details, pls see the JBS issue.
> >
> > Thanks, Thomas
>
>


More information about the hotspot-runtime-dev mailing list