RFR (S): 8239895: assert(_stack_base != 0LL) failed: Sanity check

Thomas Stüfe thomas.stuefe at gmail.com
Thu Apr 2 14:16:52 UTC 2020


Hi David,

sounds all reasonable. I am fine with your changes.

Cheers, Thomas

On Thu, Apr 2, 2020 at 3:57 PM David Holmes <david.holmes at oracle.com> wrote:

> Hi Thomas,
>
> Thanks for looking at this.
>
> On 2/04/2020 8:49 pm, Thomas Stüfe wrote:
> > Hi David,
> >
> > src/hotspot/share/runtime/thread.cpp
> >
> > I do not understand why you removed the assert in
> > Thread::record_stack_base_and_size() ? Is there any case where this
> > could be NULL?
>
> I added that assert under JDK-8241043 (which this change backs out
> again) as a way to try and track down where the failure was happening.
> There's really no way it could be NULL.
>
> > --
> > src/hotspot/os_cpu/solaris_x86/os_solaris_x86.cpp
> > src/hotspot/os_cpu/solaris_sparc/os_solaris_sparc.cpp
> >
> >
> > Isn't os::Solaris::valid_ucontext() called from signal handling, so the
> > Thread may have any state? In that case, it may be safer to ignore base
> > NULL.
>
> I restored things to the way they were before JDK-8238988. I thought it
> odd that the Solaris version of this code used the stack_base() call
> that checked for NULL, but it did so ...
>
> A new thread would have to take a signal before setting the stack-base
> for this to be an issue, which is an extremely short window so there is
> really little chance of SEGV.
>
> > --
> >
> > I wonder whether it would make more sense to do it the other way around:
> > "check-if-adress-is-in-stack-but-its-okay-if-there-is-none" seems more
> > of a special requirement.
>
> But that is exactly what on_local_stack() previously did.
>
> > In that case, Alternative naming proposal could follow our "_or_null"
> > scheme:
> >
> > is_in_full_stack_checked -> is_in_full_stack (asserts for base==0)
> > is_in_full_stack -> is_in_full_stack_or_null (returns false for base==0)
>
> Yes I initially thought about doing it from that side, but after some to
> and fro with Dan I decided to do it from the current side. Bit of a coin
> toss really. There were fewer changes needed this way.
>
> > But these are idle nitpickings, feel free to ignore them :)
>
> I hope I have given some explanation to address your comments :)
>
> Thanks,
> David
> -----
>
> >
> >
> > Cheers, Thomas
> >
> > On Wed, Apr 1, 2020 at 4:56 AM David Holmes <david.holmes at oracle.com
> > <mailto:david.holmes at oracle.com>> wrote:
> >
> >     Hi Dan,
> >
> >     Tiers 5 and 6 passed.
> >
> >     Can I get a second review please.
> >
> >     Thanks,
> >     David
> >
> >     On 1/04/2020 10:35 am, Daniel D. Daugherty wrote:
> >      > On 3/31/20 5:34 PM, David Holmes wrote:
> >      >> Hi Dan,
> >      >>
> >      >> On 1/04/2020 12:26 am, Daniel D. Daugherty wrote:
> >      >>> On 3/31/20 1:18 AM, David Holmes wrote:
> >      >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8239895
> >      >>>> webrev: http://cr.openjdk.java.net/~dholmes/8239895/webrev/
> >      >>>
> >      >>> src/hotspot/cpu/aarch64/frame_aarch64.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/cpu/arm/frame_arm.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/cpu/ppc/frame_ppc.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/cpu/s390/frame_s390.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/cpu/x86/frame_x86.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/os_cpu/solaris_sparc/os_solaris_sparc.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/os_cpu/solaris_x86/os_solaris_x86.cpp
> >      >>>      No comments.
> >      >>>
> >      >>> src/hotspot/share/runtime/thread.cpp
> >      >>>      No comments (8241043 backout).
> >      >>>
> >      >>> src/hotspot/share/runtime/thread.hpp
> >      >>>      No comments (new func and 8241043 backout).
> >      >>>
> >      >>>
> >      >>> Thumbs up.
> >      >>
> >      >> Thanks for the review!
> >      >>
> >      >>> More below...
> >      >>>
> >      >>>
> >      >>>> Prior to JDK-8238988 there were uses of stack_base() which
> >     checked
> >      >>>> it was initialized, and there was a raw use of _stack_base in
> >      >>>> on_local_stack() that did not need it to be initialized
> >     (because it
> >      >>>> may not be). After JDK-8238988 both cases call
> >     is_in_stack_range()
> >      >>>> which uses stack_base() and so asserts that the stack base is
> >      >>>> initialized in all cases. This leads to the assertion failures
> >     when
> >      >>>> the _stack_base is not initialised. The fix has three parts:
> >      >>>>
> >      >>>> 1. Rename is_in_full_stack to is_in_full_stack_checked - as it
> >      >>>> checks _stack_base is initialized via an assertion.
> >      >>>>
> >      >>>> 2. Add a new is_in_full_stack which doesn't use any assertions.
> >      >>>>
> >      >>>> 3. Update all the uses of stack_base() prior to JDK-8238988
> that
> >      >>>> were changed to call is_in_full_stack, to now call
> >      >>>> is_in_full_stack_checked. There are not many of them. (The
> >     corollary
> >      >>>> to that is that all old calls to on_local_stack() call the new
> >      >>>> unchecked is_in_full_stack.)
> >      >>>>
> >      >>>> Here's the webrev for JDK-8238988 for comparison if desired:
> >      >>>>
> >      >>>> http://cr.openjdk.java.net/~dholmes/8238988/webrev/
> >      >>>
> >      >>> The above link doesn't work. I used this one:
> >      >>>
> >      >>> http://cr.openjdk.java.net/~dholmes/8238988/webrev.v3/
> >      >>
> >      >> Oops sorry about that.
> >      >>
> >      >>>>
> >      >>>> I also backed out the assertion changes that I made under:
> >      >>>>
> >      >>>> https://bugs.openjdk.java.net/browse/JDK-8241043
> >      >>>>
> >      >>>> as they were failing due to the use of get_thread_name(). I've
> >     filed
> >      >>>> a separate RFE for that issue:
> >      >>>>
> >      >>>> https://bugs.openjdk.java.net/browse/JDK-8241403
> >      >>>>
> >      >>>> Testing: tiers 1 - 3
> >      >>>
> >      >>> The test failures that we've been seeing are happening in Tier5
> >     and
> >      >>> Tier6.
> >      >>
> >      >> My testing was to check nothing new was unexpectedly broken. The
> >     test
> >      >> failures we occasionally see in tiers 5 and 6 do not reproduce
> >     easily
> >      >> and I've never had a failure occur in numerous such runs when
> >     trying
> >      >> to reproduce this originally.
> >      >
> >      > Sorry. I forgot that the bug hasn't been reproducing for you.
> >      >
> >      >
> >      >> I will submit a tier 5/6 run while waiting for a second review.
> >      >
> >      > That sounds good to me.
> >      >
> >      > Dan
> >      >
> >      >
> >      >>
> >      >> Thanks,
> >      >> David
> >      >>
> >      >>> Dan
> >      >>>
> >      >>>
> >      >>>>
> >      >>>> Thanks,
> >      >>>> David
> >      >>>>
> >      >>>
> >      >
> >
>


More information about the hotspot-runtime-dev mailing list