RFR (xs) 8075967: Zero interpreter asserts for SafeFetch<32, N> calls in ObjectMonitor

Thu Mar 26 13:37:34 UTC 2015

Hi Severin,

I think a CanUseSafeFetch() which returns false on zero but which could be
always called unconditionally would be fine. The user would have to guard
SafeFetch() with CanUseSafeFetch() and think about what to do if
SafeFetch() was not available.

On Zero, for POSIX platforms one could implement SafeFetch with
setjmp()/longjmp(). Before accessing the potentially invalid pointer, do a
setjmp() and in the signal handler a longjmp() back to SafeFetch(). This
would have to be threadsafe - by keeping the jmp_buf thread local or just
by synchronizing access to SafeFetch().

For Zero on windows, it would be very easy, just guard SafeFetch with
__try/__catch.

..Thomas

On Thu, Mar 26, 2015 at 2:00 PM, Severin Gehwolf <sgehwolf at redhat.com>
wrote:

> Hi,
>
> On Thu, 2015-03-26 at 22:38 +1000, David Holmes wrote:
> > On 26/03/2015 10:24 PM, Thomas Stüfe wrote:
> > > Hi David,
> > >
> > > On Thu, Mar 26, 2015 at 11:42 AM, David Holmes <
> david.holmes at oracle.com
> > > <mailto:david.holmes at oracle.com>> wrote:
> > >
> > >     Hi Thomas,
> > >
> > >     On 26/03/2015 6:35 PM, Thomas Stüfe wrote:
> > >
> > >         Hi,
> > >
> > >         I agree with Goetz.
> > >
> > >         SafeFetch() was meant to be guarded with CanUseSafeFetch(),
> this
> > >         is part
> > >         of the contract, so using it without this guard is actually
> > >         incorrect
> > >         and would crash were the calling code ever called before Stub
> > >         routine
> > >         initialization.
> > >
> > >
> > >     That contract was already implicit - this API is for use after the
> > >     stubs have been generated. There's no reason to expand the use of
> > >     this API to situations before stubs are generated. So the assert
> > >     captures that and CanUseSafeFetch is really unnecessary.
> > >
> > >
> > > We had to introduce CanUseSafeFetch() exactly because we have
> situations
> > > where we may call SafeFetch before stub routine generation:
> > >
> > > 1) In os_aix.cpp, we use it to probe protected memory after doing
> > > mprotect(). This is a long story, but we had to add this workaround
> > > because on shortcomings in AIX's mprotect(). This code will get called
> > > during initialization before stub routines are set up.
> > > Note that right now, in the OpenJDK code base, the code still directly
> > > checks whether the stubroutine is NULL, because that mprotect
> workaround
> > > is older than 8074552. But I planned to change this to
> > > CanUseSafeFetch(). But with the new CanUseSafeFetch, this would just
> > > break AIX.
> > >
> > > 2) In SAPs code base, we also use it to harden error log writing (e.g.
> > > to make callstack dumping more reliable). This may can be invoked very
> > > early, if a crash happens very early, before stub routines are
> > > available. We planned to contribute those improvements piece by piece
> to
> > > the OpenJDK - my first change for 8074552 was actually supposed to
> > > prepare that. Now, however, this makes no sense, because using
> > > SafeFetch() to harden error log writing would actually be worse for
> > > error logs for early crashes.
> > >
> > > SafeFetch is a very basic function which could be useful in many more
> > > contexts, but only if you can really be sure it is safe. The contexts
> it
> > > is used in may be very low level where you cannot be sure at which
> point
> > > in VM initialization you are.
> > >
> > > When I introduced CanUseSafeFetch(), it was to safeguard the use of
> > > SafeFetch() for situations where SafeFetch was not safe to call.
> Exactly
> > > so that I would have not to think about "is this too early to use
> > > SafeFetch()?". This is broken now - instead of guarding me against
> using
> > > SafeFetch() in situations where it is not safe to use, it will
> > > assert(debug) or potentially lie(release), in which case SafeFetch
> > > itself may crash.
> >
> > Okay I see a case for adding a check for "can I use SafeFetch at this
> > point in time" (though if what you are fetching is potentially not-safe
> > then you are in trouble!).
> >
> > >         So now
> > >         - CanUseSafeFetch really now means "SafeFetch will behave as
> > >         excpected"
> > >         - SafeFetch is softened to "will safely fetch content of
> address
> > >         unless on zero, there it may crash the VM".
> > >
> > >         I think this is more confusing than before. In objectMonitor,
> > >         the logic
> > >         we want is "use SafeFetch wherever possible but if not, just
> > >         access the
> > >         memory directly". This logic now is hidden behind a seemingly
> > >         incorrect
> > >         use of SafeFetch without CanUseSafeFetch, and I think Goetz
> proposal
> > >         would have been far more clearer.
> > >
> > >
> > >     No the problem is that SafeFetch doesn't work on ZERO - if you give
> > >     it a bad address it won't recover it will crash. The whole point is
> > >     for it not to crash. The objectMonitor code is saying "here is a
> > >     load that we know may be invalid, but if it is then we don't want
> to
> > >     crash, so we use SafeFetch".
> > >
> > >
> > > Yes! But the way it is coded, it does not. Because SafeFetch is now
> > > implemented on zero with a simple load, it is not safe at all. The code
> > > in objectMonitor.cpp appears to be safe, but it is not. And SafeFetch()
> > > promises to be safe, but it is not, and you would not know unless you
> > > know the zero implementation.
> >
> > Sorry did I miss something - has Zero ever implemented SafeFetch as
> > other than a direct simple load? ie hasn't Zero always been broken wrt
> > SafeFetch and that is the basic problem ?
>
> Yes, Zero's implementation was broken before and that's the basic
> problem. By re-introducing SafeFetch for Zero with the old code, that
> part of the code is of course not fixed.
>
> Here is the dilemma, though, how would a correct SafeFetch
> implementation for Zero look like? It *cannot* use ucontext's mcontext
> as this gets us into machine dependent code and a Zero JVM may be
> compiled on any architecture where there is a GCC (e.g. arm32, ppc32).
> That's the whole point of Zero: To get a JVM on an architecture where
> there is no JIT port. What's more, those architectures where Zero might
> get compiled on are not known ahead of time.
>
> > > It seems to be a bit inconsequent. Either one is really sure the
> pointer
> > > is accessible - in which case why use SafeFetch? - or it may be
> invalid,
> > > even if this is unusual - in which case why is the resulting crash not
> a
> > > problem for zero?
> > > In other words, if the original programmer thought it worth to guard
> the
> > > memory access with SafeFetch(), why is this not a concern for zero?
> >
> > It is - which is why Zero is broken.
> >
> > >     The expectation is that SafeFetch actually is safe no matter what
> is
> > >     fetched.
> > >
> > >
> > > Yes, and that expectation is broken now.
> >
> > When was it not broken?
> >
> > > Before, if you used SafeFetch() and guarded it with CanUseSafeFetch(),
> > > you could be sure to be safe and not to break anything - regardless
> when
> > > in VM life you used the function or on which platform.
> >
> > No you only knew that a function called SafeFetch could be called - you
> > didn't know it was actually implemented to actually always do a "safe
> > fetch".
> >
> > > Both SafeFetch and CanUseSafeFetch made promises:
> > > - CanUseSafeFetch promised to tell me if it is too early to call
> > > SafeFetch. Of course, it itself could be called at all times, that was
> > > the whole point of CanUseSafeFetch.
> > > - SafeFetch promised to be able to access a potentially invalid pointer
> > > and not to crash - if CanUseSafeFetch returned true.
> >
> > SafeFetch (if it exists) promises to be safe. The ZERO implementation
> > breaks that.
> >
> > > Both promises are broken now. CanUseSafeFetch is not safe to call if it
> > > is too early - ironically a condition it was designed to tell me.
> >
> > Okay we will re-fix that.
> >
> > > SafeFetch is not safe to call because it lies about being safe for
> zero.
> >
> > It never was safe to call as far as I can see, it was just never called
> > with an invalid value. The new test exposed this.
>
> I think the problem right now is that the current implementation of
> CanUseSafeFetch can no longer be called unconditionally at any point in
> time of JVM init on NotZero() JVMs.
>
> Cheers,
> Severin
>
> > David
> > -----
> >
> > > The result is that either if we continue to use SafeFetch, we risk
> > > crashes on zero and crashes if invoked too early. This makes the API
> > > useless for most cases.
> > >
> > > I am open to any other suggestions - renaming the APIs, maybe reshaping
> > > them - but I regret seeing SafeFetch being back to being quite unsafe
> again.
> > >
> > > Kind Regards, Thomas
> > >
> > >
> > >     The ZERO implementation just does a load and hopes it isn't asked
> > >     for anything bad. The objectMonitor code will rarely actually
> > >     trigger a bad load. But the newly introduced test does.
> > >
> > >     So in my opinion to return to the status-quo where ZERO really
> > >     doesn't implement SafeFetch safely but it "works okay in practice",
> > >     we should disable those explicit tests when running on ZERO.
> > >
> > >     Whether CanUseSafeFetch (aka is SafeFetchStubAvailable) is
> > >     implemented explicitly or implicitly as an assert is a side-issue.
> > >
> > >     David
> > >
> > >         Kind Regards, Thomas
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >         On Thu, Mar 26, 2015 at 8:40 AM, Lindenmaier, Goetz
> > >         <goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>
> > >         <mailto:goetz.lindenmaier at sap.__com
> > >         <mailto:goetz.lindenmaier at sap.com>>> wrote:
> > >
> > >              Hi,
> > >
> > >              sorry I only comment on this now ...
> > >              I think it's rather strange that you can not use SafeFetch
> > >         on zero,
> > >              but it's there,
> > >              and it's actually used ... which will be the situation
> > >         after this
> > >              change.
> > >
> > >              Why not just change objectMonitor as the SafeFetch api is
> > >         meant to
> > >              be used?
> > >              Obviously zero takes care that this access always
> succeeds:
> > >
> > >              diff -r a35854c84d9d
> src/share/vm/runtime/__objectMonitor.cpp
> > >              --- a/src/share/vm/runtime/__objectMonitor.cpp    Wed Mar
> > >         25 12:36:28
> > >              2015 +0100
> > >              +++ b/src/share/vm/runtime/__objectMonitor.cpp    Thu Mar
> > >         26 08:19:53
> > >              2015 +0100
> > >              @@ -2241,7 +2241,12 @@
> > >                  }
> > >
> > >                  assert(sizeof(((JavaThread *)ox)->_thread_state ==
> > >         sizeof(int)),
> > >              "invariant");
> > >              -  int jst = SafeFetch32((int *) &((JavaThread *)
> > >              ox)->_thread_state, -1);;
> > >              +  int jst;
> > >              +  if (CanUseSafeFetch32()) {
> > >              +    jst = SafeFetch32((int *) &((JavaThread *)
> > >         ox)->_thread_state, -1);
> > >              +  } else {
> > >              +    jst = ((JavaThread *)ox)->_thread_state;
> > >              +  }
> > >                  // consider also: jst != _thread_in_Java -- but that's
> > >         overspecific.
> > >                  return jst == _thread_blocked || jst ==
> _thread_in_native;
> > >                }
> > >
> > >              It's handled similarly in vmError.cpp.  And no #ifdefs ;)
> > >
> > >              But I'm also fine with the other change ... to get zero
> > >         working again.
> > >
> > >              Best regards,
> > >                 Goetz.
> > >
> > >
> > >
> > >              -----Original Message-----
> > >              From: hotspot-dev
> > >         [mailto:hotspot-dev-bounces at __openjdk.java.net
> > >         <mailto:hotspot-dev-bounces at openjdk.java.net>
> > >              <mailto:hotspot-dev-bounces at __openjdk.java.net
> > >         <mailto:hotspot-dev-bounces at openjdk.java.net>>] On Behalf Of
> Coleen
> > >              Phillimore
> > >              Sent: Thursday, March 26, 2015 3:30 AM
> > >              To: David Holmes; hotspot-dev developers
> > >              Subject: Re: RFR (xs) 8075967: Zero interpreter asserts
> for
> > >              SafeFetch<32, N> calls in ObjectMonitor
> > >
> > >
> > >              On 3/25/15, 10:15 PM, David Holmes wrote:
> > >               > On 26/03/2015 11:00 AM, Coleen Phillimore wrote:
> > >               >>
> > >               >> On 3/25/15, 8:21 PM, David Holmes wrote:
> > >               >>> On 26/03/2015 10:04 AM, Coleen Phillimore wrote:
> > >               >>>>
> > >               >>>> On 3/25/15, 7:59 PM, David Holmes wrote:
> > >               >>>>> Hi Coleen,
> > >               >>>>>
> > >               >>>>> Why generate stubs that can't be used and then add
> > >         zero-specific
> > >               >>>>> logic
> > >               >>>>> to the shared CanUseSafeFetchN instead of not
> > >         generating the
> > >              stubs so
> > >               >>>>> that CanUseSafeFetchN will return false anyway ??
> > >               >>>>
> > >               >>>> Because there is platform independent code in
> > >              objectMonitor.cpp that
> > >               >>>> uses SafeFetchX (both).  I'd rather not burden the
> > >         caller of
> > >              this with
> > >               >>>> testing CanUseSafeFetchX for each call.   This code
> > >         existed before
> > >               >>>> SafeFetch was implemented, so I'm restoring previous
> > >         behavior for
> > >               >>>> Zero.
> > >               >>>
> > >               >>> Sorry Coleen I thought this was deal to with 8075533,
> > >         I hadn't
> > >               >>> realized 8075533 broke the objectMonitor code and
> this
> > >         was a
> > >              follow up.
> > >               >>>
> > >               >>> What a mess. :(
> > >               >>
> > >               >> Yes, it is.  Is this a code Review?
> > >               >
> > >               > Let me work through this :) The original change for
> 8074552:
> > >               >
> > >               >
> > >
> http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__3eb61269f421
> > >         <
> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/3eb61269f421>
> > >               >
> > >               > added the extra error handling stuff related to
> > >         SafeFetch and a new
> > >               > ErrorHandler test that was predicated on
> > >         CanUseSafeFetch32 which
> > >               > returns true if the stubs exist. It also added testing
> of
> > >              SafeFetch in
> > >               > StubRoutines::initialize2 which was excluded for
> > >         windows-32-bit and
> > >               > which did not check CanUseSafeFetch32.
> > >               >
> > >               > That change broke zero as reported in 8075533 because
> > >         you can't
> > >               > actually use the SafeFetch routines with an unsafe
> > >         address on Zero,
> > >               > and the solution:
> > >               >
> > >               >
> > >
> http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__5c2bc6278fc4
> > >         <
> http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5c2bc6278fc4>
> > >               >
> > >               > was to remove the stubs so that CanUseSafeFetch32 would
> > >         return false,
> > >               > and to include a check of CanUseSafeFetch32 in the
> tests
> > >         used with
> > >               > StubRoutines::initialize2.
> > >               >
> > >               > And now we find that the SafeFetch routines are used
> > >         unconditionally
> > >               > by the objectMonitor code, so we need to restore the
> > >         stubs that were
> > >               > previously removed, but force CanUseSafeFetch32 to
> > >         return false
> > >              on zero.
> > >
> > >              Yes, exactly.
> > >               >
> > >               > Ok. Only nit is that:
> > >               >
> > >               > return NOT_ZERO(true) ZERO_ONLY(false);
> > >
> > >              Yes, you're right.  I'll make that change.
> > >
> > >              Thanks!
> > >              Coleen
> > >
> > >               >
> > >               > would be more readable than the ifdefs.
> > >               >
> > >               > Thanks,
> > >               > David
> > >               >
> > >               >> thanks!
> > >               >> Coleen
> > >               >>>
> > >               >>> Thanks,
> > >               >>> David
> > >               >>>
> > >               >>>
> > >               >>>
> > >               >>>> We could file an RFE to either implement SafeFetch
> > >         for Zero or
> > >              rewrite
> > >               >>>> this objectMonitor code to not need SafeFetch.  I
> > >         didn't want
> > >              to do
> > >               >>>> either of these things with this patch.
> > >               >>>>
> > >               >>>> Coleen
> > >               >>>>
> > >               >>>>>
> > >               >>>>> Thanks,
> > >               >>>>> David
> > >               >>>>>
> > >               >>>>> On 26/03/2015 3:53 AM, Coleen Phillimore wrote:
> > >               >>>>>> Summary: Implement SafeFetchX unsafely and make
> > >         CanUseSafeFetchX
> > >               >>>>>> false
> > >               >>>>>> for Zero
> > >               >>>>>>
> > >               >>>>>> Also, this fixes a couple other minor issues.
> > >               >>>>>>
> > >               >>>>>> Ran jdk9 jck tests with one timeout.
> > >         hotspot/runtime jtreg
> > >              tests
> > >               >>>>>> don't
> > >               >>>>>> run because Zero doesn't support UseCompressedOops
> > >         (not sure
> > >              why)
> > >               >>>>>> and
> > >               >>>>>> CDS (know why).
> > >               >>>>>>
> > >               >>>>>> open webrev at
> > >         http://cr.openjdk.java.net/~__coleenp/8075967/
> > >         <http://cr.openjdk.java.net/~coleenp/8075967/>
> > >               >>>>>> bug link
> > >         https://bugs.openjdk.java.net/__browse/JDK-8075967
> > >         <https://bugs.openjdk.java.net/browse/JDK-8075967>
> > >               >>>>>>
> > >               >>>>>> Thanks,
> > >               >>>>>> Coleen
> > >               >>>>
> > >               >>
> > >
> > >
> > >
>
>
>
>