RFR (xs) 8075967: Zero interpreter asserts for SafeFetch<32, N> calls in ObjectMonitor

Thu Mar 26 20:18:12 UTC 2015

On 27/03/2015 3:41 AM, Coleen Phillimore wrote:
>
> So I thought of changing the objectMonitor code to do
>    if (CanUseSafeFetchN) {
>      SafeFetchN(...)
>    } else {
>       just fetch unsafely
>    }

This serves no purpose and adds overheads in the monitor code. We don't 
need the "have the stubs been generated" check for this code. And the 
"can I use ZERO" check would be pointless: if safefetch on zero doesn't 
work then SafeFetch may crash; if you take the else then it may also 
crash - same result both code paths, so no point.

David

> There's also a SafeFetch32 call around that same place that would also
> have to be changed this way.
>
> But I thought it was messy to add this conditional code since the result
> for Zero would be the exactly the same in this place.  So I chose
> CanUseSafeFetch to mean SafeFetch isn't safe and protected code the test
> code that way and not the ObjectMonitor code.
>
> I don't mind if this is reversed as long as it's not messy and is tested.
>
> Coleen
>
> On 3/26/15, 9:46 AM, Thomas Stüfe wrote:
>> Hi David,
>>
>> you are right, it was also broken before. I missed that Coleen just
>> reintroduced an earlier version of the dummy SafeFetch()
>> implementation, sorry for that.
>>
>> I would be happy with a CanUseSafeFetch() which returns always false
>> on zero and false on other platforms before stub generation. That
>> would be always safe to call.
>>
>> Kind Regards, Thomas
>>
>>
>> On Thu, Mar 26, 2015 at 1:38 PM, David Holmes <david.holmes at oracle.com
>> <mailto:david.holmes at oracle.com>> wrote:
>>
>>     On 26/03/2015 10:24 PM, Thomas Stüfe wrote:
>>
>>         Hi David,
>>
>>         On Thu, Mar 26, 2015 at 11:42 AM, David Holmes
>>         <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>         <mailto:david.holmes at oracle.com
>>         <mailto:david.holmes at oracle.com>>> wrote:
>>
>>             Hi Thomas,
>>
>>             On 26/03/2015 6:35 PM, Thomas Stüfe wrote:
>>
>>                 Hi,
>>
>>                 I agree with Goetz.
>>
>>                 SafeFetch() was meant to be guarded with
>>         CanUseSafeFetch(), this
>>                 is part
>>                 of the contract, so using it without this guard is
>>         actually
>>                 incorrect
>>                 and would crash were the calling code ever called
>>         before Stub
>>                 routine
>>                 initialization.
>>
>>
>>             That contract was already implicit - this API is for use
>>         after the
>>             stubs have been generated. There's no reason to expand the
>>         use of
>>             this API to situations before stubs are generated. So the
>>         assert
>>             captures that and CanUseSafeFetch is really unnecessary.
>>
>>
>>         We had to introduce CanUseSafeFetch() exactly because we have
>>         situations
>>         where we may call SafeFetch before stub routine generation:
>>
>>         1) In os_aix.cpp, we use it to probe protected memory after doing
>>         mprotect(). This is a long story, but we had to add this
>>         workaround
>>         because on shortcomings in AIX's mprotect(). This code will
>>         get called
>>         during initialization before stub routines are set up.
>>         Note that right now, in the OpenJDK code base, the code still
>>         directly
>>         checks whether the stubroutine is NULL, because that mprotect
>>         workaround
>>         is older than 8074552. But I planned to change this to
>>         CanUseSafeFetch(). But with the new CanUseSafeFetch, this
>>         would just
>>         break AIX.
>>
>>         2) In SAPs code base, we also use it to harden error log
>>         writing (e.g.
>>         to make callstack dumping more reliable). This may can be
>>         invoked very
>>         early, if a crash happens very early, before stub routines are
>>         available. We planned to contribute those improvements piece
>>         by piece to
>>         the OpenJDK - my first change for 8074552 was actually supposed to
>>         prepare that. Now, however, this makes no sense, because using
>>         SafeFetch() to harden error log writing would actually be
>>         worse for
>>         error logs for early crashes.
>>
>>         SafeFetch is a very basic function which could be useful in
>>         many more
>>         contexts, but only if you can really be sure it is safe. The
>>         contexts it
>>         is used in may be very low level where you cannot be sure at
>>         which point
>>         in VM initialization you are.
>>
>>         When I introduced CanUseSafeFetch(), it was to safeguard the
>>         use of
>>         SafeFetch() for situations where SafeFetch was not safe to
>>         call. Exactly
>>         so that I would have not to think about "is this too early to use
>>         SafeFetch()?". This is broken now - instead of guarding me
>>         against using
>>         SafeFetch() in situations where it is not safe to use, it will
>>         assert(debug) or potentially lie(release), in which case SafeFetch
>>         itself may crash.
>>
>>
>>     Okay I see a case for adding a check for "can I use SafeFetch at
>>     this point in time" (though if what you are fetching is
>>     potentially not-safe then you are in trouble!).
>>
>>                 So now
>>                 - CanUseSafeFetch really now means "SafeFetch will
>>         behave as
>>                 excpected"
>>                 - SafeFetch is softened to "will safely fetch content
>>         of address
>>                 unless on zero, there it may crash the VM".
>>
>>                 I think this is more confusing than before. In
>>         objectMonitor,
>>                 the logic
>>                 we want is "use SafeFetch wherever possible but if
>>         not, just
>>                 access the
>>                 memory directly". This logic now is hidden behind a
>>         seemingly
>>                 incorrect
>>                 use of SafeFetch without CanUseSafeFetch, and I think
>>         Goetz proposal
>>                 would have been far more clearer.
>>
>>
>>             No the problem is that SafeFetch doesn't work on ZERO - if
>>         you give
>>             it a bad address it won't recover it will crash. The whole
>>         point is
>>             for it not to crash. The objectMonitor code is saying
>>         "here is a
>>             load that we know may be invalid, but if it is then we
>>         don't want to
>>             crash, so we use SafeFetch".
>>
>>
>>         Yes! But the way it is coded, it does not. Because SafeFetch
>>         is now
>>         implemented on zero with a simple load, it is not safe at all.
>>         The code
>>         in objectMonitor.cpp appears to be safe, but it is not. And
>>         SafeFetch()
>>         promises to be safe, but it is not, and you would not know
>>         unless you
>>         know the zero implementation.
>>
>>
>>     Sorry did I miss something - has Zero ever implemented SafeFetch
>>     as other than a direct simple load? ie hasn't Zero always been
>>     broken wrt SafeFetch and that is the basic problem ?
>>
>>         It seems to be a bit inconsequent. Either one is really sure
>>         the pointer
>>         is accessible - in which case why use SafeFetch? - or it may
>>         be invalid,
>>         even if this is unusual - in which case why is the resulting
>>         crash not a
>>         problem for zero?
>>         In other words, if the original programmer thought it worth to
>>         guard the
>>         memory access with SafeFetch(), why is this not a concern for
>>         zero?
>>
>>
>>     It is - which is why Zero is broken.
>>
>>             The expectation is that SafeFetch actually is safe no
>>         matter what is
>>             fetched.
>>
>>
>>         Yes, and that expectation is broken now.
>>
>>
>>     When was it not broken?
>>
>>         Before, if you used SafeFetch() and guarded it with
>>         CanUseSafeFetch(),
>>         you could be sure to be safe and not to break anything -
>>         regardless when
>>         in VM life you used the function or on which platform.
>>
>>
>>     No you only knew that a function called SafeFetch could be called
>>     - you didn't know it was actually implemented to actually always
>>     do a "safe fetch".
>>
>>         Both SafeFetch and CanUseSafeFetch made promises:
>>         - CanUseSafeFetch promised to tell me if it is too early to call
>>         SafeFetch. Of course, it itself could be called at all times,
>>         that was
>>         the whole point of CanUseSafeFetch.
>>         - SafeFetch promised to be able to access a potentially
>>         invalid pointer
>>         and not to crash - if CanUseSafeFetch returned true.
>>
>>
>>     SafeFetch (if it exists) promises to be safe. The ZERO
>>     implementation breaks that.
>>
>>         Both promises are broken now. CanUseSafeFetch is not safe to
>>         call if it
>>         is too early - ironically a condition it was designed to tell me.
>>
>>
>>     Okay we will re-fix that.
>>
>>         SafeFetch is not safe to call because it lies about being safe
>>         for zero.
>>
>>
>>     It never was safe to call as far as I can see, it was just never
>>     called with an invalid value. The new test exposed this.
>>
>>     David
>>     -----
>>
>>         The result is that either if we continue to use SafeFetch, we risk
>>         crashes on zero and crashes if invoked too early. This makes
>>         the API
>>         useless for most cases.
>>
>>         I am open to any other suggestions - renaming the APIs, maybe
>>         reshaping
>>         them - but I regret seeing SafeFetch being back to being quite
>>         unsafe again.
>>
>>         Kind Regards, Thomas
>>
>>
>>             The ZERO implementation just does a load and hopes it
>>         isn't asked
>>             for anything bad. The objectMonitor code will rarely actually
>>             trigger a bad load. But the newly introduced test does.
>>
>>             So in my opinion to return to the status-quo where ZERO really
>>             doesn't implement SafeFetch safely but it "works okay in
>>         practice",
>>             we should disable those explicit tests when running on ZERO.
>>
>>             Whether CanUseSafeFetch (aka is SafeFetchStubAvailable) is
>>             implemented explicitly or implicitly as an assert is a
>>         side-issue.
>>
>>             David
>>
>>                 Kind Regards, Thomas
>>
>>
>>
>>
>>
>>
>>
>>                 On Thu, Mar 26, 2015 at 8:40 AM, Lindenmaier, Goetz
>>                 <goetz.lindenmaier at sap.com
>>         <mailto:goetz.lindenmaier at sap.com>
>>         <mailto:goetz.lindenmaier at sap.com
>>         <mailto:goetz.lindenmaier at sap.com>>
>>                 <mailto:goetz.lindenmaier at sap.
>>         <mailto:goetz.lindenmaier at sap.>__com
>>                 <mailto:goetz.lindenmaier at sap.com
>>         <mailto:goetz.lindenmaier at sap.com>>>> wrote:
>>
>>                      Hi,
>>
>>                      sorry I only comment on this now ...
>>                      I think it's rather strange that you can not use
>>         SafeFetch
>>                 on zero,
>>                      but it's there,
>>                      and it's actually used ... which will be the
>>         situation
>>                 after this
>>                      change.
>>
>>                      Why not just change objectMonitor as the
>>         SafeFetch api is
>>                 meant to
>>                      be used?
>>                      Obviously zero takes care that this access always
>>         succeeds:
>>
>>                      diff -r a35854c84d9d
>>         src/share/vm/runtime/__objectMonitor.cpp
>>                      --- a/src/share/vm/runtime/__objectMonitor.cpp
>>         Wed Mar
>>                 25 12:36:28
>>                      2015 +0100
>>                      +++ b/src/share/vm/runtime/__objectMonitor.cpp
>>         Thu Mar
>>
>>                 26 08:19:53
>>                      2015 +0100
>>                      @@ -2241,7 +2241,12 @@
>>                          }
>>
>>                          assert(sizeof(((JavaThread
>>         *)ox)->_thread_state ==
>>                 sizeof(int)),
>>                      "invariant");
>>                      -  int jst = SafeFetch32((int *) &((JavaThread *)
>>                      ox)->_thread_state, -1);;
>>                      +  int jst;
>>                      +  if (CanUseSafeFetch32()) {
>>                      +    jst = SafeFetch32((int *) &((JavaThread *)
>>                 ox)->_thread_state, -1);
>>                      +  } else {
>>                      +    jst = ((JavaThread *)ox)->_thread_state;
>>                      +  }
>>                          // consider also: jst != _thread_in_Java --
>>         but that's
>>                 overspecific.
>>                          return jst == _thread_blocked || jst ==
>>         _thread_in_native;
>>                        }
>>
>>                      It's handled similarly in vmError.cpp. And no
>>         #ifdefs ;)
>>
>>                      But I'm also fine with the other change ... to
>>         get zero
>>                 working again.
>>
>>                      Best regards,
>>                         Goetz.
>>
>>
>>
>>                      -----Original Message-----
>>                      From: hotspot-dev
>>                 [mailto:hotspot-dev-bounces@
>>         <mailto:hotspot-dev-bounces@>__openjdk.java.net
>>         <http://openjdk.java.net>
>>                 <mailto:hotspot-dev-bounces at openjdk.java.net
>>         <mailto:hotspot-dev-bounces at openjdk.java.net>>
>>                      <mailto:hotspot-dev-bounces@
>>         <mailto:hotspot-dev-bounces@>__openjdk.java.net
>>         <http://openjdk.java.net>
>>                 <mailto:hotspot-dev-bounces at openjdk.java.net
>>         <mailto:hotspot-dev-bounces at openjdk.java.net>>>] On Behalf Of
>>         Coleen
>>                      Phillimore
>>                      Sent: Thursday, March 26, 2015 3:30 AM
>>                      To: David Holmes; hotspot-dev developers
>>                      Subject: Re: RFR (xs) 8075967: Zero interpreter
>>         asserts for
>>                      SafeFetch<32, N> calls in ObjectMonitor
>>
>>
>>                      On 3/25/15, 10:15 PM, David Holmes wrote:
>>                       > On 26/03/2015 11:00 AM, Coleen Phillimore wrote:
>>                       >>
>>                       >> On 3/25/15, 8:21 PM, David Holmes wrote:
>>                       >>> On 26/03/2015 10:04 AM, Coleen Phillimore wrote:
>>                       >>>>
>>                       >>>> On 3/25/15, 7:59 PM, David Holmes wrote:
>>                       >>>>> Hi Coleen,
>>                       >>>>>
>>                       >>>>> Why generate stubs that can't be used and
>>         then add
>>                 zero-specific
>>                       >>>>> logic
>>                       >>>>> to the shared CanUseSafeFetchN instead of not
>>                 generating the
>>                      stubs so
>>                       >>>>> that CanUseSafeFetchN will return false
>>         anyway ??
>>                       >>>>
>>                       >>>> Because there is platform independent code in
>>                      objectMonitor.cpp that
>>                       >>>> uses SafeFetchX (both).  I'd rather not
>>         burden the
>>                 caller of
>>                      this with
>>                       >>>> testing CanUseSafeFetchX for each call.
>>          This code
>>                 existed before
>>                       >>>> SafeFetch was implemented, so I'm restoring
>>         previous
>>                 behavior for
>>                       >>>> Zero.
>>                       >>>
>>                       >>> Sorry Coleen I thought this was deal to with
>>         8075533,
>>                 I hadn't
>>                       >>> realized 8075533 broke the objectMonitor
>>         code and this
>>                 was a
>>                      follow up.
>>                       >>>
>>                       >>> What a mess. :(
>>                       >>
>>                       >> Yes, it is.  Is this a code Review?
>>                       >
>>                       > Let me work through this :) The original
>>         change for 8074552:
>>                       >
>>                       >
>>         http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__3eb61269f421
>>
>>         <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/3eb61269f421>
>>                       >
>>                       > added the extra error handling stuff related to
>>                 SafeFetch and a new
>>                       > ErrorHandler test that was predicated on
>>                 CanUseSafeFetch32 which
>>                       > returns true if the stubs exist. It also added
>>         testing of
>>                      SafeFetch in
>>                       > StubRoutines::initialize2 which was excluded for
>>                 windows-32-bit and
>>                       > which did not check CanUseSafeFetch32.
>>                       >
>>                       > That change broke zero as reported in 8075533
>>         because
>>                 you can't
>>                       > actually use the SafeFetch routines with an unsafe
>>                 address on Zero,
>>                       > and the solution:
>>                       >
>>                       >
>>         http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__5c2bc6278fc4
>>
>>
>>
>>         <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5c2bc6278fc4>
>>                       >
>>                       > was to remove the stubs so that
>>         CanUseSafeFetch32 would
>>                 return false,
>>                       > and to include a check of CanUseSafeFetch32 in
>>         the tests
>>                 used with
>>                       > StubRoutines::initialize2.
>>                       >
>>                       > And now we find that the SafeFetch routines
>>         are used
>>                 unconditionally
>>                       > by the objectMonitor code, so we need to
>>         restore the
>>                 stubs that were
>>                       > previously removed, but force CanUseSafeFetch32 to
>>                 return false
>>                      on zero.
>>
>>                      Yes, exactly.
>>                       >
>>                       > Ok. Only nit is that:
>>                       >
>>                       > return NOT_ZERO(true) ZERO_ONLY(false);
>>
>>                      Yes, you're right.  I'll make that change.
>>
>>                      Thanks!
>>                      Coleen
>>
>>                       >
>>                       > would be more readable than the ifdefs.
>>                       >
>>                       > Thanks,
>>                       > David
>>                       >
>>                       >> thanks!
>>                       >> Coleen
>>                       >>>
>>                       >>> Thanks,
>>                       >>> David
>>                       >>>
>>                       >>>
>>                       >>>
>>                       >>>> We could file an RFE to either implement
>>         SafeFetch
>>                 for Zero or
>>                      rewrite
>>                       >>>> this objectMonitor code to not need
>>         SafeFetch.  I
>>                 didn't want
>>                      to do
>>                       >>>> either of these things with this patch.
>>                       >>>>
>>                       >>>> Coleen
>>                       >>>>
>>                       >>>>>
>>                       >>>>> Thanks,
>>                       >>>>> David
>>                       >>>>>
>>                       >>>>> On 26/03/2015 3:53 AM, Coleen Phillimore
>>         wrote:
>>                       >>>>>> Summary: Implement SafeFetchX unsafely
>>         and make
>>                 CanUseSafeFetchX
>>                       >>>>>> false
>>                       >>>>>> for Zero
>>                       >>>>>>
>>                       >>>>>> Also, this fixes a couple other minor issues.
>>                       >>>>>>
>>                       >>>>>> Ran jdk9 jck tests with one timeout.
>>                 hotspot/runtime jtreg
>>                      tests
>>                       >>>>>> don't
>>                       >>>>>> run because Zero doesn't support
>>         UseCompressedOops
>>                 (not sure
>>                      why)
>>                       >>>>>> and
>>                       >>>>>> CDS (know why).
>>                       >>>>>>
>>                       >>>>>> open webrev at
>>         http://cr.openjdk.java.net/~__coleenp/8075967/
>>         <http://cr.openjdk.java.net/%7E__coleenp/8075967/>
>>                 <http://cr.openjdk.java.net/~coleenp/8075967/
>>         <http://cr.openjdk.java.net/%7Ecoleenp/8075967/>>
>>                       >>>>>> bug link
>>         https://bugs.openjdk.java.net/__browse/JDK-8075967
>>                 <https://bugs.openjdk.java.net/browse/JDK-8075967>
>>                       >>>>>>
>>                       >>>>>> Thanks,
>>                       >>>>>> Coleen
>>                       >>>>
>>                       >>
>>
>>
>>
>>
>