RFR (xs) 8075967: Zero interpreter asserts for SafeFetch<32, N> calls in ObjectMonitor

Thu Mar 26 13:46:39 UTC 2015

Hi David,

you are right, it was also broken before. I missed that Coleen just
reintroduced an earlier version of the dummy SafeFetch() implementation,
sorry for that.

I would be happy with a CanUseSafeFetch() which returns always false on
zero and false on other platforms before stub generation. That would be
always safe to call.

Kind Regards, Thomas

On Thu, Mar 26, 2015 at 1:38 PM, David Holmes <david.holmes at oracle.com>
wrote:

> On 26/03/2015 10:24 PM, Thomas Stüfe wrote:
>
>> Hi David,
>>
>> On Thu, Mar 26, 2015 at 11:42 AM, David Holmes <david.holmes at oracle.com
>> <mailto:david.holmes at oracle.com>> wrote:
>>
>>     Hi Thomas,
>>
>>     On 26/03/2015 6:35 PM, Thomas Stüfe wrote:
>>
>>         Hi,
>>
>>         I agree with Goetz.
>>
>>         SafeFetch() was meant to be guarded with CanUseSafeFetch(), this
>>         is part
>>         of the contract, so using it without this guard is actually
>>         incorrect
>>         and would crash were the calling code ever called before Stub
>>         routine
>>         initialization.
>>
>>
>>     That contract was already implicit - this API is for use after the
>>     stubs have been generated. There's no reason to expand the use of
>>     this API to situations before stubs are generated. So the assert
>>     captures that and CanUseSafeFetch is really unnecessary.
>>
>>
>> We had to introduce CanUseSafeFetch() exactly because we have situations
>> where we may call SafeFetch before stub routine generation:
>>
>> 1) In os_aix.cpp, we use it to probe protected memory after doing
>> mprotect(). This is a long story, but we had to add this workaround
>> because on shortcomings in AIX's mprotect(). This code will get called
>> during initialization before stub routines are set up.
>> Note that right now, in the OpenJDK code base, the code still directly
>> checks whether the stubroutine is NULL, because that mprotect workaround
>> is older than 8074552. But I planned to change this to
>> CanUseSafeFetch(). But with the new CanUseSafeFetch, this would just
>> break AIX.
>>
>> 2) In SAPs code base, we also use it to harden error log writing (e.g.
>> to make callstack dumping more reliable). This may can be invoked very
>> early, if a crash happens very early, before stub routines are
>> available. We planned to contribute those improvements piece by piece to
>> the OpenJDK - my first change for 8074552 was actually supposed to
>> prepare that. Now, however, this makes no sense, because using
>> SafeFetch() to harden error log writing would actually be worse for
>> error logs for early crashes.
>>
>> SafeFetch is a very basic function which could be useful in many more
>> contexts, but only if you can really be sure it is safe. The contexts it
>> is used in may be very low level where you cannot be sure at which point
>> in VM initialization you are.
>>
>> When I introduced CanUseSafeFetch(), it was to safeguard the use of
>> SafeFetch() for situations where SafeFetch was not safe to call. Exactly
>> so that I would have not to think about "is this too early to use
>> SafeFetch()?". This is broken now - instead of guarding me against using
>> SafeFetch() in situations where it is not safe to use, it will
>> assert(debug) or potentially lie(release), in which case SafeFetch
>> itself may crash.
>>
>
> Okay I see a case for adding a check for "can I use SafeFetch at this
> point in time" (though if what you are fetching is potentially not-safe
> then you are in trouble!).
>
>          So now
>>         - CanUseSafeFetch really now means "SafeFetch will behave as
>>         excpected"
>>         - SafeFetch is softened to "will safely fetch content of address
>>         unless on zero, there it may crash the VM".
>>
>>         I think this is more confusing than before. In objectMonitor,
>>         the logic
>>         we want is "use SafeFetch wherever possible but if not, just
>>         access the
>>         memory directly". This logic now is hidden behind a seemingly
>>         incorrect
>>         use of SafeFetch without CanUseSafeFetch, and I think Goetz
>> proposal
>>         would have been far more clearer.
>>
>>
>>     No the problem is that SafeFetch doesn't work on ZERO - if you give
>>     it a bad address it won't recover it will crash. The whole point is
>>     for it not to crash. The objectMonitor code is saying "here is a
>>     load that we know may be invalid, but if it is then we don't want to
>>     crash, so we use SafeFetch".
>>
>>
>> Yes! But the way it is coded, it does not. Because SafeFetch is now
>> implemented on zero with a simple load, it is not safe at all. The code
>> in objectMonitor.cpp appears to be safe, but it is not. And SafeFetch()
>> promises to be safe, but it is not, and you would not know unless you
>> know the zero implementation.
>>
>
> Sorry did I miss something - has Zero ever implemented SafeFetch as other
> than a direct simple load? ie hasn't Zero always been broken wrt SafeFetch
> and that is the basic problem ?
>
>  It seems to be a bit inconsequent. Either one is really sure the pointer
>> is accessible - in which case why use SafeFetch? - or it may be invalid,
>> even if this is unusual - in which case why is the resulting crash not a
>> problem for zero?
>> In other words, if the original programmer thought it worth to guard the
>> memory access with SafeFetch(), why is this not a concern for zero?
>>
>
> It is - which is why Zero is broken.
>
>      The expectation is that SafeFetch actually is safe no matter what is
>>     fetched.
>>
>>
>> Yes, and that expectation is broken now.
>>
>
> When was it not broken?
>
>  Before, if you used SafeFetch() and guarded it with CanUseSafeFetch(),
>> you could be sure to be safe and not to break anything - regardless when
>> in VM life you used the function or on which platform.
>>
>
> No you only knew that a function called SafeFetch could be called - you
> didn't know it was actually implemented to actually always do a "safe
> fetch".
>
>  Both SafeFetch and CanUseSafeFetch made promises:
>> - CanUseSafeFetch promised to tell me if it is too early to call
>> SafeFetch. Of course, it itself could be called at all times, that was
>> the whole point of CanUseSafeFetch.
>> - SafeFetch promised to be able to access a potentially invalid pointer
>> and not to crash - if CanUseSafeFetch returned true.
>>
>
> SafeFetch (if it exists) promises to be safe. The ZERO implementation
> breaks that.
>
>  Both promises are broken now. CanUseSafeFetch is not safe to call if it
>> is too early - ironically a condition it was designed to tell me.
>>
>
> Okay we will re-fix that.
>
>  SafeFetch is not safe to call because it lies about being safe for zero.
>>
>
> It never was safe to call as far as I can see, it was just never called
> with an invalid value. The new test exposed this.
>
> David
> -----
>
>  The result is that either if we continue to use SafeFetch, we risk
>> crashes on zero and crashes if invoked too early. This makes the API
>> useless for most cases.
>>
>> I am open to any other suggestions - renaming the APIs, maybe reshaping
>> them - but I regret seeing SafeFetch being back to being quite unsafe
>> again.
>>
>> Kind Regards, Thomas
>>
>>
>>     The ZERO implementation just does a load and hopes it isn't asked
>>     for anything bad. The objectMonitor code will rarely actually
>>     trigger a bad load. But the newly introduced test does.
>>
>>     So in my opinion to return to the status-quo where ZERO really
>>     doesn't implement SafeFetch safely but it "works okay in practice",
>>     we should disable those explicit tests when running on ZERO.
>>
>>     Whether CanUseSafeFetch (aka is SafeFetchStubAvailable) is
>>     implemented explicitly or implicitly as an assert is a side-issue.
>>
>>     David
>>
>>         Kind Regards, Thomas
>>
>>
>>
>>
>>
>>
>>
>>         On Thu, Mar 26, 2015 at 8:40 AM, Lindenmaier, Goetz
>>         <goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>
>>         <mailto:goetz.lindenmaier at sap.__com
>>         <mailto:goetz.lindenmaier at sap.com>>> wrote:
>>
>>              Hi,
>>
>>              sorry I only comment on this now ...
>>              I think it's rather strange that you can not use SafeFetch
>>         on zero,
>>              but it's there,
>>              and it's actually used ... which will be the situation
>>         after this
>>              change.
>>
>>              Why not just change objectMonitor as the SafeFetch api is
>>         meant to
>>              be used?
>>              Obviously zero takes care that this access always succeeds:
>>
>>              diff -r a35854c84d9d src/share/vm/runtime/__
>> objectMonitor.cpp
>>              --- a/src/share/vm/runtime/__objectMonitor.cpp    Wed Mar
>>         25 12:36:28
>>              2015 +0100
>>              +++ b/src/share/vm/runtime/__objectMonitor.cpp    Thu Mar
>>
>>         26 08:19:53
>>              2015 +0100
>>              @@ -2241,7 +2241,12 @@
>>                  }
>>
>>                  assert(sizeof(((JavaThread *)ox)->_thread_state ==
>>         sizeof(int)),
>>              "invariant");
>>              -  int jst = SafeFetch32((int *) &((JavaThread *)
>>              ox)->_thread_state, -1);;
>>              +  int jst;
>>              +  if (CanUseSafeFetch32()) {
>>              +    jst = SafeFetch32((int *) &((JavaThread *)
>>         ox)->_thread_state, -1);
>>              +  } else {
>>              +    jst = ((JavaThread *)ox)->_thread_state;
>>              +  }
>>                  // consider also: jst != _thread_in_Java -- but that's
>>         overspecific.
>>                  return jst == _thread_blocked || jst ==
>> _thread_in_native;
>>                }
>>
>>              It's handled similarly in vmError.cpp.  And no #ifdefs ;)
>>
>>              But I'm also fine with the other change ... to get zero
>>         working again.
>>
>>              Best regards,
>>                 Goetz.
>>
>>
>>
>>              -----Original Message-----
>>              From: hotspot-dev
>>         [mailto:hotspot-dev-bounces at __openjdk.java.net
>>         <mailto:hotspot-dev-bounces at openjdk.java.net>
>>              <mailto:hotspot-dev-bounces at __openjdk.java.net
>>         <mailto:hotspot-dev-bounces at openjdk.java.net>>] On Behalf Of
>> Coleen
>>              Phillimore
>>              Sent: Thursday, March 26, 2015 3:30 AM
>>              To: David Holmes; hotspot-dev developers
>>              Subject: Re: RFR (xs) 8075967: Zero interpreter asserts for
>>              SafeFetch<32, N> calls in ObjectMonitor
>>
>>
>>              On 3/25/15, 10:15 PM, David Holmes wrote:
>>               > On 26/03/2015 11:00 AM, Coleen Phillimore wrote:
>>               >>
>>               >> On 3/25/15, 8:21 PM, David Holmes wrote:
>>               >>> On 26/03/2015 10:04 AM, Coleen Phillimore wrote:
>>               >>>>
>>               >>>> On 3/25/15, 7:59 PM, David Holmes wrote:
>>               >>>>> Hi Coleen,
>>               >>>>>
>>               >>>>> Why generate stubs that can't be used and then add
>>         zero-specific
>>               >>>>> logic
>>               >>>>> to the shared CanUseSafeFetchN instead of not
>>         generating the
>>              stubs so
>>               >>>>> that CanUseSafeFetchN will return false anyway ??
>>               >>>>
>>               >>>> Because there is platform independent code in
>>              objectMonitor.cpp that
>>               >>>> uses SafeFetchX (both).  I'd rather not burden the
>>         caller of
>>              this with
>>               >>>> testing CanUseSafeFetchX for each call.   This code
>>         existed before
>>               >>>> SafeFetch was implemented, so I'm restoring previous
>>         behavior for
>>               >>>> Zero.
>>               >>>
>>               >>> Sorry Coleen I thought this was deal to with 8075533,
>>         I hadn't
>>               >>> realized 8075533 broke the objectMonitor code and this
>>         was a
>>              follow up.
>>               >>>
>>               >>> What a mess. :(
>>               >>
>>               >> Yes, it is.  Is this a code Review?
>>               >
>>               > Let me work through this :) The original change for
>> 8074552:
>>               >
>>               >
>>         http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__
>> 3eb61269f421
>>         <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/3eb61269f421>
>>               >
>>               > added the extra error handling stuff related to
>>         SafeFetch and a new
>>               > ErrorHandler test that was predicated on
>>         CanUseSafeFetch32 which
>>               > returns true if the stubs exist. It also added testing of
>>              SafeFetch in
>>               > StubRoutines::initialize2 which was excluded for
>>         windows-32-bit and
>>               > which did not check CanUseSafeFetch32.
>>               >
>>               > That change broke zero as reported in 8075533 because
>>         you can't
>>               > actually use the SafeFetch routines with an unsafe
>>         address on Zero,
>>               > and the solution:
>>               >
>>               >
>>         http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__
>> 5c2bc6278fc4
>>
>>         <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5c2bc6278fc4>
>>               >
>>               > was to remove the stubs so that CanUseSafeFetch32 would
>>         return false,
>>               > and to include a check of CanUseSafeFetch32 in the tests
>>         used with
>>               > StubRoutines::initialize2.
>>               >
>>               > And now we find that the SafeFetch routines are used
>>         unconditionally
>>               > by the objectMonitor code, so we need to restore the
>>         stubs that were
>>               > previously removed, but force CanUseSafeFetch32 to
>>         return false
>>              on zero.
>>
>>              Yes, exactly.
>>               >
>>               > Ok. Only nit is that:
>>               >
>>               > return NOT_ZERO(true) ZERO_ONLY(false);
>>
>>              Yes, you're right.  I'll make that change.
>>
>>              Thanks!
>>              Coleen
>>
>>               >
>>               > would be more readable than the ifdefs.
>>               >
>>               > Thanks,
>>               > David
>>               >
>>               >> thanks!
>>               >> Coleen
>>               >>>
>>               >>> Thanks,
>>               >>> David
>>               >>>
>>               >>>
>>               >>>
>>               >>>> We could file an RFE to either implement SafeFetch
>>         for Zero or
>>              rewrite
>>               >>>> this objectMonitor code to not need SafeFetch.  I
>>         didn't want
>>              to do
>>               >>>> either of these things with this patch.
>>               >>>>
>>               >>>> Coleen
>>               >>>>
>>               >>>>>
>>               >>>>> Thanks,
>>               >>>>> David
>>               >>>>>
>>               >>>>> On 26/03/2015 3:53 AM, Coleen Phillimore wrote:
>>               >>>>>> Summary: Implement SafeFetchX unsafely and make
>>         CanUseSafeFetchX
>>               >>>>>> false
>>               >>>>>> for Zero
>>               >>>>>>
>>               >>>>>> Also, this fixes a couple other minor issues.
>>               >>>>>>
>>               >>>>>> Ran jdk9 jck tests with one timeout.
>>         hotspot/runtime jtreg
>>              tests
>>               >>>>>> don't
>>               >>>>>> run because Zero doesn't support UseCompressedOops
>>         (not sure
>>              why)
>>               >>>>>> and
>>               >>>>>> CDS (know why).
>>               >>>>>>
>>               >>>>>> open webrev at
>>         http://cr.openjdk.java.net/~__coleenp/8075967/
>>         <http://cr.openjdk.java.net/~coleenp/8075967/>
>>               >>>>>> bug link
>>         https://bugs.openjdk.java.net/__browse/JDK-8075967
>>         <https://bugs.openjdk.java.net/browse/JDK-8075967>
>>               >>>>>>
>>               >>>>>> Thanks,
>>               >>>>>> Coleen
>>               >>>>
>>               >>
>>
>>
>>
>>