RFR (xs) 8075967: Zero interpreter asserts for SafeFetch<32, N> calls in ObjectMonitor

Thu Mar 26 12:24:18 UTC 2015

Hi David,

On Thu, Mar 26, 2015 at 11:42 AM, David Holmes <david.holmes at oracle.com>
wrote:

> Hi Thomas,
>
> On 26/03/2015 6:35 PM, Thomas Stüfe wrote:
>
>> Hi,
>>
>> I agree with Goetz.
>>
>> SafeFetch() was meant to be guarded with CanUseSafeFetch(), this is part
>> of the contract, so using it without this guard is actually incorrect
>> and would crash were the calling code ever called before Stub routine
>> initialization.
>>
>
> That contract was already implicit - this API is for use after the stubs
> have been generated. There's no reason to expand the use of this API to
> situations before stubs are generated. So the assert captures that and
> CanUseSafeFetch is really unnecessary.

We had to introduce CanUseSafeFetch() exactly because we have situations
where we may call SafeFetch before stub routine generation:

1) In os_aix.cpp, we use it to probe protected memory after doing
mprotect(). This is a long story, but we had to add this workaround because
on shortcomings in AIX's mprotect(). This code will get called during
initialization before stub routines are set up.
Note that right now, in the OpenJDK code base, the code still directly
checks whether the stubroutine is NULL, because that mprotect workaround is
older than 8074552. But I planned to change this to CanUseSafeFetch(). But
with the new CanUseSafeFetch, this would just break AIX.

2) In SAPs code base, we also use it to harden error log writing (e.g. to
make callstack dumping more reliable). This may can be invoked very early,
if a crash happens very early, before stub routines are available. We
planned to contribute those improvements piece by piece to the OpenJDK - my
first change for 8074552 was actually supposed to prepare that. Now,
however, this makes no sense, because using SafeFetch() to harden error log
writing would actually be worse for error logs for early crashes.

SafeFetch is a very basic function which could be useful in many more
contexts, but only if you can really be sure it is safe. The contexts it is
used in may be very low level where you cannot be sure at which point in VM
initialization you are.

When I introduced CanUseSafeFetch(), it was to safeguard the use of
SafeFetch() for situations where SafeFetch was not safe to call. Exactly so
that I would have not to think about "is this too early to use
SafeFetch()?". This is broken now - instead of guarding me against using
SafeFetch() in situations where it is not safe to use, it will
assert(debug) or potentially lie(release), in which case SafeFetch itself
may crash.

>
>  So now
>> - CanUseSafeFetch really now means "SafeFetch will behave as excpected"
>> - SafeFetch is softened to "will safely fetch content of address unless
>> on zero, there it may crash the VM".
>>
>> I think this is more confusing than before. In objectMonitor, the logic
>> we want is "use SafeFetch wherever possible but if not, just access the
>> memory directly". This logic now is hidden behind a seemingly incorrect
>> use of SafeFetch without CanUseSafeFetch, and I think Goetz proposal
>> would have been far more clearer.
>>
>
> No the problem is that SafeFetch doesn't work on ZERO - if you give it a
> bad address it won't recover it will crash. The whole point is for it not
> to crash. The objectMonitor code is saying "here is a load that we know may
> be invalid, but if it is then we don't want to crash, so we use SafeFetch".

Yes! But the way it is coded, it does not. Because SafeFetch is now
implemented on zero with a simple load, it is not safe at all. The code in
objectMonitor.cpp appears to be safe, but it is not. And SafeFetch()
promises to be safe, but it is not, and you would not know unless you know
the zero implementation.

It seems to be a bit inconsequent. Either one is really sure the pointer is
accessible - in which case why use SafeFetch? - or it may be invalid, even
if this is unusual - in which case why is the resulting crash not a problem
for zero?

In other words, if the original programmer thought it worth to guard the
memory access with SafeFetch(), why is this not a concern for zero?

The expectation is that SafeFetch actually is safe no matter what is
> fetched.

Yes, and that expectation is broken now.

Before, if you used SafeFetch() and guarded it with CanUseSafeFetch(), you
could be sure to be safe and not to break anything - regardless when in VM
life you used the function or on which platform.

Both SafeFetch and CanUseSafeFetch made promises:
- CanUseSafeFetch promised to tell me if it is too early to call SafeFetch.
Of course, it itself could be called at all times, that was the whole point
of CanUseSafeFetch.
- SafeFetch promised to be able to access a potentially invalid pointer and
not to crash - if CanUseSafeFetch returned true.

Both promises are broken now. CanUseSafeFetch is not safe to call if it is
too early - ironically a condition it was designed to tell me. SafeFetch is
not safe to call because it lies about being safe for zero.

The result is that either if we continue to use SafeFetch, we risk crashes
on zero and crashes if invoked too early. This makes the API useless for
most cases.

I am open to any other suggestions - renaming the APIs, maybe reshaping
them - but I regret seeing SafeFetch being back to being quite unsafe again.

Kind Regards, Thomas

The ZERO implementation just does a load and hopes it isn't asked for
> anything bad. The objectMonitor code will rarely actually trigger a bad
> load. But the newly introduced test does.
>
> So in my opinion to return to the status-quo where ZERO really doesn't
> implement SafeFetch safely but it "works okay in practice", we should
> disable those explicit tests when running on ZERO.
>
> Whether CanUseSafeFetch (aka is SafeFetchStubAvailable) is implemented
> explicitly or implicitly as an assert is a side-issue.
>
> David
>
>  Kind Regards, Thomas
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Mar 26, 2015 at 8:40 AM, Lindenmaier, Goetz
>> <goetz.lindenmaier at sap.com <mailto:goetz.lindenmaier at sap.com>> wrote:
>>
>>     Hi,
>>
>>     sorry I only comment on this now ...
>>     I think it's rather strange that you can not use SafeFetch on zero,
>>     but it's there,
>>     and it's actually used ... which will be the situation after this
>>     change.
>>
>>     Why not just change objectMonitor as the SafeFetch api is meant to
>>     be used?
>>     Obviously zero takes care that this access always succeeds:
>>
>>     diff -r a35854c84d9d src/share/vm/runtime/objectMonitor.cpp
>>     --- a/src/share/vm/runtime/objectMonitor.cpp    Wed Mar 25 12:36:28
>>     2015 +0100
>>     +++ b/src/share/vm/runtime/objectMonitor.cpp    Thu Mar 26 08:19:53
>>     2015 +0100
>>     @@ -2241,7 +2241,12 @@
>>         }
>>
>>         assert(sizeof(((JavaThread *)ox)->_thread_state == sizeof(int)),
>>     "invariant");
>>     -  int jst = SafeFetch32((int *) &((JavaThread *)
>>     ox)->_thread_state, -1);;
>>     +  int jst;
>>     +  if (CanUseSafeFetch32()) {
>>     +    jst = SafeFetch32((int *) &((JavaThread *) ox)->_thread_state,
>> -1);
>>     +  } else {
>>     +    jst = ((JavaThread *)ox)->_thread_state;
>>     +  }
>>         // consider also: jst != _thread_in_Java -- but that's
>> overspecific.
>>         return jst == _thread_blocked || jst == _thread_in_native;
>>       }
>>
>>     It's handled similarly in vmError.cpp.  And no #ifdefs ;)
>>
>>     But I'm also fine with the other change ... to get zero working again.
>>
>>     Best regards,
>>        Goetz.
>>
>>
>>
>>     -----Original Message-----
>>     From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net
>>     <mailto:hotspot-dev-bounces at openjdk.java.net>] On Behalf Of Coleen
>>     Phillimore
>>     Sent: Thursday, March 26, 2015 3:30 AM
>>     To: David Holmes; hotspot-dev developers
>>     Subject: Re: RFR (xs) 8075967: Zero interpreter asserts for
>>     SafeFetch<32, N> calls in ObjectMonitor
>>
>>
>>     On 3/25/15, 10:15 PM, David Holmes wrote:
>>      > On 26/03/2015 11:00 AM, Coleen Phillimore wrote:
>>      >>
>>      >> On 3/25/15, 8:21 PM, David Holmes wrote:
>>      >>> On 26/03/2015 10:04 AM, Coleen Phillimore wrote:
>>      >>>>
>>      >>>> On 3/25/15, 7:59 PM, David Holmes wrote:
>>      >>>>> Hi Coleen,
>>      >>>>>
>>      >>>>> Why generate stubs that can't be used and then add
>> zero-specific
>>      >>>>> logic
>>      >>>>> to the shared CanUseSafeFetchN instead of not generating the
>>     stubs so
>>      >>>>> that CanUseSafeFetchN will return false anyway ??
>>      >>>>
>>      >>>> Because there is platform independent code in
>>     objectMonitor.cpp that
>>      >>>> uses SafeFetchX (both).  I'd rather not burden the caller of
>>     this with
>>      >>>> testing CanUseSafeFetchX for each call.   This code existed
>> before
>>      >>>> SafeFetch was implemented, so I'm restoring previous behavior
>> for
>>      >>>> Zero.
>>      >>>
>>      >>> Sorry Coleen I thought this was deal to with 8075533, I hadn't
>>      >>> realized 8075533 broke the objectMonitor code and this was a
>>     follow up.
>>      >>>
>>      >>> What a mess. :(
>>      >>
>>      >> Yes, it is.  Is this a code Review?
>>      >
>>      > Let me work through this :) The original change for 8074552:
>>      >
>>      > http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/3eb61269f421
>>      >
>>      > added the extra error handling stuff related to SafeFetch and a new
>>      > ErrorHandler test that was predicated on CanUseSafeFetch32 which
>>      > returns true if the stubs exist. It also added testing of
>>     SafeFetch in
>>      > StubRoutines::initialize2 which was excluded for windows-32-bit and
>>      > which did not check CanUseSafeFetch32.
>>      >
>>      > That change broke zero as reported in 8075533 because you can't
>>      > actually use the SafeFetch routines with an unsafe address on Zero,
>>      > and the solution:
>>      >
>>      > http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5c2bc6278fc4
>>      >
>>      > was to remove the stubs so that CanUseSafeFetch32 would return
>> false,
>>      > and to include a check of CanUseSafeFetch32 in the tests used with
>>      > StubRoutines::initialize2.
>>      >
>>      > And now we find that the SafeFetch routines are used
>> unconditionally
>>      > by the objectMonitor code, so we need to restore the stubs that
>> were
>>      > previously removed, but force CanUseSafeFetch32 to return false
>>     on zero.
>>
>>     Yes, exactly.
>>      >
>>      > Ok. Only nit is that:
>>      >
>>      > return NOT_ZERO(true) ZERO_ONLY(false);
>>
>>     Yes, you're right.  I'll make that change.
>>
>>     Thanks!
>>     Coleen
>>
>>      >
>>      > would be more readable than the ifdefs.
>>      >
>>      > Thanks,
>>      > David
>>      >
>>      >> thanks!
>>      >> Coleen
>>      >>>
>>      >>> Thanks,
>>      >>> David
>>      >>>
>>      >>>
>>      >>>
>>      >>>> We could file an RFE to either implement SafeFetch for Zero or
>>     rewrite
>>      >>>> this objectMonitor code to not need SafeFetch.  I didn't want
>>     to do
>>      >>>> either of these things with this patch.
>>      >>>>
>>      >>>> Coleen
>>      >>>>
>>      >>>>>
>>      >>>>> Thanks,
>>      >>>>> David
>>      >>>>>
>>      >>>>> On 26/03/2015 3:53 AM, Coleen Phillimore wrote:
>>      >>>>>> Summary: Implement SafeFetchX unsafely and make
>> CanUseSafeFetchX
>>      >>>>>> false
>>      >>>>>> for Zero
>>      >>>>>>
>>      >>>>>> Also, this fixes a couple other minor issues.
>>      >>>>>>
>>      >>>>>> Ran jdk9 jck tests with one timeout.  hotspot/runtime jtreg
>>     tests
>>      >>>>>> don't
>>      >>>>>> run because Zero doesn't support UseCompressedOops (not sure
>>     why)
>>      >>>>>> and
>>      >>>>>> CDS (know why).
>>      >>>>>>
>>      >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8075967/
>>      >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8075967
>>      >>>>>>
>>      >>>>>> Thanks,
>>      >>>>>> Coleen
>>      >>>>
>>      >>
>>
>>
>>