RFR (xs) 8075967: Zero interpreter asserts for SafeFetch<32, N> calls in ObjectMonitor
Thomas Stüfe
thomas.stuefe at gmail.com
Fri Mar 27 06:46:34 UTC 2015
Colleen, David,
can we all agree an Colleens proposal:
inline bool CanUseSafeFetch32() {
return NOT_ZERO(StubRoutines::SafeFetch32_stub() != NULL)
ZERO_ONLY(false);
}
?
If that is ok, I make a patch for that.
As for objectMonitor:
>> So I thought of changing the objectMonitor code to do
>> if (CanUseSafeFetchN) {
>> SafeFetchN(...)
>> } else {
>> just fetch unsafely
>> }
>>
>
> This serves no purpose and adds overheads in the monitor code. We don't
> need the "have the stubs been generated" check for this code. And the "can
> I use ZERO" check would be pointless: if safefetch on zero doesn't work
> then SafeFetch may crash; if you take the else then it may also crash -
> same result both code paths, so no point.
>
>
But this documents the intent clearly. We want to access the memory safely,
unless we cannot do SafeFetch, then we risk a direct load.
That said, I have not many emotions here. I think in the long run SafeFetch
can/will be implemented on zero.
Kind Regards, Thomas
David
>
> There's also a SafeFetch32 call around that same place that would also
>> have to be changed this way.
>>
>> But I thought it was messy to add this conditional code since the result
>> for Zero would be the exactly the same in this place. So I chose
>> CanUseSafeFetch to mean SafeFetch isn't safe and protected code the test
>> code that way and not the ObjectMonitor code.
>>
>> I don't mind if this is reversed as long as it's not messy and is tested.
>>
>> Coleen
>>
>> On 3/26/15, 9:46 AM, Thomas Stüfe wrote:
>>
>>> Hi David,
>>>
>>> you are right, it was also broken before. I missed that Coleen just
>>> reintroduced an earlier version of the dummy SafeFetch()
>>> implementation, sorry for that.
>>>
>>> I would be happy with a CanUseSafeFetch() which returns always false
>>> on zero and false on other platforms before stub generation. That
>>> would be always safe to call.
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>> On Thu, Mar 26, 2015 at 1:38 PM, David Holmes <david.holmes at oracle.com
>>> <mailto:david.holmes at oracle.com>> wrote:
>>>
>>> On 26/03/2015 10:24 PM, Thomas Stüfe wrote:
>>>
>>> Hi David,
>>>
>>> On Thu, Mar 26, 2015 at 11:42 AM, David Holmes
>>> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>
>>> <mailto:david.holmes at oracle.com
>>>
>>> <mailto:david.holmes at oracle.com>>> wrote:
>>>
>>> Hi Thomas,
>>>
>>> On 26/03/2015 6:35 PM, Thomas Stüfe wrote:
>>>
>>> Hi,
>>>
>>> I agree with Goetz.
>>>
>>> SafeFetch() was meant to be guarded with
>>> CanUseSafeFetch(), this
>>> is part
>>> of the contract, so using it without this guard is
>>> actually
>>> incorrect
>>> and would crash were the calling code ever called
>>> before Stub
>>> routine
>>> initialization.
>>>
>>>
>>> That contract was already implicit - this API is for use
>>> after the
>>> stubs have been generated. There's no reason to expand the
>>> use of
>>> this API to situations before stubs are generated. So the
>>> assert
>>> captures that and CanUseSafeFetch is really unnecessary.
>>>
>>>
>>> We had to introduce CanUseSafeFetch() exactly because we have
>>> situations
>>> where we may call SafeFetch before stub routine generation:
>>>
>>> 1) In os_aix.cpp, we use it to probe protected memory after doing
>>> mprotect(). This is a long story, but we had to add this
>>> workaround
>>> because on shortcomings in AIX's mprotect(). This code will
>>> get called
>>> during initialization before stub routines are set up.
>>> Note that right now, in the OpenJDK code base, the code still
>>> directly
>>> checks whether the stubroutine is NULL, because that mprotect
>>> workaround
>>> is older than 8074552. But I planned to change this to
>>> CanUseSafeFetch(). But with the new CanUseSafeFetch, this
>>> would just
>>> break AIX.
>>>
>>> 2) In SAPs code base, we also use it to harden error log
>>> writing (e.g.
>>> to make callstack dumping more reliable). This may can be
>>> invoked very
>>> early, if a crash happens very early, before stub routines are
>>> available. We planned to contribute those improvements piece
>>> by piece to
>>> the OpenJDK - my first change for 8074552 was actually supposed
>>> to
>>> prepare that. Now, however, this makes no sense, because using
>>> SafeFetch() to harden error log writing would actually be
>>> worse for
>>> error logs for early crashes.
>>>
>>> SafeFetch is a very basic function which could be useful in
>>> many more
>>> contexts, but only if you can really be sure it is safe. The
>>> contexts it
>>> is used in may be very low level where you cannot be sure at
>>> which point
>>> in VM initialization you are.
>>>
>>> When I introduced CanUseSafeFetch(), it was to safeguard the
>>> use of
>>> SafeFetch() for situations where SafeFetch was not safe to
>>> call. Exactly
>>> so that I would have not to think about "is this too early to use
>>> SafeFetch()?". This is broken now - instead of guarding me
>>> against using
>>> SafeFetch() in situations where it is not safe to use, it will
>>> assert(debug) or potentially lie(release), in which case
>>> SafeFetch
>>> itself may crash.
>>>
>>>
>>> Okay I see a case for adding a check for "can I use SafeFetch at
>>> this point in time" (though if what you are fetching is
>>> potentially not-safe then you are in trouble!).
>>>
>>> So now
>>> - CanUseSafeFetch really now means "SafeFetch will
>>> behave as
>>> excpected"
>>> - SafeFetch is softened to "will safely fetch content
>>> of address
>>> unless on zero, there it may crash the VM".
>>>
>>> I think this is more confusing than before. In
>>> objectMonitor,
>>> the logic
>>> we want is "use SafeFetch wherever possible but if
>>> not, just
>>> access the
>>> memory directly". This logic now is hidden behind a
>>> seemingly
>>> incorrect
>>> use of SafeFetch without CanUseSafeFetch, and I think
>>> Goetz proposal
>>> would have been far more clearer.
>>>
>>>
>>> No the problem is that SafeFetch doesn't work on ZERO - if
>>> you give
>>> it a bad address it won't recover it will crash. The whole
>>> point is
>>> for it not to crash. The objectMonitor code is saying
>>> "here is a
>>> load that we know may be invalid, but if it is then we
>>> don't want to
>>> crash, so we use SafeFetch".
>>>
>>>
>>> Yes! But the way it is coded, it does not. Because SafeFetch
>>> is now
>>> implemented on zero with a simple load, it is not safe at all.
>>> The code
>>> in objectMonitor.cpp appears to be safe, but it is not. And
>>> SafeFetch()
>>> promises to be safe, but it is not, and you would not know
>>> unless you
>>> know the zero implementation.
>>>
>>>
>>> Sorry did I miss something - has Zero ever implemented SafeFetch
>>> as other than a direct simple load? ie hasn't Zero always been
>>> broken wrt SafeFetch and that is the basic problem ?
>>>
>>> It seems to be a bit inconsequent. Either one is really sure
>>> the pointer
>>> is accessible - in which case why use SafeFetch? - or it may
>>> be invalid,
>>> even if this is unusual - in which case why is the resulting
>>> crash not a
>>> problem for zero?
>>> In other words, if the original programmer thought it worth to
>>> guard the
>>> memory access with SafeFetch(), why is this not a concern for
>>> zero?
>>>
>>>
>>> It is - which is why Zero is broken.
>>>
>>> The expectation is that SafeFetch actually is safe no
>>> matter what is
>>> fetched.
>>>
>>>
>>> Yes, and that expectation is broken now.
>>>
>>>
>>> When was it not broken?
>>>
>>> Before, if you used SafeFetch() and guarded it with
>>> CanUseSafeFetch(),
>>> you could be sure to be safe and not to break anything -
>>> regardless when
>>> in VM life you used the function or on which platform.
>>>
>>>
>>> No you only knew that a function called SafeFetch could be called
>>> - you didn't know it was actually implemented to actually always
>>> do a "safe fetch".
>>>
>>> Both SafeFetch and CanUseSafeFetch made promises:
>>> - CanUseSafeFetch promised to tell me if it is too early to call
>>> SafeFetch. Of course, it itself could be called at all times,
>>> that was
>>> the whole point of CanUseSafeFetch.
>>> - SafeFetch promised to be able to access a potentially
>>> invalid pointer
>>> and not to crash - if CanUseSafeFetch returned true.
>>>
>>>
>>> SafeFetch (if it exists) promises to be safe. The ZERO
>>> implementation breaks that.
>>>
>>> Both promises are broken now. CanUseSafeFetch is not safe to
>>> call if it
>>> is too early - ironically a condition it was designed to tell me.
>>>
>>>
>>> Okay we will re-fix that.
>>>
>>> SafeFetch is not safe to call because it lies about being safe
>>> for zero.
>>>
>>>
>>> It never was safe to call as far as I can see, it was just never
>>> called with an invalid value. The new test exposed this.
>>>
>>> David
>>> -----
>>>
>>> The result is that either if we continue to use SafeFetch, we
>>> risk
>>> crashes on zero and crashes if invoked too early. This makes
>>> the API
>>> useless for most cases.
>>>
>>> I am open to any other suggestions - renaming the APIs, maybe
>>> reshaping
>>> them - but I regret seeing SafeFetch being back to being quite
>>> unsafe again.
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>> The ZERO implementation just does a load and hopes it
>>> isn't asked
>>> for anything bad. The objectMonitor code will rarely actually
>>> trigger a bad load. But the newly introduced test does.
>>>
>>> So in my opinion to return to the status-quo where ZERO
>>> really
>>> doesn't implement SafeFetch safely but it "works okay in
>>> practice",
>>> we should disable those explicit tests when running on ZERO.
>>>
>>> Whether CanUseSafeFetch (aka is SafeFetchStubAvailable) is
>>> implemented explicitly or implicitly as an assert is a
>>> side-issue.
>>>
>>> David
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Mar 26, 2015 at 8:40 AM, Lindenmaier, Goetz
>>> <goetz.lindenmaier at sap.com
>>> <mailto:goetz.lindenmaier at sap.com>
>>> <mailto:goetz.lindenmaier at sap.com
>>> <mailto:goetz.lindenmaier at sap.com>>
>>> <mailto:goetz.lindenmaier at sap.
>>> <mailto:goetz.lindenmaier at sap.>__com
>>> <mailto:goetz.lindenmaier at sap.com
>>>
>>> <mailto:goetz.lindenmaier at sap.com>>>> wrote:
>>>
>>> Hi,
>>>
>>> sorry I only comment on this now ...
>>> I think it's rather strange that you can not use
>>> SafeFetch
>>> on zero,
>>> but it's there,
>>> and it's actually used ... which will be the
>>> situation
>>> after this
>>> change.
>>>
>>> Why not just change objectMonitor as the
>>> SafeFetch api is
>>> meant to
>>> be used?
>>> Obviously zero takes care that this access always
>>> succeeds:
>>>
>>> diff -r a35854c84d9d
>>> src/share/vm/runtime/__objectMonitor.cpp
>>> --- a/src/share/vm/runtime/__objectMonitor.cpp
>>> Wed Mar
>>> 25 12:36:28
>>> 2015 +0100
>>> +++ b/src/share/vm/runtime/__objectMonitor.cpp
>>> Thu Mar
>>>
>>> 26 08:19:53
>>> 2015 +0100
>>> @@ -2241,7 +2241,12 @@
>>> }
>>>
>>> assert(sizeof(((JavaThread
>>> *)ox)->_thread_state ==
>>> sizeof(int)),
>>> "invariant");
>>> - int jst = SafeFetch32((int *) &((JavaThread *)
>>> ox)->_thread_state, -1);;
>>> + int jst;
>>> + if (CanUseSafeFetch32()) {
>>> + jst = SafeFetch32((int *) &((JavaThread *)
>>> ox)->_thread_state, -1);
>>> + } else {
>>> + jst = ((JavaThread *)ox)->_thread_state;
>>> + }
>>> // consider also: jst != _thread_in_Java --
>>> but that's
>>> overspecific.
>>> return jst == _thread_blocked || jst ==
>>> _thread_in_native;
>>> }
>>>
>>> It's handled similarly in vmError.cpp. And no
>>> #ifdefs ;)
>>>
>>> But I'm also fine with the other change ... to
>>> get zero
>>> working again.
>>>
>>> Best regards,
>>> Goetz.
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: hotspot-dev
>>> [mailto:hotspot-dev-bounces@
>>> <mailto:hotspot-dev-bounces@>__openjdk.java.net
>>> <http://openjdk.java.net>
>>> <mailto:hotspot-dev-bounces at openjdk.java.net
>>> <mailto:hotspot-dev-bounces at openjdk.java.net>>
>>> <mailto:hotspot-dev-bounces@
>>> <mailto:hotspot-dev-bounces@>__openjdk.java.net
>>> <http://openjdk.java.net>
>>> <mailto:hotspot-dev-bounces at openjdk.java.net
>>>
>>> <mailto:hotspot-dev-bounces at openjdk.java.net>>>] On Behalf Of
>>> Coleen
>>> Phillimore
>>> Sent: Thursday, March 26, 2015 3:30 AM
>>> To: David Holmes; hotspot-dev developers
>>> Subject: Re: RFR (xs) 8075967: Zero interpreter
>>> asserts for
>>> SafeFetch<32, N> calls in ObjectMonitor
>>>
>>>
>>> On 3/25/15, 10:15 PM, David Holmes wrote:
>>> > On 26/03/2015 11:00 AM, Coleen Phillimore wrote:
>>> >>
>>> >> On 3/25/15, 8:21 PM, David Holmes wrote:
>>> >>> On 26/03/2015 10:04 AM, Coleen Phillimore
>>> wrote:
>>> >>>>
>>> >>>> On 3/25/15, 7:59 PM, David Holmes wrote:
>>> >>>>> Hi Coleen,
>>> >>>>>
>>> >>>>> Why generate stubs that can't be used and
>>> then add
>>> zero-specific
>>> >>>>> logic
>>> >>>>> to the shared CanUseSafeFetchN instead of not
>>> generating the
>>> stubs so
>>> >>>>> that CanUseSafeFetchN will return false
>>> anyway ??
>>> >>>>
>>> >>>> Because there is platform independent code in
>>> objectMonitor.cpp that
>>> >>>> uses SafeFetchX (both). I'd rather not
>>> burden the
>>> caller of
>>> this with
>>> >>>> testing CanUseSafeFetchX for each call.
>>> This code
>>> existed before
>>> >>>> SafeFetch was implemented, so I'm restoring
>>> previous
>>> behavior for
>>> >>>> Zero.
>>> >>>
>>> >>> Sorry Coleen I thought this was deal to with
>>> 8075533,
>>> I hadn't
>>> >>> realized 8075533 broke the objectMonitor
>>> code and this
>>> was a
>>> follow up.
>>> >>>
>>> >>> What a mess. :(
>>> >>
>>> >> Yes, it is. Is this a code Review?
>>> >
>>> > Let me work through this :) The original
>>> change for 8074552:
>>> >
>>> >
>>> http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__
>>> 3eb61269f421
>>>
>>> <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/3eb61269f421>
>>> >
>>> > added the extra error handling stuff related to
>>> SafeFetch and a new
>>> > ErrorHandler test that was predicated on
>>> CanUseSafeFetch32 which
>>> > returns true if the stubs exist. It also added
>>> testing of
>>> SafeFetch in
>>> > StubRoutines::initialize2 which was excluded for
>>> windows-32-bit and
>>> > which did not check CanUseSafeFetch32.
>>> >
>>> > That change broke zero as reported in 8075533
>>> because
>>> you can't
>>> > actually use the SafeFetch routines with an
>>> unsafe
>>> address on Zero,
>>> > and the solution:
>>> >
>>> >
>>> http://hg.openjdk.java.net/__jdk9/hs-rt/hotspot/rev/__
>>> 5c2bc6278fc4
>>>
>>>
>>>
>>> <http://hg.openjdk.java.net/jdk9/hs-rt/hotspot/rev/5c2bc6278fc4>
>>> >
>>> > was to remove the stubs so that
>>> CanUseSafeFetch32 would
>>> return false,
>>> > and to include a check of CanUseSafeFetch32 in
>>> the tests
>>> used with
>>> > StubRoutines::initialize2.
>>> >
>>> > And now we find that the SafeFetch routines
>>> are used
>>> unconditionally
>>> > by the objectMonitor code, so we need to
>>> restore the
>>> stubs that were
>>> > previously removed, but force CanUseSafeFetch32
>>> to
>>> return false
>>> on zero.
>>>
>>> Yes, exactly.
>>> >
>>> > Ok. Only nit is that:
>>> >
>>> > return NOT_ZERO(true) ZERO_ONLY(false);
>>>
>>> Yes, you're right. I'll make that change.
>>>
>>> Thanks!
>>> Coleen
>>>
>>> >
>>> > would be more readable than the ifdefs.
>>> >
>>> > Thanks,
>>> > David
>>> >
>>> >> thanks!
>>> >> Coleen
>>> >>>
>>> >>> Thanks,
>>> >>> David
>>> >>>
>>> >>>
>>> >>>
>>> >>>> We could file an RFE to either implement
>>> SafeFetch
>>> for Zero or
>>> rewrite
>>> >>>> this objectMonitor code to not need
>>> SafeFetch. I
>>> didn't want
>>> to do
>>> >>>> either of these things with this patch.
>>> >>>>
>>> >>>> Coleen
>>> >>>>
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> David
>>> >>>>>
>>> >>>>> On 26/03/2015 3:53 AM, Coleen Phillimore
>>> wrote:
>>> >>>>>> Summary: Implement SafeFetchX unsafely
>>> and make
>>> CanUseSafeFetchX
>>> >>>>>> false
>>> >>>>>> for Zero
>>> >>>>>>
>>> >>>>>> Also, this fixes a couple other minor
>>> issues.
>>> >>>>>>
>>> >>>>>> Ran jdk9 jck tests with one timeout.
>>> hotspot/runtime jtreg
>>> tests
>>> >>>>>> don't
>>> >>>>>> run because Zero doesn't support
>>> UseCompressedOops
>>> (not sure
>>> why)
>>> >>>>>> and
>>> >>>>>> CDS (know why).
>>> >>>>>>
>>> >>>>>> open webrev at
>>> http://cr.openjdk.java.net/~__coleenp/8075967/
>>> <http://cr.openjdk.java.net/%7E__coleenp/8075967/>
>>> <http://cr.openjdk.java.net/~coleenp/8075967/
>>> <http://cr.openjdk.java.net/%7Ecoleenp/8075967/>>
>>> >>>>>> bug link
>>> https://bugs.openjdk.java.net/__browse/JDK-8075967
>>> <https://bugs.openjdk.java.net/browse/JDK-8075967>
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Coleen
>>> >>>>
>>> >>
>>>
>>>
>>>
>>>
>>>
>>
More information about the hotspot-dev
mailing list