RFR(L): 8226705: [REDO] Deoptimize with handshakes
Robbin Ehn
robbin.ehn at oracle.com
Thu Aug 29 12:25:45 UTC 2019
Hi Erik, thanks for having a look.
Just some short answers and questions.
On 8/29/19 12:01 PM, Erik Österlund wrote:
> Hi Robbin,
>
> Glad to see this work making it back again. Last time this patch was proposed,
> there were no guards against non-monotonic nmethod state transitions. Today
> there are (since https://bugs.openjdk.java.net/browse/JDK-8224674). In other
> words, if you call make_not_entrant() on an nmethod that has racingly become
> zombie, it is today impossible to resurrect that nmethod. Instead,
> make_not_entrant() will return false.
Great, I'll have a look.
>
> In particular, I think that is what the following code in the patch is guarding
> against (in codeCache.cpp):
>
> 1150 // Mark methods for deopt (if safe or possible).
> 1151 void CodeCache::mark_all_nmethods_for_deoptimization() {
> 1152 MutexLocker mu(CodeCache_lock, Mutex::_no_safepoint_check_flag);
> 1153 CompiledMethodIterator
> iter(CompiledMethodIterator::only_alive_and_not_unloading);
> 1154 while(iter.next()) {
> 1155 CompiledMethod* nm = iter.method();
> 1156 if (!nm->method()->is_method_handle_intrinsic() &&
> 1157 !nm->is_not_installed() &&
> 1158 nm->is_in_use() &&
> 1159 !nm->is_native_method()) {
> 1160 // Intrinsics and native methods are never deopted. A method that is
> 1161 // not installed yet or is not in use is not safe to deopt; the
> 1162 // is_in_use() check covers the not_entrant and not zombie cases.
> 1163 // Note: A not_entrant method can become a zombie at anytime if it was
> 1164 // made not_entrant before the previous safepoint/handshake.
> 1165 nm->mark_for_deoptimization();
> 1166 }
> 1167 }
> 1168 }
>
> In fact, this code is also guarding against is_not_installed nmethods (which was
> introduced as a new function). If we find that an nmethod is bad and say it
> needs deoptimization, and that nmethod is just about to get installed, it seems
> to me that skipping that nmethod and not making it not entrant is really
> dangerous and will eventually lead to a crash. Because in ciEnv.cpp we set_code
> to the about to be installed nmethod, and it will get called without anything
> stopping the invalid code from being executed by mutators, and it was being
> invalidated for a reason. So basically we really do have to make these not
> entrant, or we will blow up.
>
> I guess the reason you needed this check is because make_in_use() doesn't check
> what the current state is, causing monotonicity problems, making a potentially
> already not entrant nmethod become in_use, eventually blowing up asserts and
> probably the VM. Regrettably, make_in_use() still does that, because I had no
> need for changing it to use nmethod::try_transition in 8224674, nobody
> previously could find these nmethods. In retrospect, not changing that was a bit
> silly. So if we just change make_in_use() to use nmethod::try_transition
> instead, then you can remove the !is_not_installed check... which I argue we
> have to do. The reason we never ran in to this before is because we made the
> nmethods not entrant in a safepoint, and between making an nmethod and making it
> in_use, there is no safepoint check, so they were never observed in this state.
>
Yes, thread A is creating a new method X, it have state not installed.
If we make method X not entrant, Thread A will flip it back to in_use, which bad
and we crash. mark_all_nmethods_for_deoptimization is a test method used in the
new jtreg test, which is the only use (I see dtrace have some hack to use it, it
can crash the vm).
> ... and similarly here:
>
> 1192 if (nm->is_marked_for_deoptimization() && nm->is_in_use()) {
> 1193 // only_alive_and_not_unloading() can return not_entrant nmethods.
> 1194 // A not_entrant method can become a zombie at anytime if it was
> 1195 // made not_entrant before the previous safepoint/handshake. The
> 1196 // is_in_use() check covers the not_entrant and not zombie cases
> 1197 // that have become true after the method was marked for deopt.
> 1198 nm->make_not_entrant();
> 1199 }
>
> ... we now don't need the guard against is_in_use() here.
Here we don't need that, but we don't need !nm->is_not_entrant() either, since
it would return false as you say!?
If we should have an early filter it think nm->is_in_use() is clearer than:
not not entrant ;) (since the not_installed is normally not seen here)
>
> For what I propose to work out, we need to change nmethod::make_in_use to use
> nmethod::try_transition, and AOTCompiledMethod::make_entrant() needs to also
> return false when encountering a not_entrant nmethod, enforcing monotonicity,
> instead of asserting about that. AOT methods may turn not_in_use AOT methods to
> in_use, but not not_entrant to in_use. Once not_entrant, an AOT methods should
> remain not entrant forever.
The only time we see methods which are not installed should be methods that will
never be deoptimize (except when calling the test method
mark_all_nmethods_for_deoptimization), intrinsic and native.
And if I remember correctly make_in_use is called in a constructor which
requires extra error-handling. Construction would fail if try_transition failed.
The assumption about these 'never deopted' methods seem to be just that :)
>
> So basically, these guards are for something that used to be racy and dangerous
> due to the lack of guards inside of the state transitions, and today that is
> completely harmless, as the proper guards are in the attempts to change state. I
> would prefer to remove these guards, as it is confusing to the reader that we
> are guarding against a problem that can't happen. So I think you can remove the
> various checks for what state the nmethod is in, the comments describing races,
> and just keep the checks if it's a method handle intrinsic or native method.
I'll revisit this. Was a few months since I did this.
>
> In deoptimize.cpp, the fetch_unroll_info_helper function has moved down, making
> it difficult to review as shows the code delta as if everything changed. So I
> can't see what actually changed there. :( Would you mind moving that code back
> so the webrev shows what changed there?
I have inserted two new static functions before fetch_unroll_info_helper,
since code from fetch_unroll_info_helper are in these functions diff do this.
They are static function they are best off before, and putting after them I
don't think would help the diff tool + I need fwd decl.
Suggestion?
Thanks, Robbin
>
> Thanks,
> /Erik
>
> On 2019-08-28 11:57, Robbin Ehn wrote:
>> Hi, hotspot-dev was the intended list.
>>
>> Thanks, Robbin
>>
>> On 2019-08-28 09:30, Robbin Ehn wrote:
>>> Hi all, please consider,
>>>
>>> To get away from the issues of us revoking in the handshake, but before deopt
>>> the locks get re-biased before we hit deopt handler, we instead revoke in deopt
>>> handler.
>>> The deopt handler have a JRT_BLOCK/_END which can safepoint so we revoke after
>>> that but before we start looking at the monitors, with a NoSafepointVerifier.
>>>
>>> Here is the previous changeset on top of jdk/jdk tip but due to merge conflicts
>>> some pieces did not apply:
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/8226705-8221734-baseline/webrev/
>>> So this is what was reviewed.
>>>
>>> The rebase (merge conflict resolutions) and 8224795 bug fix :
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/8226705-rebase/webrev/
>>>
>>> Bug 8224795 fix is here:
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/8226705-rebase/webrev/src/hotspot/share/code/nmethod.cpp.sdiff.html
>>>
>>>
>>> After this we have the same functionally as the reverted change-set.
>>>
>>> Here is the changes for doing the revoke in deopt handler instead:
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/8226705-revoke-on-deopt/webrev/
>>>
>>> This code was messy adding the deopt revocation in 3 places made it worse.
>>> Here is a refactoring of that code. (also removed a dead method in
>>> biasedlocking):
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/8226705-refactor/webrev/
>>>
>>> And here is full:
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/full/webrev/
>>>
>>> Also a full squashed patch file:
>>> http://cr.openjdk.java.net/~rehn/8226705/v1/full/full.patch
>>>
>>> Latest test run I did t1-t6, Linux/Windows x64 have passed, MacOSX still
>>> running.
>>>
>>> Thanks, Robbin
>>>
>>>
More information about the hotspot-dev
mailing list