RFR(XL): 8185640: Thread-local handshakes

Tue Nov 7 16:51:18 UTC 2017

On 11/7/17 10:08 AM, Doerr, Martin wrote:
> Hi Coleen,
>
> my point was not related to deoptimization. Maybe I was not clear enough. It could happen with -Xint (no deoptimization at all).
> 1. Java thread executes register_finalizer
> 2. Safepoint poll in Bytecodes::_return_register_finalizer bytecode gets hit
> 3. VM reexecutes the bytecode after safepoint (bcp still points to the Bytecodes::_return_register_finalizer bytecode)

This is my confusion then.  Why would the 
TemplateTable::_return_register_finalizer bytecode be reexecuted because 
of the safepoint if not for deoptimization?   I expect it to continue at 
the point in the template after the call_VM and not reregister the 
finalizer.
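
To pin down what is being asked, here is a toy model of the three possible
behaviors once a poll in the return bytecode has been hit. It is only a
sketch: none of it is HotSpot code, every name in it (Resume,
registrations_for_return, the flags) is made up for illustration, and it
assumes the poll sits after the registration call. It just counts how often
the finalizer would be registered.

```cpp
// Toy model, not HotSpot code. All names here are hypothetical.
#include <cassert>

enum class Bytecode { Return, ReturnRegisterFinalizer };

// What happens once the thread comes back from a poll that was hit.
enum class Resume {
    ContinueAfterPoll,  // carry on in the template after the poll
    SameBytecode,       // restart at bcp, which still points at the bytecode
    PlainReturn         // restart, but as a plain return (the deopt re-execute case)
};

// Returns how many times the finalizer is registered while executing one
// return bytecode. Assumption: the poll sits after the registration call.
int registrations_for_return(Bytecode bc, bool poll_armed, Resume resume,
                             bool skip_poll_for_finalizer_return) {
    int registrations = 0;
    bool first_pass = true;
    for (;;) {
        if (bc == Bytecode::ReturnRegisterFinalizer) {
            ++registrations;  // stands in for the call_VM that registers
        }
        bool poll_emitted = !(skip_poll_for_finalizer_return &&
                              bc == Bytecode::ReturnRegisterFinalizer);
        // The poll can only be hit once in this model (it is disarmed afterwards).
        bool poll_hit = first_pass && poll_emitted && poll_armed;
        if (!poll_hit || resume == Resume::ContinueAfterPoll) {
            break;
        }
        first_pass = false;
        if (resume == Resume::PlainReturn) {
            bc = Bytecode::Return;  // re-executed without the registration
        }
    }
    return registrations;
}
```

In this model only the SameBytecode case registers twice, and skipping the
poll for the finalizer return (the condition Martin suggested) brings the
count back to one.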

>
> Btw. the compilers generate the poll after popping the frame, so doing it after remove_activation would be closer to that. But I don't want to propose this, either. I share the opinion that this may be messy.

Agree.
Coleen
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com]
> Sent: Dienstag, 7. November 2017 15:54
> To: Doerr, Martin <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
> Cc: Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; David Holmes <david.holmes at oracle.com>
> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>
>
>
> On 11/7/17 9:04 AM, Doerr, Martin wrote:
>> Hi Coleen,
>>
>>> The TemplateTable::_return_register_finalizer bytecode wouldn't get run
>>> twice with deoptimization because
>>> TemplateInterpreter::deopt_reexecute_entry() will reexecute the
>>> _return_register_finalizer as a _return(vtos) bytecode.   Very tricky
>>> indeed
>> Probably not with deoptimization, but what about the following scenario:
>> 1. Java thread executes register_finalizer
>> 2. Safepoint poll in Bytecodes::_return_register_finalizer bytecode gets hit
>> 3. VM reexecutes the bytecode after safepoint (bcp still points to the Bytecodes::_return_register_finalizer bytecode)
>>
>> This doesn't sound good to me.
> The deoptimization only happens when the compiled frame is on the top of
> the stack, not TemplateTable::_return(), i.e. the interpreter, and does
> not deoptimize for the safepoint poll in the compiled frame.
>
> See frame::should_be_deoptimized and can_be_deoptimized (I'm not sure of the
> difference between the two).
>
> If the deopt happens for the return_register_finalizer call, it reruns
> the bytecode in the interpreter as Bytecodes::_return(vtos) rather than
> _return_register_finalizer, so that the registration doesn't happen
> twice.  See TemplateInterpreter::deopt_reexecute_entry().
>
> This is what I've learned in the past couple of days!   It's very
> subtle, and Dean filed an RFE
> https://bugs.openjdk.java.net/browse/JDK-8190817 to hopefully help make
> this clearer.
>
>>    
>>> I still would like to see the safepoint poll in TemplateTable::_return
>>> after the call to return_register_finalizer since that's the order that
>>> the compiled code does it.
>> So that means, you'd expect the poll to be done after remove_activation. I think this may be more tricky to implement because we don't know where we return to (not necessarily an interpreted method). Right?
> Sorry, no, I'd like the poll to be after the return_register_finalizer call
> in TemplateTable::_return.  remove_activation is a messy thing.
>
> Thanks,
> Coleen
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: coleen.phillimore at oracle.com [mailto:coleen.phillimore at oracle.com]
>> Sent: Dienstag, 7. November 2017 14:46
>> To: Doerr, Martin <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev developers <hotspot-dev at openjdk.java.net>
>> Cc: Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; David Holmes <david.holmes at oracle.com>
>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>>
>>
>>
>> On 11/7/17 7:04 AM, Doerr, Martin wrote:
>>> Hi Robbin,
>>>
>>> first of all, sorry that my proposal caused much more work on your side than expected. I still appreciate that you're doing it. Thanks.
>>>
>>> I was not aware of this checking code on x86 in debug build. But it looks feasible to me to move the safepoint check after the last_sp handling.
>>>
>>> However, it looks like register_finalizer could get executed twice with the new implementation. So I'd prefer to generate the safepoint poll code only for the normal return bytecodes:
>>> if (SafepointMechanism::uses_thread_local_poll() && _desc->bytecode() != Bytecodes::_return_register_finalizer)
>>>
>>> Bytecodes::_return_register_finalizer is only used for returns from Object.<init> constructor. I think we can live without safepoint poll there. I don't need to see a new webrev for this.
>> The TemplateTable::_return_register_finalizer bytecode wouldn't get run
>> twice with deoptimization because
>> TemplateInterpreter::deopt_reexecute_entry() will reexecute the
>> _return_register_finalizer as a _return(vtos) bytecode.   Very tricky
>> indeed.
>>
>> Your suggested change would work also because call_VM for
>> return_register_finalizer will do a safepoint check on the JRT_END
>> transition, so there'd be only one check here, which is probably fine.
>> I still would like to see the safepoint poll in TemplateTable::_return
>> after the call to return_register_finalizer since that's the order that
>> the compiled code does it.
>>
>> thanks,
>> Coleen
>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com]
>>> Sent: Dienstag, 7. November 2017 11:59
>>> To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
>>> Cc: Doerr, Martin <martin.doerr at sap.com>; Nils Eliasson <nils.eliasson at oracle.com>; Karen Kinnear <karen.kinnear at oracle.com>; Andrew Haley <aph at redhat.com>; coleen.phillimore at oracle.com; David Holmes <david.holmes at oracle.com>
>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>>>
>>> Hi all,
>>>
>>> First, a bug has been found: when deopt happens in return, the return is
>>> re-executed and we hit an assert in the call_VM because last_sp is now not NULL.
>>> After some discussion, the proposed solution is to move the poll after the
>>> explicit reset of last_sp. (Re-execution is always vtos.) This is fixed in the #15
>>> changeset; #14 is just some copyright year updates.
>>> (The #9 changeset was dropped before it went on RFR, so it's not listed.)
>>>
>>> The JEP will be targeted to JDK 10 on Friday and integration will happen shortly
>>> after. For completeness, all incremental webrevs and a full one (rebased on
>>> jdk/hs) are listed below. I'm adding everyone on CC as reviewers.
>>>
>>> The code will be committed on: "8189941: Implementation JEP 312: Thread-local
>>> handshake"
>>>
>>> Tested tier 1-5, jprt, all tonga.
>>>
>>> Martin, can you have a quick look at the #15 changeset?
>>>
>>> Thanks, Robbin
>>>
>>> SafepointMechanism-0
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/SafepointMechanism-0/webrev/
>>>
>>> PollingPage-1
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/PollingPage-1/webrev/
>>>
>>> Handshakes-2
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Handshakes-2/webrev/
>>>
>>> Atomic-Update-Rebase-3
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Atomic-Update-Rebase-3/webrev/
>>>
>>> Coleen-n-Test-Cleanup-4
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Coleen-n-Test-Cleanup-4/webrev/
>>>
>>> Assorted-Karen-5
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Assorted-Karen-5/webrev/
>>>
>>> Support-Check-Haley-6
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Support-Check-Haley-6/webrev/
>>>
>>> Interpreter-Poll-7
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-7/webrev/
>>>
>>> Interpreter-Poll-Wide_Ret-8
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Wide_Ret-8/webrev/
>>>
>>> Interpreter-Poll-Switch-10
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Switch-10/webrev/
>>>
>>> Interpreter-Poll-Ret-11
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Ret-11/webrev/
>>>
>>> Option-Cleanup-12
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Option-Cleanup-12/webrev/
>>>
>>> DavidH-Option-Cleanup-13
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/DavidH-Option-Cleanup-13/webrev/
>>>
>>> Copyright-Update-14
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Copyright-Update-14/webrev/
>>>
>>> Interpreter-Poll-Ret-Deopt-Fix-15
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Interpreter-Poll-Ret-Deopt-Fix-15/webrev/
>>>
>>> Full
>>> http://cr.openjdk.java.net/~rehn/8185640/v10/Full/webrev/
>>>
>>> On 10/11/2017 03:37 PM, Robbin Ehn wrote:
>>>> Hi all,
>>>>
>>>> Starting the review of the code while JEP work is still not completed.
>>>>
>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640
>>>>
>>>> This JEP introduces a way to execute a callback on threads without performing a
>>>> global VM safepoint. It makes it both possible and cheap to stop individual
>>>> threads and not just all threads or none.
>>>>
>>>> Entire changeset:
>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/
>>>>
>>>> Divided into 3 parts:
>>>> SafepointMechanism abstraction:
>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/
>>>> Consolidating polling page allocation:
>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/
>>>> Handshakes:
>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/
>>>>
>>>> A handshake operation is a callback that is executed for each JavaThread while
>>>> that thread is in a safepoint safe state. The callback is executed either by the
>>>> thread itself or by the VM thread while keeping the thread in a blocked state.
>>>> The big difference between safepointing and handshaking is that the per-thread
>>>> operation will be performed on all threads as soon as possible and each thread will
>>>> continue to execute as soon as its own operation is completed. If a JavaThread
>>>> is known to be running, then a handshake can be performed with that single
>>>> JavaThread as well.
>>>>
>>>> The current safepointing scheme is modified to perform an indirection through a
>>>> per-thread pointer which will allow a single thread's execution to be forced to
>>>> trap on the guard page. In order to force a thread to yield, the VM updates the
>>>> per-thread pointer for the corresponding thread to point to the guarded page.
>>>>
>>>> Examples of potential use cases:
>>>> -Biased lock revocation
>>>> -External requests for stack traces
>>>> -Deoptimization
>>>> -Async exception delivery
>>>> -External suspension
>>>> -Eliding memory barriers
>>>>
>>>> All of these will help the VM become more low-latency friendly by reducing
>>>> the number of global safepoints.
>>>> For platforms that do not yet implement the per-JavaThread poll, a fallback to
>>>> a normal safepoint is in place. HandshakeOneThread will then be a normal
>>>> safepoint. The supported platforms are Linux x64 and Solaris SPARC.
>>>>
>>>> Tested heavily with various test suites and comes with a few new tests.
>>>>
>>>> Performance testing using standardized benchmarks shows no significant changes;
>>>> the latest numbers were -0.7% on Linux x64 and +1.5% on Solaris SPARC (not
>>>> statistically ensured). A minor regression for the load vs load load on x64 is
>>>> expected and a slight increase on SPARC due to the cost of ‘materializing’ the
>>>> page vs load load.
>>>> The time to trigger a safepoint was measured on a large machine and found not to be an
>>>> issue. The looping over threads and arming the polling page will benefit from
>>>> the work on JavaThread life-cycle (8167108 - SMR and JavaThread Lifecycle:
>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html)
>>>> which puts all JavaThreads in an array instead of a linked list.
>>>>
>>>> Thanks, Robbin
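
The per-thread indirection described in the original RFR above (load a
per-thread pointer, then load through it, i.e. "load load") can be pictured
with a small toy model. This is a sketch under stated assumptions, not the
HotSpot implementation: the names (toy::Thread, arm, poll, kArmedPage,
kDisarmedPage) are invented here, and a boolean flag stands in for the
guarded page, so the second load returns true instead of trapping.

```cpp
// Toy model, not HotSpot code. All names here are hypothetical.
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

namespace toy {

// Two flags stand in for the two polling pages. In the real scheme the armed
// page is guarded and the load traps; here the load just returns true.
static const bool kArmedPage = true;
static const bool kDisarmedPage = false;

struct Thread {
    std::atomic<const bool*> poll_page{&kDisarmedPage};  // per-thread pointer
    std::function<void(Thread&)> pending;                // handshake callback
    int callbacks_run = 0;
};

// Requester side: publish the callback, then point this one thread at the
// armed page. Other threads are left untouched.
void arm(Thread& t, std::function<void(Thread&)> op) {
    t.pending = std::move(op);
    t.poll_page.store(&kArmedPage, std::memory_order_release);
}

// Thread side, at a poll site: load the per-thread pointer, then load through
// it. Returns true if a callback was run.
bool poll(Thread& t) {
    const bool* page = t.poll_page.load(std::memory_order_acquire);
    if (!*page) {
        return false;  // disarmed: keep running
    }
    t.pending(t);      // the thread runs its own operation
    ++t.callbacks_run;
    t.poll_page.store(&kDisarmedPage, std::memory_order_release);
    return true;       // and continues as soon as it is done
}

}  // namespace toy
```

Arming one toy::Thread and polling two shows the point of the design: only
the armed thread runs the callback, and it resumes right after its own
operation without waiting for anyone else.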