RFR(XL): 8185640: Thread-local handshakes
coleen.phillimore at oracle.com
Wed Oct 25 13:19:30 UTC 2017
Hi Robbin,
This change (with the addition of the poll at wide_ret) looks good. It
came out nicely in the code.
thanks,
Coleen
On 10/24/17 10:54 AM, Robbin Ehn wrote:
> Hi,
>
> I did a fix for the interpreter performance regression. It's plain and
> simple: I kept the polling code inside dispatch_base but made it
> optional, like the verify oop.
>
> Incremental:
> http://cr.openjdk.java.net/~rehn/8185640/v5/Interpreter-Poll-7/webrev/index.html
>
>
> Manually tested with jstack, and it passes hotspot_tier1 and hotspot_handshake.
>
> It reduces the polling cost by 80%; sensitive benchmarks show a -0.44%
> regression vs. TLH off. Less sensitive benchmarks show no regression.
>
> Thanks, Robbin
>
> On 2017-10-23 17:58, Karen Kinnear wrote:
>> Works for me
>>
>> Thanks,
>> Karen
>>
>>> On Oct 23, 2017, at 8:40 AM, Doerr, Martin <martin.doerr at sap.com>
>>> wrote:
>>>
>>> Hi Coleen and Robbin,
>>>
>>> I'm ok with putting it into a separate RFE. I understand that there
>>> are more fun activities than rebasing this XL change for a long time
>>> :-)
>>> So you don't need to delay it. It's acceptable for me.
>>>
>>> Thanks, Coleen, for sharing your proposal. I appreciate it.
>>>
>>> Best regards,
>>> Martin
>>>
>>>
>>> -----Original Message-----
>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com]
>>> Sent: Montag, 23. Oktober 2017 17:26
>>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers
>>> <hotspot-dev at openjdk.java.net>
>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>>>
>>> Hi Martin,
>>>
>>>> On 2017-10-18 16:05, Doerr, Martin wrote:
>>>> Hi Robbin,
>>>>
>>>> thanks for the quick reply and for doing additional benchmarks.
>>>> Please note that t->does_dispatch() was just a first idea, but
>>>> doesn't really fit for the purpose because it's false for
>>>> conditional branch bytecodes for example. I just didn't find an
>>>> appropriate quick check in the existing code.
>>>> I guess you will notice a performance impact when benchmarking with
>>>> -Xint. (I don't know if Oracle usually runs startup performance
>>>> benchmarks.)
>>>
>>> Yes, we are seeing a performance regression of 2.5%-6%, depending on
>>> the benchmark.
>>> We are committed to fixing this, but the fix might come as a separate
>>> RFE/bug, depending on the JEP's timeline.
>>>
>>> (If, very unlikely, the fix is not done before the next release, we
>>> would change the default to off.)
>>>
>>> I hope this is an acceptable path?
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com]
>>>> Sent: Mittwoch, 18. Oktober 2017 15:58
>>>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers
>>>> <hotspot-dev at openjdk.java.net>
>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>>>>
>>>> Hi Martin,
>>>>
>>>>> On 2017-10-18 12:11, Doerr, Martin wrote:
>>>>> Hi Robbin,
>>>>>
>>>>> so you would like to push your version first (as it does not break
>>>>> other platforms) and then help us to push non-Oracle platform
>>>>> implementations which change shared code again?
>>>>> I'd be fine with that, too.
>>>>
>>>> Yes, great!
>>>>
>>>>>
>>>>> While thinking a little longer about the interpreter
>>>>> implementation, a new idea came into my mind.
>>>>> I think we could significantly reduce impact on interpreter code
>>>>> size and performance by using safepoint polls only in a subset of
>>>>> bytecodes. E.g., we could use only bytecodes which perform any
>>>>> kind of jump by implementing something like
>>>>> if (SafepointMechanism::uses_thread_local_poll() &&
>>>>> t->does_dispatch()) generate_safepoint_poll();
>>>>> in TemplateInterpreterGenerator::generate_and_dispatch.
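A minimal sketch of the idea in the quoted paragraph above, not taken from the webrev: emit a poll only for templates that can jump. `SketchTemplate`, `may_branch` and `generate` are hypothetical names used for illustration; `may_branch` stands in for whatever check ends up replacing `t->does_dispatch()`, and the "code" is just a list of labels rather than real interpreter assembly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an interpreter template (one per bytecode).
struct SketchTemplate {
    std::string name;
    bool may_branch;  // assumed flag: true for bytecodes that perform any kind of jump
};

// Builds the sequence of steps emitted for one template. The poll is emitted
// only when thread-local polling is enabled AND the bytecode can jump.
inline std::vector<std::string> generate(const SketchTemplate& t,
                                         bool uses_thread_local_poll) {
    assert(!t.name.empty());
    std::vector<std::string> code;
    code.push_back("body:" + t.name);
    if (uses_thread_local_poll && t.may_branch) {
        code.push_back("safepoint_poll");
    }
    code.push_back("dispatch_next");
    return code;
}
```

With this shape, straight-line bytecodes pay nothing for the poll, while any loop still reaches a poll through its branching bytecode.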
>>>>
>>>> We have not seen any performance regression in simple benchmarks
>>>> with this.
>>>> I will do a better benchmark and compare what difference it makes.
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Best regards,
>>>>> Martin
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Robbin Ehn [mailto:robbin.ehn at oracle.com]
>>>>> Sent: Mittwoch, 18. Oktober 2017 11:07
>>>>> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-dev developers
>>>>> <hotspot-dev at openjdk.java.net>
>>>>> Subject: Re: RFR(XL): 8185640: Thread-local handshakes
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>>> On 2017-10-17 19:58, Doerr, Martin wrote:
>>>>>> Hi Robbin,
>>>>>>
>>>>>> my first impression is very good. Thanks for providing the webrev.
>>>>>
>>>>> Great!
>>>>>
>>>>>>
>>>>>> The only thing I don't like is that "poll_page_val | poll_bit()" is
>>>>>> used in shared code. I'd prefer to use either one or the other mechanism.
>>>>>> Would it be ok to move the decision between what to use to
>>>>>> platform code?
>>>>>> (Some platforms could still use both if this is beneficial.)
>>>>>>
>>>>>> E.g. on PPC64, we'd like to use conditional trap instructions
>>>>>> with special bit patterns if UseSIGTRAP is on. Would be excellent
>>>>>> if we could implement set functions for _poll_armed_value and
>>>>>> _poll_disarmed_value in platform code. poll_bit() also fits
>>>>>> better into platform code in my opinion.
>>>>>
>>>>> I see no issue with this.
>>>>> Maybe SafepointMechanism::local_poll_armed should be possibly
>>>>> platform specific.
>>>>> Can we do this incremental when adding the platform support for
>>>>> PPC64?
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>>>
>>>>>> Best regards,
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: hotspot-dev [mailto:hotspot-dev-bounces at openjdk.java.net]
>>>>>> On Behalf Of Robbin Ehn
>>>>>> Sent: Mittwoch, 11. Oktober 2017 15:38
>>>>>> To: hotspot-dev developers <hotspot-dev at openjdk.java.net>
>>>>>> Subject: RFR(XL): 8185640: Thread-local handshakes
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Starting the review of the code while JEP work is still not
>>>>>> completed.
>>>>>>
>>>>>> JEP: https://bugs.openjdk.java.net/browse/JDK-8185640
>>>>>>
>>>>>> This JEP introduces a way to execute a callback on threads
>>>>>> without performing a global VM safepoint. It makes it both
>>>>>> possible and cheap to stop individual threads and not
>>>>>> just all threads or none.
>>>>>>
>>>>>> Entire changeset:
>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/flat/
>>>>>>
>>>>>> Divided into three parts:
>>>>>> SafepointMechanism abstraction:
>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/SafepointMechanism-0/
>>>>>> Consolidating polling page allocation:
>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/PollingPage-1/
>>>>>> Handshakes:
>>>>>> http://cr.openjdk.java.net/~rehn/8185640/v0/Handshakes-2/
>>>>>>
>>>>>> A handshake operation is a callback that is executed for each
>>>>>> JavaThread while that thread is in a safepoint-safe state. The
>>>>>> callback is executed either by the thread itself or by the VM thread
>>>>>> while keeping the thread in a blocked state. The big difference
>>>>>> between safepointing and handshaking is that the per-thread operation
>>>>>> will be performed on all threads as soon as possible, and each thread
>>>>>> will continue to execute as soon as its own operation is completed.
>>>>>> If a JavaThread is known to be running, then a handshake can be
>>>>>> performed with that single JavaThread as well.
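A minimal single-threaded sketch of the handshake rule described above, not taken from the webrev: a blocked thread has the callback run on its behalf by the requester (standing in for the VM thread), while a running thread picks it up at its next poll. `SketchJavaThread`, `State` and `handshake_all` are hypothetical names, and real threads, synchronization and state transitions are deliberately left out.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

enum class State { Running, Blocked };

// Hypothetical stand-in for a JavaThread.
struct SketchJavaThread {
    int id;
    State state;
    std::function<void(SketchJavaThread&)> pending;  // empty when nothing is requested
    bool executed_by_self = false;
    bool executed_by_vm = false;

    SketchJavaThread(int i, State s) : id(i), state(s) {}

    // Called by the thread itself at a poll: run the pending operation, if any,
    // then carry on immediately without waiting for other threads.
    void poll() {
        if (pending) {
            auto op = std::move(pending);
            pending = nullptr;  // a moved-from std::function is unspecified, so reset it
            op(*this);
            executed_by_self = true;
        }
    }
};

// Requests the operation on every thread. Blocked threads are already in a
// safe state, so the requester runs the callback for them; running threads
// get it installed and execute it themselves at their next poll.
inline void handshake_all(std::vector<SketchJavaThread>& threads,
                          std::function<void(SketchJavaThread&)> op) {
    assert(op);  // the callback must be callable
    for (auto& t : threads) {
        if (t.state == State::Blocked) {
            op(t);
            t.executed_by_vm = true;
        } else {
            t.pending = op;
        }
    }
}
```

Unlike a global safepoint, nothing here waits for all threads to stop: each thread resumes as soon as its own callback has run.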
>>>>>>
>>>>>> The current safepointing scheme is modified to perform an
>>>>>> indirection through a per-thread pointer, which allows a single
>>>>>> thread's execution to be forced to trap on the guard page. To force
>>>>>> a thread to yield, the VM updates the per-thread pointer for the
>>>>>> corresponding thread to point to the guarded page.
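A minimal sketch of the per-thread indirection described above, not taken from the webrev. All names (`PollPage`, `SketchThread`, `arm`, `disarm`) are hypothetical. In the real VM the second load traps on a guard page; here the "page" simply carries a flag, so the sketch only models which thread is affected, not the trap itself.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a polling page. The real armed page is memory
// that traps when read; this one just records whether it is the armed one.
struct PollPage { bool armed; };

static PollPage g_armed_page{true};      // stands in for the guarded page
static PollPage g_disarmed_page{false};  // stands in for the readable page

struct SketchThread {
    // Per-thread pointer that every poll reads first (the extra load).
    std::atomic<PollPage*> poll_page{&g_disarmed_page};
    int yields = 0;

    // A poll is two loads: the per-thread pointer, then the page it points to.
    bool poll() {
        PollPage* page = poll_page.load(std::memory_order_acquire);
        assert(page != nullptr);
        if (page->armed) {
            ++yields;  // where the real thread would trap and yield
            return true;
        }
        return false;
    }
};

// Arming touches one thread's pointer only; all other threads keep polling
// the disarmed page and are not disturbed.
inline void arm(SketchThread& t) {
    t.poll_page.store(&g_armed_page, std::memory_order_release);
}

inline void disarm(SketchThread& t) {
    t.poll_page.store(&g_disarmed_page, std::memory_order_release);
}
```

The extra pointer load is the price of the indirection, which is where the "load vs load load" cost mentioned in the performance notes comes from.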
>>>>>>
>>>>>> Examples of potential use cases:
>>>>>> -Biased lock revocation
>>>>>> -External requests for stack traces
>>>>>> -Deoptimization
>>>>>> -Async exception delivery
>>>>>> -External suspension
>>>>>> -Eliding memory barriers
>>>>>>
>>>>>> All of these will help the VM become more low-latency friendly by
>>>>>> reducing the number of global safepoints.
>>>>>> For platforms that do not yet implement the per-JavaThread poll, a
>>>>>> fallback to a normal safepoint is in place: HandshakeOneThread will
>>>>>> then be a normal safepoint. The supported platforms are Linux x64
>>>>>> and Solaris SPARC.
>>>>>>
>>>>>> Tested heavily with various test suites and comes with a few new
>>>>>> tests.
>>>>>>
>>>>>> Performance testing using standardized benchmarks shows no
>>>>>> significant changes; the latest numbers were -0.7% on Linux x64
>>>>>> and +1.5% on Solaris SPARC (not statistically ensured). A minor
>>>>>> regression for the load vs. load-load on x64 is expected, and a
>>>>>> slight increase on SPARC due to the cost of ‘materializing’ the
>>>>>> page vs. load-load.
>>>>>> The time to trigger a safepoint was measured on a large machine and
>>>>>> found not to be an issue. The looping over threads and arming the
>>>>>> polling page will benefit from the work on the JavaThread life-cycle
>>>>>> (8167108 - SMR and JavaThread Lifecycle:
>>>>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024773.html),
>>>>>> which puts all JavaThreads in an array instead of a linked list.
>>>>>>
>>>>>> Thanks, Robbin
>>>>>>
>>