RFR 8194085: Obsolete the deprecated SafepointSynchronize flags and remove related code
Daniel D. Daugherty
daniel.daugherty at oracle.com
Mon Feb 26 17:13:27 UTC 2018
On 2/26/18 12:54 AM, David Holmes wrote:
> On 24/02/2018 6:00 AM, Daniel D. Daugherty wrote:
>> On 2/23/18 10:41 AM, harold seigel wrote:
>>>
>>> Thanks Coleen, Dan, and David for your comments!
>>>
>>> Please review this updated webrev for bug JDK-8194085.
>>>
>>> http://cr.openjdk.java.net/~hseigel/bug_8194085.2/webrev/index.html
>>>
>>
>> src/hotspot/share/runtime/arguments.cpp
>> No comments.
>>
>> src/hotspot/share/runtime/globals.hpp
>> No comments.
>>
>> src/hotspot/share/runtime/safepoint.cpp
>> L312: if (int(iterations) == -1) { // overflow -
>> something is wrong.
>> L313: guarantee (PageArmed == 0, "invariant"); //
>> historical form
>> L314: }
>> I now see what David is talking about w.r.t. restoring
>> this much of the original code. I have a slightly
>> different recommendation for commenting:
>>
>> if (int(iterations) == -1) {
>> // We can only overflow here when we are not using global
>> // polling pages. We keep this guarantee in its original
>
> Surely you mean "when we ARE using global polling pages"? Thisd
> pre-dates per-thread safepoints/handshakes.
Sigh... yes, my new comment is in error and I'm getting frustrated
with how to rewrite it properly.
The original code had this block:
L190: if (SafepointMechanism::uses_global_page_poll()) {
<snip>
L194: if (DeferPollingPageLoopCount < 0) {
L195: // Make polling safepoint aware
L196: guarantee (PageArmed == 0, "invariant") ;
L197: PageArmed = 1 ;
L198: os::make_polling_page_unreadable();
L199: }
L200: }
which would arm the polling page when DeferPollingPageLoopCount
was less than zero (the default is -1).
The original block in the loop looked like this:
L312: if (SafepointMechanism::uses_global_page_poll() &&
int(iterations) == DeferPollingPageLoopCount) {
L313: guarantee (PageArmed == 0, "invariant") ;
L314: PageArmed = 1 ;
L315: os::make_polling_page_unreadable();
L316: }
which would enter the block when iterations hit DeferPollingPageLoopCount.
If DeferPollingPageLoopCount == -1 (the default), that meant that
iterations overflowed. It also meant that the polling page was
already armed (because of L194-198) and we would fail the guarantee
at L313.
So when DeferPollingPageLoopCount == -1 and iterations overflowed,
that meant we were in the loop for a very long time and the
guarantee failure is the diagnostic that told folks that we were
stuck. It's that diagnostic that David H. wants to preserve.
In the new code, this if-statement:
L194: if (DeferPollingPageLoopCount < 0) {
is gone so we're always arming the polling page when
SafepointMechanism::uses_global_page_poll() == true so
we don't need L314 or L315.
The if-statement in the loop has been rewritten to this:
L312: if (int(iterations) == -1) { // overflow - something is
wrong.
L313: guarantee (PageArmed == 0, "invariant"); //
historical form
L314: }
so we're catching overflow regardless of the value of
SafepointMechanism::uses_global_page_poll(). I previously
convinced myself that iterations could only overflow when
SafepointMechanism::uses_global_page_poll() == true so
I proposed a new comment (with the typo fixed):
// We can only overflow here when we are using global
// polling pages. We keep this guarantee in its original
// form so that searches of the bug database for this
// failure mode find the right bugs.
I'm no longer convinced that comment is entirely correct! (Sorry)
Before the fix, we only got to this guarantee when global polling
pages were in use:
L313: guarantee (PageArmed == 0, "invariant") ;
so we only detected overflow when global polling pages were in use.
That's not the same thing as saying that overflow can only happen
when we are using global polling pages.
A slightly better comment is this:
// We only fail on overflow here when we are using
global
// polling pages. We keep this guarantee in its original
// form so that searches of the bug database for this
// failure mode find the right bugs.
So that raises two questions:
1) is it possible to overflow iterations when not using global polling
pages?
2) should we fail a guarantee if that happens?
I would be fine with Harold using the slightly better comment above and
pushing his fix. After all, his fix is about removing options and not
about detangling the mess that is safepointing... :-)
Dan
>
> Thanks,
> David
>
>> // form so that searches of the bug database for this
>> // failure mode find the right bugs.
>> guarantee(PageArmed == 0, "invariant");
>> }
>>
>> src/hotspot/share/runtime/safepoint.hpp
>> No comments.
>>
>> Thumbs up!
>>
>> Dan
>>
>>
>>> This updated webrev contains the changes suggested by Coleen, Dan,
>>> and David.
>>>
>>> Thanks, Harold
>>>
>>> On 2/22/2018 9:16 PM, David Holmes wrote:
>>>> I should clarify expectation here. What I want to see is something
>>>> like this:
>>>>
>>>> 310 // to drive subsequent
>>>> spin/SwitchThreadTo()/Sleep(N) decisions.
>>>> 311
>>>> 312 if (int(iterations) == -1) { // overflow - something
>>>> is wrong
>>>> 313 guarantee (PageArmed == 0, "invariant") ; //
>>>> historical form
>>>> 314 }
>>>>
>>>> I recommend keeping the same guarantee condition so that it can be
>>>> easily searched for in bug reports - even though it seems a little
>>>> out of place given we don't arm the page here.
>>>>
>>>> David
>>>>
>>>> On 23/02/2018 11:53 AM, David Holmes wrote:
>>>>> On 23/02/2018 12:47 AM, Daniel D. Daugherty wrote:
>>>>>> On 2/21/18 11:49 PM, David Holmes wrote:
>>>>>>> Hi Harold,
>>>>>>>
>>>>>>> On 22/02/2018 4:31 AM, harold seigel wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Please review this JDK-11 change to obsolete the
>>>>>>>> SafepointSpinBeforeYield, DeferThrSuspendLoopCount, and
>>>>>>>> DeferPollingPageLoopCount options and related code. With this
>>>>>>>> change, these options are still accepted on the command line
>>>>>>>> but have no affect other than to generate these warning messages:
>>>>>>>>
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>> SafepointSpinBeforeYield; support was removed in 11.0
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>> DeferThrSuspendLoopCount; support was removed in 11.0
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>> DeferPollingPageLoopCount; support was removed in 11.0
>>>>>>>
>>>>>>> Just for the record these were once useful tuning values for
>>>>>>> investigating safepoint performance issues in the field.
>>>>>>>
>>>>>>>> Open Webrev:
>>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8194085/webrev/index.html
>>>>>>>
>>>>>>> This one is a bit messier than I had envisaged: -Xconcurrentio
>>>>>>> ?? Wow! :) Definitely overlooked that one. It should be
>>>>>>> deprecated in 11 for certain.
>>>>>>>
>>>>>>> I have one problem with a change in safepoint.cpp, the removal
>>>>>>> of this:
>>>>>>>
>>>>>>> 312 if (SafepointMechanism::uses_global_page_poll()
>>>>>>> && int(iterations) == DeferPollingPageLoopCount) {
>>>>>>> 313 guarantee (PageArmed == 0, "invariant") ;
>>>>>>> 314 PageArmed = 1 ;
>>>>>>> 315 os::make_polling_page_unreadable();
>>>>>>> 316 }
>>>>>>>
>>>>>>> This block triggers a guarantee failure if we fail to reach a
>>>>>>> safepoint and the number of iterations overflows to reach -1
>>>>>>> (the value of DeferPollingPageLoopCount). That is the only thing
>>>>>>> that stops the VM from just hanging when we have an issue
>>>>>>> reaching safepoints (like all of the "counted loop" problems!)
>>>>>>> and is very valuable in clearly identifying the problem. So I
>>>>>>> strongly recommend that this is restored with an explicit check
>>>>>>> against -1.
>>>>>>
>>>>>> Hmmmm... I guess I'm not seeing why the changes here don't
>>>>>> solve that problem:
>>>>>>
>>>>>>> 190 if (SafepointMechanism::uses_global_page_poll()) {
>>>>>>> 191 // Make interpreter safepoint aware
>>>>>>> 192 Interpreter::notice_safepoints();
>>>>>>> 193
>>>>>>> 194 if (DeferPollingPageLoopCount < 0) {
>>>>>>> 195 // Make polling safepoint aware
>>>>>>> 196 guarantee (PageArmed == 0, "invariant") ;
>>>>>>> 197 PageArmed = 1 ;
>>>>>>> 198 os::make_polling_page_unreadable();
>>>>>>> 199 }
>>>>>>> 200 }
>>>>>>
>>>>>> So L194 and L200 are deleted and the same block is always
>>>>>> run when SafepointMechanism::uses_global_page_poll() is true
>>>>>> and this happens before the "L221: while(still_running > 0) {"
>>>>>>
>>>>>> Did I miss something?
>>>>>
>>>>> That block is executed once.
>>>>>
>>>>> The deleted block is executed in the loop while trying to reach
>>>>> the safepoint and so detects excessive looping because we can't
>>>>> get to the safepoint.
>>>>>
>>>>> David
>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>>
>>>>>>> src/hotspot/share/runtime/safepoint.hpp
>>>>>>>
>>>>>>> + static void set_throughput_mode() {
>>>>>>> + _defer_thr_suspend_loop_count = 1;
>>>>>>> + }
>>>>>>>
>>>>>>> First can you add a comment before this that it is only used for
>>>>>>> -Xconcurrentio support.
>>>>>>>
>>>>>>> Second, the name seems odd - throughput mode? How about
>>>>>>> set_concurrentio_mode() ? Or even just
>>>>>>> set_defer_thr_suspend_loop_count() ?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8194085
>>>>>>>>
>>>>>>>> The change was tested with Mach5 tiers 1 and 2 on all Mach5
>>>>>>>> platforms and tiers 3-5 in Linux-X64.
>>>>>>>>
>>>>>>>> Thanks, Harold
>>>>>>
>>>
>>
More information about the hotspot-runtime-dev
mailing list