RFR 8194085: Obsolete the deprecated SafepointSynchronize flags and remove related code
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Feb 27 13:59:37 UTC 2018
Thanks for filing the followup bug...
Dan
On 2/26/18 11:03 PM, David Holmes wrote:
> Good questions Dan! I've filed:
>
> https://bugs.openjdk.java.net/browse/JDK-8198730
>
> To follow up on this - and apologies that I dropped the check for the
> global safepoint when proposing this.
>
> Thanks,
> David
>
> On 27/02/2018 3:13 AM, Daniel D. Daugherty wrote:
>> On 2/26/18 12:54 AM, David Holmes wrote:
>>> On 24/02/2018 6:00 AM, Daniel D. Daugherty wrote:
>>>> On 2/23/18 10:41 AM, harold seigel wrote:
>>>>>
>>>>> Thanks Coleen, Dan, and David for your comments!
>>>>>
>>>>> Please review this updated webrev for bug JDK-8194085.
>>>>>
>>>>> http://cr.openjdk.java.net/~hseigel/bug_8194085.2/webrev/index.html
>>>>>
>>>>
>>>> src/hotspot/share/runtime/arguments.cpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/runtime/globals.hpp
>>>> No comments.
>>>>
>>>> src/hotspot/share/runtime/safepoint.cpp
>>>> L312: if (int(iterations) == -1) { // overflow -
>>>> something is wrong.
>>>> L313: guarantee (PageArmed == 0, "invariant"); //
>>>> historical form
>>>> L314: }
>>>> I now see what David is talking about w.r.t. restoring
>>>> this much of the original code. I have a slightly
>>>> different recommendation for commenting:
>>>>
>>>> if (int(iterations) == -1) {
>>>> // We can only overflow here when we are not using global
>>>> // polling pages. We keep this guarantee in its original
>>>
>>> Surely you mean "when we ARE using global polling pages"? Thisd
>>> pre-dates per-thread safepoints/handshakes.
>>
>> Sigh... yes, my new comment is in error and I'm getting frustrated
>> with how to rewrite it properly.
>>
>> The original code had this block:
>>
>> L190: if (SafepointMechanism::uses_global_page_poll()) {
>> <snip>
>> L194: if (DeferPollingPageLoopCount < 0) {
>> L195: // Make polling safepoint aware
>> L196: guarantee (PageArmed == 0, "invariant") ;
>> L197: PageArmed = 1 ;
>> L198: os::make_polling_page_unreadable();
>> L199: }
>> L200: }
>>
>> which would arm the polling page when DeferPollingPageLoopCount
>> was less than zero (the default is -1).
>>
>> The original block in the loop looked like this:
>>
>> L312: if (SafepointMechanism::uses_global_page_poll() &&
>> int(iterations) == DeferPollingPageLoopCount) {
>> L313: guarantee (PageArmed == 0, "invariant") ;
>> L314: PageArmed = 1 ;
>> L315: os::make_polling_page_unreadable();
>> L316: }
>>
>> which would enter the block when iterations hit
>> DeferPollingPageLoopCount.
>> If DeferPollingPageLoopCount == -1 (the default), that meant that
>> iterations overflowed. It also meant that the polling page was
>> already armed (because of L194-198) and we would fail the guarantee
>> at L313.
>>
>> So when DeferPollingPageLoopCount == -1 and iterations overflowed,
>> that meant we were in the loop for a very long time and the
>> guarantee failure is the diagnostic that told folks that we were
>> stuck. It's that diagnostic that David H. wants to preserve.
>>
>>
>> In the new code, this if-statement:
>>
>> L194: if (DeferPollingPageLoopCount < 0) {
>>
>> is gone so we're always arming the polling page when
>> SafepointMechanism::uses_global_page_poll() == true so
>> we don't need L314 or L315.
>>
>> The if-statement in the loop has been rewritten to this:
>>
>> L312: if (int(iterations) == -1) { // overflow - something
>> is wrong.
>> L313: guarantee (PageArmed == 0, "invariant"); //
>> historical form
>> L314: }
>>
>> so we're catching overflow regardless of the value of
>> SafepointMechanism::uses_global_page_poll(). I previously
>> convinced myself that iterations could only overflow when
>> SafepointMechanism::uses_global_page_poll() == true so
>> I proposed a new comment (with the typo fixed):
>>
>> // We can only overflow here when we are using
>> global
>> // polling pages. We keep this guarantee in its
>> original
>> // form so that searches of the bug database for
>> this
>> // failure mode find the right bugs.
>>
>> I'm no longer convinced that comment is entirely correct! (Sorry)
>>
>> Before the fix, we only got to this guarantee when global polling
>> pages were in use:
>>
>> L313: guarantee (PageArmed == 0, "invariant") ;
>>
>> so we only detected overflow when global polling pages were in use.
>> That's not the same thing as saying that overflow can only happen
>> when we are using global polling pages.
>>
>> A slightly better comment is this:
>>
>> // We only fail on overflow here when we are
>> using global
>> // polling pages. We keep this guarantee in its
>> original
>> // form so that searches of the bug database for
>> this
>> // failure mode find the right bugs.
>>
>> So that raises two questions:
>>
>> 1) is it possible to overflow iterations when not using global
>> polling pages?
>> 2) should we fail a guarantee if that happens?
>>
>> I would be fine with Harold using the slightly better comment above and
>> pushing his fix. After all, his fix is about removing options and not
>> about detangling the mess that is safepointing... :-)
>>
>> Dan
>>
>>>
>>> Thanks,
>>> David
>>>
>>>> // form so that searches of the bug database for this
>>>> // failure mode find the right bugs.
>>>> guarantee(PageArmed == 0, "invariant");
>>>> }
>>>>
>>>> src/hotspot/share/runtime/safepoint.hpp
>>>> No comments.
>>>>
>>>> Thumbs up!
>>>>
>>>> Dan
>>>>
>>>>
>>>>> This updated webrev contains the changes suggested by Coleen, Dan,
>>>>> and David.
>>>>>
>>>>> Thanks, Harold
>>>>>
>>>>> On 2/22/2018 9:16 PM, David Holmes wrote:
>>>>>> I should clarify expectation here. What I want to see is
>>>>>> something like this:
>>>>>>
>>>>>> 310 // to drive subsequent
>>>>>> spin/SwitchThreadTo()/Sleep(N) decisions.
>>>>>> 311
>>>>>> 312 if (int(iterations) == -1) { // overflow -
>>>>>> something is wrong
>>>>>> 313 guarantee (PageArmed == 0, "invariant") ; //
>>>>>> historical form
>>>>>> 314 }
>>>>>>
>>>>>> I recommend keeping the same guarantee condition so that it can
>>>>>> be easily searched for in bug reports - even though it seems a
>>>>>> little out of place given we don't arm the page here.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 23/02/2018 11:53 AM, David Holmes wrote:
>>>>>>> On 23/02/2018 12:47 AM, Daniel D. Daugherty wrote:
>>>>>>>> On 2/21/18 11:49 PM, David Holmes wrote:
>>>>>>>>> Hi Harold,
>>>>>>>>>
>>>>>>>>> On 22/02/2018 4:31 AM, harold seigel wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please review this JDK-11 change to obsolete the
>>>>>>>>>> SafepointSpinBeforeYield, DeferThrSuspendLoopCount, and
>>>>>>>>>> DeferPollingPageLoopCount options and related code. With this
>>>>>>>>>> change, these options are still accepted on the command line
>>>>>>>>>> but have no affect other than to generate these warning
>>>>>>>>>> messages:
>>>>>>>>>>
>>>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>> SafepointSpinBeforeYield; support was removed in 11.0
>>>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>> DeferThrSuspendLoopCount; support was removed in 11.0
>>>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>> DeferPollingPageLoopCount; support was removed in 11.0
>>>>>>>>>
>>>>>>>>> Just for the record these were once useful tuning values for
>>>>>>>>> investigating safepoint performance issues in the field.
>>>>>>>>>
>>>>>>>>>> Open Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8194085/webrev/index.html
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This one is a bit messier than I had envisaged: -Xconcurrentio
>>>>>>>>> ?? Wow! :) Definitely overlooked that one. It should be
>>>>>>>>> deprecated in 11 for certain.
>>>>>>>>>
>>>>>>>>> I have one problem with a change in safepoint.cpp, the removal
>>>>>>>>> of this:
>>>>>>>>>
>>>>>>>>> 312 if (SafepointMechanism::uses_global_page_poll()
>>>>>>>>> && int(iterations) == DeferPollingPageLoopCount) {
>>>>>>>>> 313 guarantee (PageArmed == 0, "invariant") ;
>>>>>>>>> 314 PageArmed = 1 ;
>>>>>>>>> 315 os::make_polling_page_unreadable();
>>>>>>>>> 316 }
>>>>>>>>>
>>>>>>>>> This block triggers a guarantee failure if we fail to reach a
>>>>>>>>> safepoint and the number of iterations overflows to reach -1
>>>>>>>>> (the value of DeferPollingPageLoopCount). That is the only
>>>>>>>>> thing that stops the VM from just hanging when we have an
>>>>>>>>> issue reaching safepoints (like all of the "counted loop"
>>>>>>>>> problems!) and is very valuable in clearly identifying the
>>>>>>>>> problem. So I strongly recommend that this is restored with an
>>>>>>>>> explicit check against -1.
>>>>>>>>
>>>>>>>> Hmmmm... I guess I'm not seeing why the changes here don't
>>>>>>>> solve that problem:
>>>>>>>>
>>>>>>>>> 190 if (SafepointMechanism::uses_global_page_poll()) {
>>>>>>>>> 191 // Make interpreter safepoint aware
>>>>>>>>> 192 Interpreter::notice_safepoints();
>>>>>>>>> 193
>>>>>>>>> 194 if (DeferPollingPageLoopCount < 0) {
>>>>>>>>> 195 // Make polling safepoint aware
>>>>>>>>> 196 guarantee (PageArmed == 0, "invariant") ;
>>>>>>>>> 197 PageArmed = 1 ;
>>>>>>>>> 198 os::make_polling_page_unreadable();
>>>>>>>>> 199 }
>>>>>>>>> 200 }
>>>>>>>>
>>>>>>>> So L194 and L200 are deleted and the same block is always
>>>>>>>> run when SafepointMechanism::uses_global_page_poll() is true
>>>>>>>> and this happens before the "L221: while(still_running > 0) {"
>>>>>>>>
>>>>>>>> Did I miss something?
>>>>>>>
>>>>>>> That block is executed once.
>>>>>>>
>>>>>>> The deleted block is executed in the loop while trying to reach
>>>>>>> the safepoint and so detects excessive looping because we can't
>>>>>>> get to the safepoint.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---
>>>>>>>>>
>>>>>>>>> src/hotspot/share/runtime/safepoint.hpp
>>>>>>>>>
>>>>>>>>> + static void set_throughput_mode() {
>>>>>>>>> + _defer_thr_suspend_loop_count = 1;
>>>>>>>>> + }
>>>>>>>>>
>>>>>>>>> First can you add a comment before this that it is only used
>>>>>>>>> for -Xconcurrentio support.
>>>>>>>>>
>>>>>>>>> Second, the name seems odd - throughput mode? How about
>>>>>>>>> set_concurrentio_mode() ? Or even just
>>>>>>>>> set_defer_thr_suspend_loop_count() ?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>> -----
>>>>>>>>>
>>>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8194085
>>>>>>>>>>
>>>>>>>>>> The change was tested with Mach5 tiers 1 and 2 on all Mach5
>>>>>>>>>> platforms and tiers 3-5 in Linux-X64.
>>>>>>>>>>
>>>>>>>>>> Thanks, Harold
>>>>>>>>
>>>>>
>>>>
>>
More information about the hotspot-runtime-dev
mailing list