RFR 8194085: Obsolete the deprecated SafepointSynchronize flags and remove related code

David Holmes david.holmes at oracle.com
Tue Feb 27 04:03:02 UTC 2018


Good questions Dan! I've filed:

https://bugs.openjdk.java.net/browse/JDK-8198730

To follow up on this - and apologies that I dropped the check for the 
global safepoint when proposing this.

Thanks,
David

On 27/02/2018 3:13 AM, Daniel D. Daugherty wrote:
> On 2/26/18 12:54 AM, David Holmes wrote:
>> On 24/02/2018 6:00 AM, Daniel D. Daugherty wrote:
>>> On 2/23/18 10:41 AM, harold seigel wrote:
>>>>
>>>> Thanks Coleen, Dan, and David for your comments!
>>>>
>>>> Please review this updated webrev for bug JDK-8194085.
>>>>
>>>> http://cr.openjdk.java.net/~hseigel/bug_8194085.2/webrev/index.html
>>>>
>>>
>>> src/hotspot/share/runtime/arguments.cpp
>>>      No comments.
>>>
>>> src/hotspot/share/runtime/globals.hpp
>>>      No comments.
>>>
>>> src/hotspot/share/runtime/safepoint.cpp
>>>      L312:          if (int(iterations) == -1) { // overflow - 
>>> something is wrong.
>>>      L313:             guarantee (PageArmed == 0, "invariant"); // 
>>> historical form
>>>      L314:          }
>>>          I now see what David is talking about w.r.t. restoring
>>>          this much of the original code. I have a slightly
>>>          different recommendation for commenting:
>>>
>>>          if (int(iterations) == -1) {
>>>            // We can only overflow here when we are not using global
>>>            // polling pages. We keep this guarantee in its original
>>
>> Surely you mean "when we ARE using global polling pages"? Thisd 
>> pre-dates per-thread safepoints/handshakes.
> 
> Sigh... yes, my new comment is in error and I'm getting frustrated
> with how to rewrite it properly.
> 
> The original code had this block:
> 
>   L190:     if (SafepointMechanism::uses_global_page_poll()) {
>   <snip>
>   L194:       if (DeferPollingPageLoopCount < 0) {
>   L195:         // Make polling safepoint aware
>   L196:         guarantee (PageArmed == 0, "invariant") ;
>   L197:         PageArmed = 1 ;
>   L198:         os::make_polling_page_unreadable();
>   L199:       }
>   L200:     }
> 
> which would arm the polling page when DeferPollingPageLoopCount
> was less than zero (the default is -1).
> 
> The original block in the loop looked like this:
> 
>   L312:           if (SafepointMechanism::uses_global_page_poll() && 
> int(iterations) == DeferPollingPageLoopCount) {
>   L313:             guarantee (PageArmed == 0, "invariant") ;
>   L314:             PageArmed = 1 ;
>   L315:             os::make_polling_page_unreadable();
>   L316:           }
> 
> which would enter the block when iterations hit DeferPollingPageLoopCount.
> If DeferPollingPageLoopCount == -1 (the default), that meant that
> iterations overflowed. It also meant that the polling page was
> already armed (because of L194-198) and we would fail the guarantee
> at L313.
> 
> So when DeferPollingPageLoopCount == -1 and iterations overflowed,
> that meant we were in the loop for a very long time and the
> guarantee failure is the diagnostic that told folks that we were
> stuck. It's that diagnostic that David H. wants to preserve.
> 
> 
> In the new code, this if-statement:
> 
> L194:       if (DeferPollingPageLoopCount < 0) {
> 
> is gone so we're always arming the polling page when
> SafepointMechanism::uses_global_page_poll() == true so
> we don't need L314 or L315.
> 
> The if-statement in the loop has been rewritten to this:
> 
>   L312:          if (int(iterations) == -1) { // overflow - something is 
> wrong.
>   L313:             guarantee (PageArmed == 0, "invariant"); // 
> historical form
>   L314:          }
> 
> so we're catching overflow regardless of the value of
> SafepointMechanism::uses_global_page_poll(). I previously
> convinced myself that iterations could only overflow when
> SafepointMechanism::uses_global_page_poll() == true so
> I proposed a new comment (with the typo fixed):
> 
>                     // We can only overflow here when we are using global
>                     // polling pages. We keep this guarantee in its 
> original
>                     // form so that searches of the bug database for this
>                     // failure mode find the right bugs.
> 
> I'm no longer convinced that comment is entirely correct! (Sorry)
> 
> Before the fix, we only got to this guarantee when global polling
> pages were in use:
> 
> L313:             guarantee (PageArmed == 0, "invariant") ;
> 
> so we only detected overflow when global polling pages were in use.
> That's not the same thing as saying that overflow can only happen
> when we are using global polling pages.
> 
> A slightly better comment is this:
> 
>                     // We only fail on overflow here when we are using 
> global
>                     // polling pages. We keep this guarantee in its 
> original
>                     // form so that searches of the bug database for this
>                     // failure mode find the right bugs.
> 
> So that raises two questions:
> 
> 1) is it possible to overflow iterations when not using global polling 
> pages?
> 2) should we fail a guarantee if that happens?
> 
> I would be fine with Harold using the slightly better comment above and
> pushing his fix. After all, his fix is about removing options and not
> about detangling the mess that is safepointing... :-)
> 
> Dan
> 
>>
>> Thanks,
>> David
>>
>>>            // form so that searches of the bug database for this
>>>            // failure mode find the right bugs.
>>>            guarantee(PageArmed == 0, "invariant");
>>>          }
>>>
>>> src/hotspot/share/runtime/safepoint.hpp
>>>      No comments.
>>>
>>> Thumbs up!
>>>
>>> Dan
>>>
>>>
>>>> This updated webrev contains the changes suggested by Coleen, Dan, 
>>>> and David.
>>>>
>>>> Thanks, Harold
>>>>
>>>> On 2/22/2018 9:16 PM, David Holmes wrote:
>>>>> I should clarify expectation here. What I want to see is something 
>>>>> like this:
>>>>>
>>>>> 310           //    to drive subsequent 
>>>>> spin/SwitchThreadTo()/Sleep(N) decisions.
>>>>> 311
>>>>> 312           if (int(iterations) == -1) { // overflow - something 
>>>>> is wrong
>>>>> 313             guarantee (PageArmed == 0, "invariant") ; // 
>>>>> historical form
>>>>> 314           }
>>>>>
>>>>> I recommend keeping the same guarantee condition so that it can be 
>>>>> easily searched for in bug reports - even though it seems a little 
>>>>> out of place given we don't arm the page here.
>>>>>
>>>>> David
>>>>>
>>>>> On 23/02/2018 11:53 AM, David Holmes wrote:
>>>>>> On 23/02/2018 12:47 AM, Daniel D. Daugherty wrote:
>>>>>>> On 2/21/18 11:49 PM, David Holmes wrote:
>>>>>>>> Hi Harold,
>>>>>>>>
>>>>>>>> On 22/02/2018 4:31 AM, harold seigel wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review this JDK-11 change to obsolete the 
>>>>>>>>> SafepointSpinBeforeYield, DeferThrSuspendLoopCount, and 
>>>>>>>>> DeferPollingPageLoopCount options and related code. With this 
>>>>>>>>> change, these options are still accepted on the command line 
>>>>>>>>> but have no affect other than to generate these warning messages:
>>>>>>>>>
>>>>>>>>>     Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>     SafepointSpinBeforeYield; support was removed in 11.0
>>>>>>>>>     Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>     DeferThrSuspendLoopCount; support was removed in 11.0
>>>>>>>>>     Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option
>>>>>>>>>     DeferPollingPageLoopCount; support was removed in 11.0
>>>>>>>>
>>>>>>>> Just for the record these were once useful tuning values for 
>>>>>>>> investigating safepoint performance issues in the field.
>>>>>>>>
>>>>>>>>> Open Webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~hseigel/bug_8194085/webrev/index.html
>>>>>>>>
>>>>>>>> This one is a bit messier than I had envisaged: -Xconcurrentio 
>>>>>>>> ?? Wow! :) Definitely overlooked that one. It should be 
>>>>>>>> deprecated in 11 for certain.
>>>>>>>>
>>>>>>>> I have one problem with a change in safepoint.cpp, the removal 
>>>>>>>> of this:
>>>>>>>>
>>>>>>>>  312           if (SafepointMechanism::uses_global_page_poll() 
>>>>>>>> && int(iterations) == DeferPollingPageLoopCount) {
>>>>>>>>  313             guarantee (PageArmed == 0, "invariant") ;
>>>>>>>>  314             PageArmed = 1 ;
>>>>>>>>  315             os::make_polling_page_unreadable();
>>>>>>>>  316           }
>>>>>>>>
>>>>>>>> This block triggers a guarantee failure if we fail to reach a 
>>>>>>>> safepoint and the number of iterations overflows to reach -1 
>>>>>>>> (the value of DeferPollingPageLoopCount). That is the only thing 
>>>>>>>> that stops the VM from just hanging when we have an issue 
>>>>>>>> reaching safepoints (like all of the "counted loop" problems!) 
>>>>>>>> and is very valuable in clearly identifying the problem. So I 
>>>>>>>> strongly recommend that this is restored with an explicit check 
>>>>>>>> against -1.
>>>>>>>
>>>>>>> Hmmmm... I guess I'm not seeing why the changes here don't
>>>>>>> solve that problem:
>>>>>>>
>>>>>>>>   190     if (SafepointMechanism::uses_global_page_poll()) {
>>>>>>>>   191       // Make interpreter safepoint aware
>>>>>>>>   192       Interpreter::notice_safepoints();
>>>>>>>>   193
>>>>>>>> 194 if (DeferPollingPageLoopCount < 0) {
>>>>>>>>   195         // Make polling safepoint aware
>>>>>>>>   196         guarantee (PageArmed == 0, "invariant") ;
>>>>>>>>   197         PageArmed = 1 ;
>>>>>>>>   198         os::make_polling_page_unreadable();
>>>>>>>>   199       }
>>>>>>>> 200 }
>>>>>>>
>>>>>>> So L194 and L200 are deleted and the same block is always
>>>>>>> run when SafepointMechanism::uses_global_page_poll() is true
>>>>>>> and this happens before the "L221: while(still_running > 0) {"
>>>>>>>
>>>>>>> Did I miss something?
>>>>>>
>>>>>> That block is executed once.
>>>>>>
>>>>>> The deleted block is executed in the loop while trying to reach 
>>>>>> the safepoint and so detects excessive looping because we can't 
>>>>>> get to the safepoint.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> src/hotspot/share/runtime/safepoint.hpp
>>>>>>>>
>>>>>>>> +   static void set_throughput_mode() {
>>>>>>>> +     _defer_thr_suspend_loop_count = 1;
>>>>>>>> +   }
>>>>>>>>
>>>>>>>> First can you add a comment before this that it is only used for 
>>>>>>>> -Xconcurrentio support.
>>>>>>>>
>>>>>>>> Second, the name seems odd - throughput mode? How about 
>>>>>>>> set_concurrentio_mode() ? Or even just 
>>>>>>>> set_defer_thr_suspend_loop_count() ?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>> JBS Bug: https://bugs.openjdk.java.net/browse/JDK-8194085
>>>>>>>>>
>>>>>>>>> The change was tested with Mach5 tiers 1 and 2 on all Mach5 
>>>>>>>>> platforms and tiers 3-5 in Linux-X64.
>>>>>>>>>
>>>>>>>>> Thanks, Harold
>>>>>>>
>>>>
>>>
> 


More information about the hotspot-runtime-dev mailing list