RFR: 8225702(13): Safepoint counter can't be used for safepoint detection

Tue Jun 25 19:02:31 UTC 2019

Thanks David!

/Robbin

On 2019-06-25 20:55, David Holmes wrote:
> Incremental changes look good.
> 
> Thanks,
> David
> 
> On 25/06/2019 2:49 pm, Robbin Ehn wrote:
>> Hi David, (and Dan)
>>
>>>> But since this is JDK 13 bug I did the simplest thing.
>>>
>>> Yes this is best. To summarise what you and I discussed:
>>> - older usage of safepoint_counter only does "trivial" actions between reading the counts
>>> - the new usage allows more elaborate code between the reading of the counts and so hits the current problem
>>> - the new global safepoint-id will fix that code and not disrupt the other safepoint-counter users.
>>>
>>> Okay I'm fine with changes and your explanations.
>>
>> Thanks!
>>
>> David and Dan, here is v2 update from Dan's comments.
>> Full: http://cr.openjdk.java.net/~rehn/8225702/v2/full/
>> Inc : http://cr.openjdk.java.net/~rehn/8225702/v2/inc/
>>
>> Passed t1-3 on jdk13.
>>
>> Thanks, Robbin
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>>>
>>>>>> Also it makes no sense for a normal user that JFR/JMC safepoint id are only odd,
>>>>>> we instead use this new id which is just incremented for each safepoint.
>>>>>>
>>>>>> Code:
>>>>>> http://cr.openjdk.java.net/~rehn/8225702/v1/webrev/index.html
>>>>>
>>>>> src/hotspot/share/runtime/safepoint.hpp
>>>>>
>>>>> Why did you make thread_not_running an instance method?
>>>>
>>>> Since it reads the _safepoint_counter now that I made it private it needed access.
>>>>
>>>>>
>>>>> src/hotspot/share/runtime/safepoint.cpp
>>>>>
>>>>>   418   // Record state
>>>>>   419   _state = _synchronized;
>>>>>   420
>>>>>   421   OrderAccess::fence();
>>>>>   422
>>>>>   423   // Set the new id
>>>>>   424   ++_safepoint_id;
>>>>>
>>>>> Under the assumption that the threads/code that uses the new SafepointStateTracker do not participate in the 
>>>>> safepoint protocol, is there a risk of seeing a stale _safepoint_id, or racing with the increment? It's not clear 
>>>>> to me under what conditions the SafepointStateTracker::safepoint_state_changed() calls will be made.
>>>>
>>>> We write as:
>>>> state = synch
>>>> id++
>>>> state = unsynch  // read state here
>>>> state = synch
>>>> id++             // read id here
>>>> state = unsynch  // re-read state here
>>>>                   // re-read id here
>>>>
>>>> This would give a false negative, so it would not work for a JT in
>>>> native/blocked or a NonJavaThread. With doing a 'stable load' thus reading id
>>>> twice. So we should assert this.
>>>>
>>>> Encoding state and counter in the same primitive which can be atomically
>>>> updated, so we never see the same value in a long enough period as you suggested
>>>> fixes this. I believe doing the same for JavaThread state can also simplify
>>>> somethings.
>>>>
>>>> But I don't think we should do that as a bug fix for 13.
>>>> And I was going to use the safepoint id in logging as gc does.
>>>> So even if we don't use _safepoint_id to track safepoint state in later JDK
>>>> we have pretty numbers in logs and JFR.
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Thanks,
>>>>> David
>>>>> -----
>>>>>
>>>>>> Issue:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8225702
>>>>>>
>>>>>> Passes JFR test and t1-3.
>>>>>>
>>>>>> Thanks, Robbin