RFR: 8225702(13): Safepoint counter can't be used for safepoint detection

Robbin Ehn robbin.ehn at oracle.com
Tue Jun 25 18:49:28 UTC 2019


Hi David, (and Dan)

>> But since this is JDK 13 bug I did the simplest thing.
> 
> Yes this is best. To summarise what you and I discussed:
> - older usage of safepoint_counter only does "trivial" actions between reading the counts
> - the new usage allows more elaborate code between the reading of the counts and so hits the current problem
> - the new global safepoint-id will fix that code and not disrupt the other safepoint-counter users.
> 
> Okay I'm fine with changes and your explanations.

Thanks!

David and Dan, here is v2 update from Dan's comments.
Full: http://cr.openjdk.java.net/~rehn/8225702/v2/full/
Inc : http://cr.openjdk.java.net/~rehn/8225702/v2/inc/

Passed t1-3 on jdk13.

Thanks, Robbin

> 
> Thanks,
> David
> -----
> 
>>>
>>>> Also it makes no sense for a normal user that JFR/JMC safepoint id are only odd,
>>>> we instead use this new id which is just incremented for each safepoint.
>>>>
>>>> Code:
>>>> http://cr.openjdk.java.net/~rehn/8225702/v1/webrev/index.html
>>>
>>> src/hotspot/share/runtime/safepoint.hpp
>>>
>>> Why did you make thread_not_running an instance method?
>>
>> Since it reads the _safepoint_counter now that I made it private it needed access.
>>
>>>
>>> src/hotspot/share/runtime/safepoint.cpp
>>>
>>>   418   // Record state
>>>   419   _state = _synchronized;
>>>   420
>>>   421   OrderAccess::fence();
>>>   422
>>>   423   // Set the new id
>>>   424   ++_safepoint_id;
>>>
>>> Under the assumption that the threads/code that uses the new SafepointStateTracker do not participate in the 
>>> safepoint protocol, is there a risk of seeing a stale _safepoint_id, or racing with the increment? It's not clear to 
>>> me under what conditions the SafepointStateTracker::safepoint_state_changed() calls will be made.
>>
>> We write as:
>> state = synch
>> id++
>> state = unsynch  // read state here
>> state = synch
>> id++             // read id here
>> state = unsynch  // re-read state here
>>                   // re-read id here
>>
>> This would give a false negative, so it would not work for a JT in
>> native/blocked or a NonJavaThread. With doing a 'stable load' thus reading id
>> twice. So we should assert this.
>>
>> Encoding state and counter in the same primitive which can be atomically
>> updated, so we never see the same value in a long enough period as you suggested
>> fixes this. I believe doing the same for JavaThread state can also simplify
>> somethings.
>>
>> But I don't think we should do that as a bug fix for 13.
>> And I was going to use the safepoint id in logging as gc does.
>> So even if we don't use _safepoint_id to track safepoint state in later JDK
>> we have pretty numbers in logs and JFR.
>>
>> Thanks, Robbin
>>
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8225702
>>>>
>>>> Passes JFR test and t1-3.
>>>>
>>>> Thanks, Robbin


More information about the hotspot-jfr-dev mailing list