RFR: 8225702(13): Safepoint counter can't be used for safepoint detection
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Jun 25 19:08:49 UTC 2019
On 6/25/19 2:49 PM, Robbin Ehn wrote:
> Hi David, (and Dan)
>
>>> But since this is JDK 13 bug I did the simplest thing.
>>
>> Yes this is best. To summarise what you and I discussed:
>> - older usage of safepoint_counter only does "trivial" actions
>> between reading the counts
>> - the new usage allows more elaborate code between the reading of the
>> counts and so hits the current problem
>> - the new global safepoint-id will fix that code and not disrupt the
>> other safepoint-counter users.
>>
>> Okay I'm fine with changes and your explanations.
>
> Thanks!
>
> David and Dan, here is v2 update from Dan's comments.
> Full: http://cr.openjdk.java.net/~rehn/8225702/v2/full/
> Inc : http://cr.openjdk.java.net/~rehn/8225702/v2/inc/
src/hotspot/share/runtime/safepoint.hpp
No comments.
src/hotspot/share/runtime/safepoint.cpp
No comments.
Thumbs up!
Dan
>
> Passed t1-3 on jdk13.
>
> Thanks, Robbin
>
>>
>> Thanks,
>> David
>> -----
>>
>>>>
>>>>> Also it makes no sense for a normal user that JFR/JMC safepoint id
>>>>> are only odd,
>>>>> we instead use this new id which is just incremented for each
>>>>> safepoint.
>>>>>
>>>>> Code:
>>>>> http://cr.openjdk.java.net/~rehn/8225702/v1/webrev/index.html
>>>>
>>>> src/hotspot/share/runtime/safepoint.hpp
>>>>
>>>> Why did you make thread_not_running an instance method?
>>>
>>> Since it reads the _safepoint_counter now that I made it private it
>>> needed access.
>>>
>>>>
>>>> src/hotspot/share/runtime/safepoint.cpp
>>>>
>>>> 418 // Record state
>>>> 419 _state = _synchronized;
>>>> 420
>>>> 421 OrderAccess::fence();
>>>> 422
>>>> 423 // Set the new id
>>>> 424 ++_safepoint_id;
>>>>
>>>> Under the assumption that the threads/code that uses the new
>>>> SafepointStateTracker do not participate in the safepoint protocol,
>>>> is there a risk of seeing a stale _safepoint_id, or racing with the
>>>> increment? It's not clear to me under what conditions the
>>>> SafepointStateTracker::safepoint_state_changed() calls will be made.
>>>
>>> We write as:
>>> state = synch
>>> id++
>>> state = unsynch // read state here
>>> state = synch
>>> id++ // read id here
>>> state = unsynch // re-read state here
>>> // re-read id here
>>>
>>> This would give a false negative, so it would not work for a JT in
>>> native/blocked or a NonJavaThread. With doing a 'stable load' thus
>>> reading id
>>> twice. So we should assert this.
>>>
>>> Encoding state and counter in the same primitive which can be
>>> atomically
>>> updated, so we never see the same value in a long enough period as
>>> you suggested
>>> fixes this. I believe doing the same for JavaThread state can also
>>> simplify
>>> somethings.
>>>
>>> But I don't think we should do that as a bug fix for 13.
>>> And I was going to use the safepoint id in logging as gc does.
>>> So even if we don't use _safepoint_id to track safepoint state in
>>> later JDK
>>> we have pretty numbers in logs and JFR.
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Thanks,
>>>> David
>>>> -----
>>>>
>>>>> Issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8225702
>>>>>
>>>>> Passes JFR test and t1-3.
>>>>>
>>>>> Thanks, Robbin
More information about the hotspot-jfr-dev
mailing list