RFR (S) 8049304: race between VM_Exit and _sync_FutileWakeups->inc()

Wed Sep 2 08:54:50 UTC 2015

Hi Dan,

On 2/09/2015 2:45 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I've updated the "fix" for this bug based on code review comments
> received in round 1. I've dropped most of the changes from round 1
> with a couple of exceptions.

I have no further comments - it all looks good to me. If others want the 
pendulum to swing back a little from this position then ... nothing that 
has been suggested is functionally wrong. :)

Thanks,
David
-----

PS. When you get back from vacation I'll be gone for a month. That gives 
you a large window to push other things through with less stress ;-)

> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
> https://bugs.openjdk.java.net/browse/JDK-8049304
>
> Webrev URL: http://cr.openjdk.java.net/~dcubed/8049304-webrev/2-jdk9-hs-rt/
>
> The easiest way to re-review is to download the two patch files
> (round 0 and round 2) and view them in your favorite file merge tool:
>
> http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/hotspot.patch
>
> http://cr.openjdk.java.net/~dcubed/8049304-webrev/2-jdk9-hs-rt/hotspot.patch
>
>
> Testing: Aurora Adhoc RT-SVC nightly batch (in process)
>           Aurora Adhoc vm.tmtools batch (in process)
>           Kim's repro sequence for JDK-8049304
>           Kim's repro sequence for JDK-8129978
>           JPRT -testset hotspot
>
> Changes between round 0 and round 2:
>
> - clarify a few comments
> - init _has_PerfData flag with '0' (instead of false)
> - drop unnecessary use OrderAccess::release_store() to set
>    _has_PerfData to '1' (we're in a Mutex)
> - change perfMemory_exit() to only call PerfDataManager::destroy()
>    when called at a safepoint and when the StatSampler is not
>    running; this means when the VM is aborting, we no longer have
>    a race between the original crash report and this code path.
>
> Changes between round 1 and round 2:
>
> - clarify a few comments
> - drop is_safe parameter to OM_PERFDATA_OP macro
> - init _has_PerfData flag with '0' (instead of false)
> - drop OrderAccess::fence() call before os::naked_short_sleep() call
> - drop PerfDataManager::has_PerfData_with_acquire()
> - drop unnecessary use OrderAccess::release_store() to set
>    _has_PerfData to '1' (we're in a Mutex)
>
> I believe that I've addressed all comments from round 0 and
> from round 1.
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 8/31/15 4:51 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I've updated the "fix" for this bug based on code review comments
>> received in round 0.
>>
>> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
>> https://bugs.openjdk.java.net/browse/JDK-8049304
>>
>> Webrev URL:
>> http://cr.openjdk.java.net/~dcubed/8049304-webrev/1-jdk9-hs-rt/
>>
>> The easiest way to re-review is to download the two patch files
>> and view them in your favorite file merge tool:
>>
>> http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/hotspot.patch
>>
>> http://cr.openjdk.java.net/~dcubed/8049304-webrev/1-jdk9-hs-rt/hotspot.patch
>>
>>
>> Testing: Aurora Adhoc RT-SVC nightly batch (in process)
>>          Aurora Adhoc vm.tmtools batch (in process)
>>          Kim's repro sequence for JDK-8049304
>>          Kim's repro sequence for JDK-8129978
>>          JPRT -testset hotspot
>>
>> Changes between round 0 and round 1:
>>
>> - add an 'is_safe' parameter to the OM_PERFDATA_OP macro;
>>   safepoint-safe callers can access _has_PerfData flag directly;
>>   non-safepoint-safe callers use a load-acquire to fetch the
>>   current _has_PerfData flag value
>> - change PerfDataManager::destroy() to simply set _has_PerfData
>>   to zero (field is volatile) and then use a fence() to prevent
>>   any reordering of operations in any direction; it's only done
>>   once during VM shutdown so...
>> - change perfMemory_exit() to only call PerfDataManager::destroy()
>>   when called at a safepoint and when the StatSample is not
>>   running; this means when the VM is aborting, we no longer have
>>   a race between the original crash report and this code path.
>>
>> I believe that I've addressed all comments from round 0.
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 8/25/15 3:08 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> I have a "fix" for a long standing race between JVM shutdown and the
>>> JVM statistics subsystem:
>>>
>>> JDK-8049304 race between VM_Exit and _sync_FutileWakeups->inc()
>>> https://bugs.openjdk.java.net/browse/JDK-8049304
>>>
>>> Webrev URL:
>>> http://cr.openjdk.java.net/~dcubed/8049304-webrev/0-jdk9-hs-rt/
>>>
>>> Testing: Aurora Adhoc RT-SVC nightly batch
>>>          Aurora Adhoc vm.tmtools batch
>>>          Kim's repro sequence for JDK-8049304
>>>          Kim's repro sequence for JDK-8129978
>>>          JPRT -testset hotspot
>>>
>>> This "fix":
>>>
>>> - adds a volatile flag to record whether PerfDataManager is holding
>>>   data (PerfData objects)
>>> - adds PerfDataManager::has_PerfData() to return the flag
>>> - changes the Java monitor subsystem's use of PerfData to
>>>   check both allocation of the monitor subsystem specific
>>>   PerfData object and the new PerfDataManager::has_PerfData()
>>>   return value
>>>
>>> If the global 'UsePerfData' option is false, the system works as
>>> it did before. If 'UsePerfData' is true (the default on non-embedded
>>> systems), the Java monitor subsystem will allocate a number of
>>> PerfData objects to record information. The objects will record
>>> information about Java monitor subsystem until the JVM shuts down.
>>>
>>> When the JVM starts to shutdown, the new PerfDataManager flag will
>>> change to false and the Java monitor subsystem will stop using the
>>> PerfData objects. This is the new behavior. As noted in the comments
>>> I added to the code, the race is still present; I'm just changing
>>> the order and the timing to reduce the likelihood of the crash.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>>
>>
>>
>>
>