RFR(S): 8200374: Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify threads_do()
Daniel D. Daugherty
daniel.daugherty at oracle.com
Thu Apr 12 12:41:42 UTC 2018
On 4/12/18 6:06 AM, Erik Österlund wrote:
> Hi Dan,
>
> Looks good.
Thanks!
Dan
>
> Thanks,
> /Erik
>
> On 2018-04-11 17:20, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Sorry for the delay in getting back to this thread... I'm traveling so
>> things take longer than usual...
>>
>> I would love to get a final re-review from David H, Erik O, and Robbin.
>>
>> Very little has changed on the OpenJDK side of this fix. I added an
>> assert() that caught a sampler thread life-cycle issue and I'm using
>> OrderAccess::release_store() for proper matching with load_acquire().
>>
>> Here's the incremental webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/3_for_jdk_hs_open.inc/
>>
>> And here's the full webrev:
>>
>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/3_for_jdk_hs_open.full/
>>
>>
>> I've done three Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3]
>> runs
>> and there are no test failures related to these changes. I just
>> kicked of
>> one more Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run on the
>> latest rebased bits...
>>
>> My Solaris-X64 server is finishing the third round of Thread-SMR stress
>> testing on this version. There are no test failures related to these
>> changes (so far)...
>>
>> Thanks, in advance, for any comments, suggestions, or feedback.
>>
>> Dan
>>
>> Dan
>>
>>
>> On 4/3/18 5:25 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> This fix has been revised due to feedback from David H. and Erik O.
>>>
>>> Here's the incremental webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.inc/
>>>
>>>
>>> And here's the full webrev:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/2_for_jdk_hs_open.full/
>>>
>>>
>>> Summary of the changes relative to 'cr1':
>>>
>>> - Rename TracingExport::get_sampler_thread() to
>>> sampler_thread_acquire()
>>> and TracingExport::set_sampler_thread() to
>>> set_sampler_thread_with_lock().
>>> - set_sampler_thread_with_lock() uses Threads_lock to avoid
>>> conflicts with
>>> Thread-SMR scans.
>>>
>>> The Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run has
>>> finished
>>> and there are no test failures related to these changes.
>>>
>>> My Solaris-X64 server is finishing the third round of Thread-SMR stress
>>> testing on the 'cr1' version. I'll start a round of stress testing for
>>> this version ('cr2') as soon as I can.
>>>
>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>
>>> Dan
>>>
>>> On 4/2/18 11:11 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> This fix has been revised due to additional testing and due to
>>>> feedback from David H.
>>>>
>>>> Here's the incremental webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.inc/
>>>>
>>>>
>>>> And here's the full webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/1_for_jdk_hs_open.full/
>>>>
>>>>
>>>> My usual Thread-SMR stress testing on my Solaris-X64 is still running;
>>>> so far there has been only one unrelated intermittent test failure.
>>>> The
>>>> Mach5 builds-tier1, jdk-tier[1-3], and hs-tier[1-3] run is still in
>>>> process...
>>>>
>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 3/28/18 9:52 PM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have a (mostly) small enhancement for Thread-SMR:
>>>>>
>>>>> JDK-8200374 Add ThreadsSMRSupport::verify_hazard_pointer_scanned()
>>>>> to verify threads_do()
>>>>> https://bugs.openjdk.java.net/browse/JDK-8200374
>>>>>
>>>>> Here's the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8200374-webrev/0_for_jdk_hs_open/
>>>>>
>>>>> This is Erik O's improvement on the assertion added by the following
>>>>> bug fix (with some minor tweaking done by me):
>>>>>
>>>>> JDK-8199813 SIGSEGV in ThreadsList::includes()
>>>>> https://bugs.openjdk.java.net/browse/JDK-8199813
>>>>>
>>>>> Summary of the changes:
>>>>>
>>>>> - Replace the assertion that I added in JDK-8199813 with a closure
>>>>> based function that verifies the threads_do() contract depended
>>>>> on by Thread-SMR.
>>>>> - Add ThreadsSMRSupport::verify_hazard_pointer_scanned() to verify
>>>>> that the calling thread's hazard pointer is scanned by
>>>>> threads_do().
>>>>> The new function is called from both ThreadsSMRSupport::acquire_*()
>>>>> functions.
>>>>> - Refactor the non-JavaThread part of Threads::threads_do() into
>>>>> Threads::non_java_threads_do() so that the non-JavaThread part
>>>>> can also be called by other threads_do() functions. Yes, the
>>>>> Threads::threads_do() contract is still to scan every thread in
>>>>> the system.
>>>>> - Add hooks for a "tracing sampler thread" to be optionally scanned
>>>>> by Threads::non_java_threads_do().
>>>>>
>>>>> This fix has gone thru a couple of Mach5 builds-tier1, jdk-tier[1-3],
>>>>> and hs-tier[1-3] runs. I've also started my usual 24+ hour Thread-SMR
>>>>> stress testing run on my Solaris-X64 server.
>>>>>
>>>>> Just to be extra sure, I backed out the fix from JDK-8199813 to
>>>>> JavaThread::verify_not_published() (which started this round of
>>>>> Thread-SMR fixes) and we catch the issue in the new function
>>>>> ThreadsSMRSupport::verify_hazard_pointer_scanned().
>>>>>
>>>>> Thanks, in advance, for any comments, suggestions, or feedback.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list