RFR(XS) 8036823: Stack trace sometimes shows 'locked' instead of 'waiting to lock'
Daniel D. Daugherty
daniel.daugherty at oracle.com
Thu Jun 12 02:14:02 UTC 2014
Thanks for the very fast review!
Would you be OK if I just went with what I have since I've tested
that thoroughly? I've burned quite a few cycles on the original bug
and this test and I'd like to get back to my primary task...
Dan
On 6/11/14 7:55 PM, David Holmes wrote:
> Hi Dan,
>
> My only nit would be to use a CountDownLatch rather than roll your own
> via wait/notify :)
>
> Cheers,
> David
>
> On 12/06/2014 8:36 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> Let's try this hopefully one last time:
>>
>> http://cr.openjdk.java.net/~dcubed/8046287-webrev/1-jdk9-hs-rt/
>> https://bugs.openjdk.java.net/browse/JDK-8046287
>>
>> Changes relative to the ORIGINAL version of the test:
>>
>> - add a new header waiting pattern to catch the case where
>> the target thread is waiting on a condition (like a VM op)
>> - add synchronization to the start-up of the contending threads
>> so that we don't start sampling while the contending threads
>> are initializing
>> - add sanity check for observing only two "ContendingThread-*"
>> stack traces
>>
>> - rename some variables to make their use more clear
>> - update/add various comments
>> - add counters for the various checks and report a summary
>> of all the sampling runs
>> - issue a warning if the specific scenario encountered by
>> the original bug (8036823) is never seen
>>
>> Testing:
>>
>> - JPRT test run of the test using product and fastdebug
>> bits on all the usual platforms
>>
>> - 3600 sample run with fastdebug bits:
>> INFO: Summary for all samples:
>> INFO: both_running_cnt=0
>> INFO: both_waiting_cnt=0
>> INFO: contended_cnt=2005
>> INFO: one_waiting_cnt=1405
>> INFO: uncontended_cnt=190
>>
>> - 3600 sample run with fastdebug bits w/ -Xcomp:
>> INFO: Summary for all samples:
>> INFO: both_running_cnt=0
>> INFO: both_waiting_cnt=0
>> INFO: contended_cnt=1867
>> INFO: one_waiting_cnt=1548
>> INFO: uncontended_cnt=185
>>
>> - 3600 sample run with fastdebug bits w/ -Xcomp -XX:+DeoptimizeALot:
>> INFO: Summary for all samples:
>> INFO: both_running_cnt=46
>> INFO: both_waiting_cnt=0
>> INFO: contended_cnt=3135
>> INFO: one_waiting_cnt=3
>> INFO: uncontended_cnt=416
>>
>> The contended_cnt is where we're hitting the original
>> bug's scenario and we've got great coverage there.
>> The other counts reflect how often we hit the edge
>> cases...
>>
>> Thanks, in advance, for any comments, questions or suggestions.
>>
>> Dan
>>
>>
>> On 6/9/14 10:04 PM, Daniel D. Daugherty wrote:
>>> Greetings,
>>>
>>> Nightly testing has revealed a bug in the test that reproduces
>>> nicely when these options are used: -Xcomp -XX:+DeoptimizeALot
>>>
>>> Here's the webrev URL for the minor tweak to catch yet another
>>> variation of the waiting pattern:
>>>
>>> http://cr.openjdk.java.net/~dcubed/8046287-webrev/0-jdk9-hs-rt/
>>>
>>> Thanks to Vladimir K for reporting the test failure and for
>>> providing the right details in the bug report.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>>
>>> On 5/29/14 8:49 AM, Daniel D. Daugherty wrote:
>>>> One more round of review after refactoring the test based on comments
>>>> from David H and Serguei.
>>>>
>>>> Here's the webrev for this round:
>>>>
>>>> http://cr.openjdk.java.net/~dcubed/8036823-webrev/2-jdk9-hs-rt/
>>>>
>>>> Had to change the default sample size from 30 -> 15 in order to
>>>> get the test to pass reliably on Solaris SPARC JPRT machines.
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>
>>>> Dan
>>>>
>>>>
>>>> On 5/22/14 10:18 PM, Daniel D. Daugherty wrote:
>>>>> Zhengyu is tied up with some other work so I've taken on this fix.
>>>>>
>>>>> Here's the webrev URL for the next round:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8036823-webrev/1-jdk9-hs-rt/
>>>>>
>>>>> The fix has been tested with vm.quick on all Aurora Adhoc platforms.
>>>>> The new test has been run with the fix via JPRT and passes on all
>>>>> JPRT platforms. The new test has also been run without the fix and
>>>>> fails on most platforms. Since the default sample size is just 30,
>>>>> it is possible to get 30 runs in a row without failing.
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>> On 5/19/14 7:58 AM, Zhengyu Gu wrote:
>>>>>> This is a simple fix for incorrect lock state.
>>>>>>
>>>>>> The timing of setting a thread's pending monitor can result in a
>>>>>> stack trace dump reporting an incorrect lock state. The solution is
>>>>>> to check the monitor's owner: if the owner is a thread other than
>>>>>> the current thread, then the monitor either is, or is in the
>>>>>> process of being set as, the pending monitor of the current thread.
>>>>>>
>>>>>>
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8036823
>>>>>> Webrev: http://cr.openjdk.java.net/~zgu/8036823/webrev.00/
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Zhengyu
>>>>>>
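For readers following the "waiting to lock" case described in the original fix message, here is a minimal Java-level sketch of that state. It is not the HotSpot fix and does not reproduce the bug; it only shows a thread blocked on monitor entry while another thread owns the monitor, as seen through the standard ThreadMXBean API. The class name BlockedStateDemo and the method observeContender are hypothetical, chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical demo class; not part of the test under review.
class BlockedStateDemo {

    /** Returns the contender's state and the name of the monitor's owner. */
    static String observeContender() throws InterruptedException {
        Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) { }   // blocks until the owner releases the lock
        }, "ContendingThread-1");
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        String result;
        synchronized (lock) {         // the calling thread owns the monitor
            contender.start();
            // Poll until the contender is actually blocked on monitor entry.
            while (contender.getState() != Thread.State.BLOCKED) {
                Thread.sleep(1);
            }
            ThreadInfo info = mx.getThreadInfo(contender.getId());
            result = info.getThreadState()
                    + " on monitor owned by " + info.getLockOwnerName();
        }
        contender.join();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeContender());
    }
}
```

A thread in this state is the one a stack dump should describe as "waiting to lock"; the owner is the one that should be shown as having "locked" the monitor.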
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
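David's suggestion at the top of this thread, using a CountDownLatch instead of hand-rolled wait/notify for the contending threads' start-up, could look roughly like the sketch below. This is an illustration under assumptions, not the code in the webrev: the class name LatchStartup, the method startContenders, the constant N_CONTENDERS and the loop count are all hypothetical. Only the "ContendingThread-*" naming and the count of two come from the thread.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; the real test is in the webrev linked in the thread.
class LatchStartup {
    // The thread mentions exactly two "ContendingThread-*" stack traces.
    static final int N_CONTENDERS = 2;

    /** Starts the contenders and returns how many signalled readiness. */
    static int startContenders() throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(N_CONTENDERS);
        Object lock = new Object();
        Thread[] contenders = new Thread[N_CONTENDERS];

        for (int i = 0; i < N_CONTENDERS; i++) {
            contenders[i] = new Thread(() -> {
                ready.countDown();              // initialization is done
                for (int n = 0; n < 1000; n++) {
                    synchronized (lock) { }     // contend on the shared monitor
                }
            }, "ContendingThread-" + (i + 1));
            contenders[i].start();
        }

        ready.await();   // returns only after every contender has counted down
        int signalled = N_CONTENDERS - (int) ready.getCount();

        // Sampling of stack traces would begin here, after start-up.

        for (Thread t : contenders) {
            t.join();
        }
        return signalled;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("contenders ready: " + startContenders());
    }
}
```

The latch replaces a shared counter plus wait/notify loop: each contender calls countDown() once, and the sampling thread calls await() once, so there is no condition to re-check by hand.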