RFR(XS): 8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max

David Holmes david.holmes at oracle.com
Tue Jun 26 05:41:32 UTC 2018


On 26/06/2018 11:32 AM, Daniel D. Daugherty wrote:
> David,
> 
> Thanks for your eagle eyes! More below...
> 
> 
> On 6/25/18 9:27 AM, Daniel D. Daugherty wrote:
>> On 6/25/18 1:57 AM, David Holmes wrote:
>>> Hi Dan,
>>>
>>> On 21/06/2018 10:18 AM, Daniel D. Daugherty wrote:
>>>> Greetings,
>>>>
>>>> I have a fix for a recent (very rare) Thread-SMR related test failure.
>>>>
>>>> Since the fix is related to the ErrorHandling tests and affects 
>>>> hs_err_pid
>>>> file generation, this code review is being sent to both the Runtime and
>>>> the Serviceability teams. Please make sure you reply-all to any 
>>>> responses
>>>> so we have complete review threads on both aliases.
>>>>
>>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8205195
>>>>
>>>> Webrev URL: 
>>>> http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/
>>>>
>>>> The bug itself contains analysis about the root cause of the bug and
>>>> the comment updates to the code describes the no win scenario that the
>>>> hs_err_pid file generation code is in. Of course, I also have a comment
>>>> where I was able to harden the ErrorHandling tests. I did manage to
>>>> resist the urge to mention the "Kobiyashi Maru" [1] in the new 
>>>> comments.
>>>>
>>>> Testing: Mach5 
>>>> builds-tier1,jdk-tier1,jdk-tier2,hs-tier1,hs-tier2,hs-tier3
>>>>           on the usual Oracle platforms.
>>>>
>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> I don't quite follow the fix.
>>
>> I think this comment explains what the fix is _trying_ to do:
>>
>> 1702   // We grab Threads_lock to keep ThreadsSMRSupport::print_info_on()
>> 1703   // from racing with Threads::add() or Threads::remove() as we
>> 1704   // generate the hs_err_pid file. This makes our ErrorHandling 
>> tests
>> 1705   // more stable.
>>
>>
>>> Won't you self-deadlock on acquiring the Threads_lock in the 
>>> secondary error handler test, due to the recursive call to 
>>> controlled_crash ?
>>
>> I missed that possibility, but now I'm puzzled why my testing didn't
>> reveal this situation. We have a test for secondary error handling
>> and it should have self-deadlocked (and failed). I'll investigate.
> 
> The ErrorHandling tests don't run with 'release' bits so no self-deadlock.
> The 'fastdebug' and 'slowdebug' versions will fail an assertion that gets
> hidden by the fact that it is a secondary failure mode.

Ah - of course.

> I've filed the following bug:
> 
> JDK-8205648 fix for 8205195 breaks secondary error handling
> https://bugs.openjdk.java.net/browse/JDK-8205648

Thanks,
David

> Dan
> 
> 
>>
>> Dan
>>
>>
>>>
>>> David
>>>
>>>> Dan
>>>>
>>>> [1] https://www.urbandictionary.com/define.php?term=Kobayashi%20Maru
>>>>
>>
>>
> 


More information about the hotspot-runtime-dev mailing list