RFR(XS): 8205195 NestedThreadsListHandleInErrorHandlingTest fails because hs_err doesn't contain _nested_thread_list_max

Daniel D. Daugherty daniel.daugherty at oracle.com
Fri Jun 22 00:06:56 UTC 2018


May I please have a second reviewer on this one...

Dan


On 6/21/18 8:57 AM, Daniel D. Daugherty wrote:
> Thomas,
>
> Thanks for the quick review.
>
>
> On 6/21/18 2:53 AM, Thomas Stüfe wrote:
>> Hi Daniel,
>>
>> yes, that is annoying.
>>
>> I am okay with your fix, if you want to push it in this form.
>
> Thanks.
>
>
>> But preparing the test crash in this way feels weird since the whole
>> point of this exercise is to test error handling in close-to-real
>> scenarios... but it sure does not hurt in this case.
>
> Yup. This is definitely weird, but my goal is to reduce testing noise.
> I do acknowledge that the real world use of error reporting may run
> into this failure mode which will suppress a section of hs_err_pid
> output.
>
>
>> Also, note that at different places we decide differently, see e.g.
>> the "printing heap information" STEP - we omit locking Heap_lock in
>> VMError::report() and only lock it in VMError::print_vm_info() (where
>> we have no secondary signal handling and must not crash). So, in that
>> case we are okay with risking a secondary crash in error handling.
>> Probably there are just no regression tests for the heap information
>> printout whose intermittent failures could annoy us :)
>
> Yup. I recognized that when I wrote the Thread-SMR tests I was making
> them picky enough to possibly run into failure modes we never would
> have detected before.
>
>
>> My feeling is that I would like to see a solution at the test
>> framework side. Maybe, if a test is marked as "may fail rarely and
>> that's okay", the test framework could retry the test and only fail if
>> the error happens again.
>
> We currently don't have a way of tagging a test like that and I'm
> not convinced that I would really want us to do that. However, this
> particular bug truly falls into a no win scenario and that's a
> different situation than I've encountered before.
>
> Again, thanks for the review.
>
> Dan
>
>
>>
>> Thanks, Thomas
>>
>>
>>
>> On Thu, Jun 21, 2018 at 2:18 AM, Daniel D. Daugherty
>> <daniel.daugherty at oracle.com> wrote:
>>> Greetings,
>>>
>>> I have a fix for a recent (very rare) Thread-SMR related test failure.
>>>
>>> Since the fix is related to the ErrorHandling tests and affects hs_err_pid
>>> file generation, this code review is being sent to both the Runtime and
>>> the Serviceability teams. Please make sure you reply-all to any responses
>>> so we have complete review threads on both aliases.
>>>
>>> Bug URL: https://bugs.openjdk.java.net/browse/JDK-8205195
>>>
>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8205195-webrev/0-for-jdk-jdk/
>>>
>>> The bug itself contains analysis about the root cause of the bug and
>>> the comment updates to the code describe the no win scenario that the
>>> hs_err_pid file generation code is in. Of course, I also have a comment
>>> where I was able to harden the ErrorHandling tests. I did manage to
>>> resist the urge to mention the "Kobayashi Maru" [1] in the new comments.
>>>
>>> Testing: Mach5 builds-tier1,jdk-tier1,jdk-tier2,hs-tier1,hs-tier2,hs-tier3
>>>           on the usual Oracle platforms.
>>>
>>> Thanks, in advance, for any comments, questions or suggestions.
>>>
>>> Dan
>>>
>>> [1] https://www.urbandictionary.com/define.php?term=Kobayashi%20Maru
>>>
>
>


