RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Nov 18 02:01:18 UTC 2014
Ivan,
Please coordinate with Staffan Larsen about when he is planning to
take this week's snapshot of JDK9-hs-rt (RT_Baseline). Please push
your fix after Staffan's snapshot so we can have a week of soak
time for this version of the fix...
Dan
On 11/17/14 6:50 PM, David Holmes wrote:
> Hi Ivan,
>
> On 17/11/2014 7:00 PM, Ivan Gerasimov wrote:
>> Thanks David!
>>
>> On 17.11.2014 9:40, David Holmes wrote:
>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>> Thank you Daniel!
>>>>
>>>> Please find the updated webrev with your suggestions incorporated
>>>> here:
>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>
>>>> Concerning the thread priority: If the application is of
>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be
>>>> only 10
>>>> (of maximum 31).
>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>>>
>>>>
>>>>
>>>>
>>>> And if the process is HIGH_PRIORITY_CLASS, then the tread with the
>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>
>>>> I believe, it should not be too much, and the machine will not become
>>>> busy with only those closing threads.
>>>> However, I hope it would be enough to make them complete faster than
>>>> other threads of the NORMAL priority level withing the same
>>>> application.
>>>
>>> I don't think this is necessary or desirable. Under normal usage we're
>>> giving priority to exiting threads and that may disrupt the usual
>>> scheduling patterns that applications see. You may posit that it is
>>> "harmless" but we can't say that for sure. Nor can we actually know
>>> that this will help with this particular bug. I would not add in this
>>> new code.
>>>
>>
>> There are two places where I put adjusting the thread's priority:
>>
>> 1) We've the array of handles filled up.
>>
>> If we're found in this code branch, it'll mean that unfortunately we've
>> already got broken exit pattern, because the current thread has to do a
>> blocking call, having the ownership of a critical section.
>> The full array of handles means that many threads are exiting at that
>> time, thus all the threads that are starting to exit after the current
>> one will block at the attempt to grab ownership of the critical section.
>>
>> Raising the priority of one thread that had already reached
>> _endthreadex(), seems appropriate to me in such a situation, because it
>> helps shorten the period of time when the threads remain blocked.
>>
>> Choosing the oldest exiting thread ensures that the period of time when
>> the priority of one thread is higher is the smallest possible.
>>
>> 2) The process exit branch.
>>
>> That's the main part of the fix -- here we make the process to wait for
>> all the threads having called _endthreadex() to complete, at the same
>> time preventing any other threads from starting the exiting procedure.
>> The execution flow is already changed here (I don't want to say
>> disrupted, because it was meant to fix the issue).
>>
>> All running threads are about to be terminated soon by ending the
>> process, so raising the priority of some of the threads should not have
>> any bad impact on the program flow.
>> Instead, it may make the time the process has to wait before calling
>> exit() shorter.
>>
>>
>> I can surely remove that playing with the threads' priority, as it's not
>> the essential part of the fix.
>> However, I think it's a useful hint to the scheduler, which can improve
>> things in some situations, and I'm not really sure how it can harm.
>
> Okay. You've convinced me. I'm okay with the priority changes to try
> to minimize the exit time blocking.
>
> Thanks,
> David
>
>>
>> Sincerely yours,
>> Ivan
>>
>>
>>> David
>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>>
>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>> Hello!
>>>>>>
>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>> in some test environments:
>>>>>> -----------
>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>> -----------
>>>>>>
>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>
>>>>>>
>>>>>> To address the issue it is proposed to do three things:
>>>>>> 1) increase the timeout for debug builds,
>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>> 3) rise the priority of the exiting threads, if we need to wait for
>>>>>> them.
>>>>>>
>>>>>> Would you please help review the fix?
>>>>>>
>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>>
>>>>> src/os/windows/vm/os_windows.cpp
>>>>>
>>>>> line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>> Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>> Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>>>>
>>>>> That uses the smaller value for only one build config (PRODUCT).
>>>>>
>>>>> line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000)
>>>>> DEBUG_ONLY(4000)
>>>>> /*1 sec in product, 4 sec in debug*/
>>>>> Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>> Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>>>> Please add spaces between the comment delimiters and the comment
>>>>> text.
>>>>>
>>>>> That uses the smaller timeout for only one build config
>>>>> (PRODUCT).
>>>>>
>>>>> line 3836 // Rise the priority...
>>>>> Typo: 'Rise' -> 'Raise'
>>>>>
>>>>> About the general idea of raising the exiting thread's priority,
>>>>> if the exiting thread is looping in some Win* OS code after this
>>>>> point, will raising the priority make the machine unusable?
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>> testset. No failures.
>>>>>>
>>>>>> Sincerely yours,
>>>>>> Ivan
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
More information about the serviceability-dev
mailing list