RFR 8064694: Kitchensink: WaitForMultipleObjects failed in hotspot\src\os\windows\vm\os_windows.cpp: 3844
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Nov 18 02:05:30 UTC 2014
Ivan,
I spoke too soon.
There's a review comment from Markus G that hasn't been addressed.
We need to see if you've convinced Markus in addition to David H.
Dan
P.S.
Look for Markus' reply to David H's e-mail; it not in this
fork of the review thread...
On 11/17/14 7:01 PM, Daniel D. Daugherty wrote:
> Ivan,
>
> Please coordinate with Staffan Larsen about when he is planning to
> take this week's snapshot of JDK9-hs-rt (RT_Baseline). Please push
> your fix after Staffan's snapshot so we can have a week of soak
> time for this version of the fix...
>
> Dan
>
>
> On 11/17/14 6:50 PM, David Holmes wrote:
>> Hi Ivan,
>>
>> On 17/11/2014 7:00 PM, Ivan Gerasimov wrote:
>>> Thanks David!
>>>
>>> On 17.11.2014 9:40, David Holmes wrote:
>>>> On 17/11/2014 7:23 AM, Ivan Gerasimov wrote:
>>>>> Thank you Daniel!
>>>>>
>>>>> Please find the updated webrev with your suggestions incorporated
>>>>> here:
>>>>> http://cr.openjdk.java.net/~igerasim/8064694/1/webrev/
>>>>>
>>>>> Concerning the thread priority: If the application is of
>>>>> NORMAL_PRIORITY_CLASS, then setting the thread's priority level to
>>>>> THREAD_PRIORITY_HIGHEST will result in its priority value to be
>>>>> only 10
>>>>> (of maximum 31).
>>>>> http://msdn.microsoft.com/en-us/library/windows/desktop/ms685100(v=vs.85).aspx
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> And if the process is HIGH_PRIORITY_CLASS, then the tread with the
>>>>> HIGHEST priority level will have priority value == 15 of 31.
>>>>>
>>>>> I believe, it should not be too much, and the machine will not become
>>>>> busy with only those closing threads.
>>>>> However, I hope it would be enough to make them complete faster than
>>>>> other threads of the NORMAL priority level withing the same
>>>>> application.
>>>>
>>>> I don't think this is necessary or desirable. Under normal usage we're
>>>> giving priority to exiting threads and that may disrupt the usual
>>>> scheduling patterns that applications see. You may posit that it is
>>>> "harmless" but we can't say that for sure. Nor can we actually know
>>>> that this will help with this particular bug. I would not add in this
>>>> new code.
>>>>
>>>
>>> There are two places where I put adjusting the thread's priority:
>>>
>>> 1) We've the array of handles filled up.
>>>
>>> If we're found in this code branch, it'll mean that unfortunately we've
>>> already got broken exit pattern, because the current thread has to do a
>>> blocking call, having the ownership of a critical section.
>>> The full array of handles means that many threads are exiting at that
>>> time, thus all the threads that are starting to exit after the current
>>> one will block at the attempt to grab ownership of the critical
>>> section.
>>>
>>> Raising the priority of one thread that had already reached
>>> _endthreadex(), seems appropriate to me in such a situation, because it
>>> helps shorten the period of time when the threads remain blocked.
>>>
>>> Choosing the oldest exiting thread ensures that the period of time when
>>> the priority of one thread is higher is the smallest possible.
>>>
>>> 2) The process exit branch.
>>>
>>> That's the main part of the fix -- here we make the process to wait for
>>> all the threads having called _endthreadex() to complete, at the same
>>> time preventing any other threads from starting the exiting procedure.
>>> The execution flow is already changed here (I don't want to say
>>> disrupted, because it was meant to fix the issue).
>>>
>>> All running threads are about to be terminated soon by ending the
>>> process, so raising the priority of some of the threads should not have
>>> any bad impact on the program flow.
>>> Instead, it may make the time the process has to wait before calling
>>> exit() shorter.
>>>
>>>
>>> I can surely remove that playing with the threads' priority, as it's
>>> not
>>> the essential part of the fix.
>>> However, I think it's a useful hint to the scheduler, which can improve
>>> things in some situations, and I'm not really sure how it can harm.
>>
>> Okay. You've convinced me. I'm okay with the priority changes to try
>> to minimize the exit time blocking.
>>
>> Thanks,
>> David
>>
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>>
>>>> David
>>>>
>>>>> Sincerely yours,
>>>>> Ivan
>>>>>
>>>>>
>>>>> On 15.11.2014 2:22, Daniel D. Daugherty wrote:
>>>>>> On 11/14/14 5:35 AM, Ivan Gerasimov wrote:
>>>>>>> Hello!
>>>>>>>
>>>>>>> The recent fix for JDK-8059533 ((process) Make exiting process wait
>>>>>>> for exiting threads [win]) caused the warning message to be printed
>>>>>>> in some test environments:
>>>>>>> -----------
>>>>>>> os_windows.cpp:3844 is in the newly updated
>>>>>>> os::win32::exit_process_or_thread(Ept what, int exit_code)
>>>>>>> -----------
>>>>>>>
>>>>>>> This has been observed with debug builds on highly loaded systems.
>>>>>>>
>>>>>>>
>>>>>>> To address the issue it is proposed to do three things:
>>>>>>> 1) increase the timeout for debug builds,
>>>>>>> 2) increase the maximum number of the thread handles to be stored,
>>>>>>> 3) rise the priority of the exiting threads, if we need to wait for
>>>>>>> them.
>>>>>>>
>>>>>>> Would you please help review the fix?
>>>>>>>
>>>>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8064694
>>>>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8064694/0/webrev/
>>>>>>
>>>>>> src/os/windows/vm/os_windows.cpp
>>>>>>
>>>>>> line 3784: #define MAX_EXIT_HANDLES NOT_DEBUG(32) DEBUG_ONLY(128)
>>>>>> Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>> Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>>>>>
>>>>>> That uses the smaller value for only one build config (PRODUCT).
>>>>>>
>>>>>> line 3785: #define EXIT_TIMEOUT NOT_DEBUG(1000)
>>>>>> DEBUG_ONLY(4000)
>>>>>> /*1 sec in product, 4 sec in debug*/
>>>>>> Instead of NOT_DEBUG can you use PRODUCT_ONLY?
>>>>>> Instead of DEBUG_ONLY can you used NOT_PRODUCT?
>>>>>> Please add spaces between the comment delimiters and the comment
>>>>>> text.
>>>>>>
>>>>>> That uses the smaller timeout for only one build config
>>>>>> (PRODUCT).
>>>>>>
>>>>>> line 3836 // Rise the priority...
>>>>>> Typo: 'Rise' -> 'Raise'
>>>>>>
>>>>>> About the general idea of raising the exiting thread's priority,
>>>>>> if the exiting thread is looping in some Win* OS code after this
>>>>>> point, will raising the priority make the machine unusable?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> The fix was tested on all available platforms, with the hotspot
>>>>>>> testset. No failures.
>>>>>>>
>>>>>>> Sincerely yours,
>>>>>>> Ivan
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>
>
>
>
More information about the hotspot-runtime-dev
mailing list