RFR 8145127: VM warning: WaitForMultipleObjects timed out (0) ...

David Holmes david.holmes at oracle.com
Tue Dec 15 12:07:21 UTC 2015


On 15/12/2015 10:00 PM, Ivan Gerasimov wrote:
>
>
> On 15.12.2015 9:31, David Holmes wrote:
>> I really wish we had some view inside windows to see exactly what is
>> going wrong here. :(
>>
>
> Yes, so do I!
> It would be really helpful to know exactly how this race bug is
> "implemented".
> Or, at least, how to deal with it to avoid the return code replacement
> for sure.
>
>> On 15/12/2015 12:09 AM, Ivan Gerasimov wrote:
>>> Hello!
>>>
>>> There was a timeout observed in os_windows.cpp at the line
>>> 3945           res = WaitForMultipleObjects(MAXIMUM_WAIT_OBJECTS,
>>> handles, FALSE, EXIT_TIMEOUT);
>>>
>>> This means there were more than 64 threads exiting at the same time and
>>> none of the first 64 threads could make any progress during 5 minutes.
>>> Sigh.
>>>
>>> To address this issue I suggest two things:
>>> 1)
>>> Increase the size of the queue of exiting threads.
>>> We'll still have to wait for only the first 64 threads, if the queue is
>>> exhausted.
>>> But the chances we hit this condition are greatly reduced.
>>>
>>> 2)
>>> Raise process_exiting flag earlier, i.e. before trying to enter the
>>> critical section.
>>> This should decrease the number of threads, contending for a slot in the
>>> 'handles' array during the process exit.
>>>
>>> Additionally, it is proposed to suspend all the threads, but the one
>>> which raised the flag 'process_exiting'.
>>> It would be important in a case, when two threads are about to call
>>> exit() concurrently.
>>> Otherwise, a race could be faced, if the first thread is waiting for all
>>> the registered handles, while the second one skips the critical section
>>> altogether and calls ::exit().
>>>
>>> Build went fine on all platforms.  The JTREG tests from 'hotspot' subset
>>> all pass.
>>>
>>> Would you please help review the proposed fix?
>>>
>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8145127
>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8145127/00/webrev/
>>
>> Can't quite get my head round the whole strategy, but in this loop:
>>
>> 4033       while (pr_ex != curr_id) {
>>
>> pr_ex is never updated!
>>
> It's intentional.
> It's not meant the current thread will ever leave this loop, once enters.
> The first line of the loop body will effectively suspend the current
> thread, so it will never call _endthreadex(), exit() or _exit() itself.
> Just for the extra protection, the SuspendThread() call is wrapped into
> an infinite loop.

while(true) would convey that much more clearly - and perhaps obviate 
the need for pr_ex.

Thanks,
David


> Sincerely yours,
> Ivan
>


More information about the serviceability-dev mailing list