<AWT Dev> <Awt Dev> [9] Review Request for 8132664: closed/javax/swing/DataTransfer/DefaultNoDrop/DefaultNoDrop.java locks on Windows

Semyon Sadetsky semyon.sadetsky at oracle.com
Wed Sep 16 07:27:35 UTC 2015


Sergey,  just a reminder.

On 9/8/2015 7:17 PM, Semyon Sadetsky wrote:
>
>
> On 9/8/2015 6:13 PM, Sergey Bylokhov wrote:
>> On 08.09.15 18:02, Semyon Sadetsky wrote:
>>>
>>>
>>> On 9/8/2015 2:06 PM, Sergey Bylokhov wrote:
>>>> On 08.09.15 13:42, Semyon Sadetsky wrote:
>>>>>
>>>>>
>>>>> On 9/8/2015 1:31 PM, Sergey Bylokhov wrote:
>>>>>> On 08.09.15 13:15, Semyon Sadetsky wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 9/8/2015 12:33 PM, Sergey Bylokhov wrote:
>>>>>>>> On 08.09.15 12:04, Semyon Sadetsky wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/7/2015 6:56 PM, Sergey Bylokhov wrote:
>>>>>>>>>> On 07.09.15 16:41, Semyon Sadetsky wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9/7/2015 3:28 PM, Sergey Bylokhov wrote:
>>>>>>>>>>>> On 03.09.15 17:43, Semyon Sadetsky wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/5/2015 2:33 PM, Sergey Bylokhov wrote:
>>>>>>>>>>>>>> On 05.08.15 14:20, Semyon Sadetsky wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 8/5/2015 1:39 PM, Sergey Bylokhov wrote:
>>>>>>>>>>>>>>>> On 05.08.15 13:18, Semyon Sadetsky wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 8/5/2015 12:27 PM, Sergey Bylokhov wrote:
>>>>>>>>>>>>>>>>>> On 04.08.15 14:54, Semyon Sadetsky wrote:
>>>>>>>>>>>>>>>>>>> On 8/3/2015 6:05 PM, Sergey Bylokhov wrote:
>>>>>>>>>>>>>>>>>>>> Hi, Semyon
>>>>>>>>>>>>>>>>>>>> Did you try to change dwMilliseconds from INFINITE 
>>>>>>>>>>>>>>>>>>>> to the
>>>>>>>>>>>>>>>>>>>> timeout(10 seconds by default?) which is passed to the
>>>>>>>>>>>>>>>>>>>> method?
>>>>>>>>>>>>>>>>>>>> It does not help? Because even when dnd is not used we
>>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>> not wait event for infinite time.
>>>>>>>>>>>>>>>>>>> It would not help to fix the issue because 10 
>>>>>>>>>>>>>>>>>>> seconds is
>>>>>>>>>>>>>>>>>>> too big
>>>>>>>>>>>>>>>>>>> interval. But for consistency it is not bad to have 
>>>>>>>>>>>>>>>>>>> a time
>>>>>>>>>>>>>>>>>>> limit.
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ssadetsky/8132664/webrev.01/ 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Note that syncNativeQueue is intended to wait until the
>>>>>>>>>>>>>>>>>> native
>>>>>>>>>>>>>>>>>> event queue is flushed or until timeout will expire. So
>>>>>>>>>>>>>>>>>> even if
>>>>>>>>>>>>>>>>>> timeout is expired we collect the native events 
>>>>>>>>>>>>>>>>>> during this
>>>>>>>>>>>>>>>>>> period
>>>>>>>>>>>>>>>>>> of time.Can you double check that the event counter is
>>>>>>>>>>>>>>>>>> incremented
>>>>>>>>>>>>>>>>>> during dnd? I do not know how we block the toolkit 
>>>>>>>>>>>>>>>>>> thread,
>>>>>>>>>>>>>>>>>> probably we create some nested loops which ignore our 
>>>>>>>>>>>>>>>>>> event
>>>>>>>>>>>>>>>>>> posted
>>>>>>>>>>>>>>>>>> from syncNativeQueue, can we change that?
>>>>>>>>>>>>>>>>> Yes, this is an internal secondary loop which waits for
>>>>>>>>>>>>>>>>> mouse
>>>>>>>>>>>>>>>>> release event.
>>>>>>>>>>>>>>>> Can we change the condition and process the sync event in
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> loop?
>>>>>>>>>>>>>>> Why? Will receive all events on the toolkit thread when
>>>>>>>>>>>>>>> doDragDrop
>>>>>>>>>>>>>>> returns.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When how we get dragenter/exit events?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On the platform side they are not events but callbacks 
>>>>>>>>>>>>> which are
>>>>>>>>>>>>> converted into events on the java side and added to the
>>>>>>>>>>>>> queue. So
>>>>>>>>>>>>> they
>>>>>>>>>>>>> will be detected by isEQEmpty() and waitForIdle() will be
>>>>>>>>>>>>> repeated.
>>>>>>>>>>>>
>>>>>>>>>>>> This is not directly related to this fix. Most of our
>>>>>>>>>>>> callbacks/events
>>>>>>>>>>>> posts events to the EQ, but there is a possibility for a lag
>>>>>>>>>>>> between
>>>>>>>>>>>> callback and a post events, and this is why the
>>>>>>>>>>>> syncNativeQueue was
>>>>>>>>>>>> added.
>>>>>>>>>>>> Let's return to the beginning: the syncNativeQueue method
>>>>>>>>>>>> according to
>>>>>>>>>>>> its specifications should try to flush the native system,
>>>>>>>>>>>> track the
>>>>>>>>>>>> native activity and should not wait more than timeout 
>>>>>>>>>>>> parameter.
>>>>>>>>>>>
>>>>>>>>>>> It is not possible to send sync events to the DD loop. Only
>>>>>>>>>>> mouse/keyboard event can return control to the app. I would not
>>>>>>>>>>> emulate
>>>>>>>>>>> them in the syncnative.
>>>>>>>>>>> Always waiting the max time is not good. Maybe wait a bit and
>>>>>>>>>>> check if
>>>>>>>>>>> no callbacks happened?
>>>>>>>>>>
>>>>>>>>>> It will be good to check that callbacks were occurred in the
>>>>>>>>>> same way
>>>>>>>>>> as native events. This can be done in this fix. As for question
>>>>>>>>>> about
>>>>>>>>>> wait or not when timeout is over, is a shared problem, which 
>>>>>>>>>> affect
>>>>>>>>>> the osx also. It can be fixed separately. I guess the logic in
>>>>>>>>>> realSync can be tweaked and the default timeout(10*4 sec) is 
>>>>>>>>>> not a
>>>>>>>>>> optimal as a default value, 2 or 3 seconds without any events
>>>>>>>>>> should
>>>>>>>>>> be enough.
>>>>>>>>> Could you explain your logic?
>>>>>>>>
>>>>>>>> My logic is to follow the specification.
>>>>>>> I cannot accept this as an answer because it is an abstract
>>>>>>> declaration.
>>>>>>> Please, try to be more specific.
>>>>>>
>>>>>> Just implement as I described it above.
>>>>>>
>>>>> Sorry?
>>>>
>>>> Just change implementation of syncNativeQueue to don't block forever
>>>> if timeout parameter is positive. And wait forever in case of negative
>>>> timeout. This is enough to fix for now, now need to add some
>>>> workarounds for dnd, since the realsync itself should be reworked.
>>>>
>>> Sergey, I'm a bit tired of this level of explanations.
>>
>> Explanations is simple just implement the method according 
>> specification, since the problem is general and is not specific it 
>> should be fixed on the shared level.
>>
> An internal method javadoc can be changed any time.
>>> What do you mean under "for now" and why the realsync itself should be
>>> reworked?
>>
>> For now means for now. There are a few fixes which were pushed to the 
>> jdk9 in to realsync area, as well as a few fixes to the tests 
>> (regression/sqe), and we already got a bunch of bugs. And "for now" 
>> we should not implement any workarounds since we know that the shared 
>> code contains errors.
> Can you give me the bug number you are talking about?
>>
>>> The solution fixes the problem. Test is passed.
>>
>>
>> Yes, the test still fail after the fix, the reason will be not a lock 
>> but a slow execution.
>>
> It is not true. I just rerun the test with updated jdk9 client 
> codebase under windows and it passed in 17 seconds:
>
> ----------messages:(3/117)----------
> command: main DefaultNoDrop
> reason: User specified action: run main DefaultNoDrop
> elapsed time (seconds): 17.339
> ----------System.out:(8/177)----------
> invoking: onEDT1
> invoking: onEDT10
> invoking: onBackgroundThread20
> invoking: onEDT30
> invoking: onEDT40
> invoking: onBackgroundThread50
> invoking: onEDT60
> invoking: onEDT70
> ----------System.err:(1/16)----------
> STATUS:Passed.
>
>>
>> If you cannot explain in
>>> words why it does not work for you, maybe you can provide a test case
>>> which fails with the proposed solution?
>>>>
>>>>>>
>>>>>>> I can only make guesses what specification section you mean. For
>>>>>>> example, the next is waitForIdle() method's spec:
>>>>>>>      /**
>>>>>>>       * Waits until all events currently on the event queue have 
>>>>>>> been
>>>>>>> processed.
>>>>>>>       * @throws  IllegalThreadStateException if called on the AWT
>>>>>>> event
>>>>>>> dispatching thread
>>>>>>>       */
>>>>>>> What does specifically contradicts to this specification?
>>>>>>
>>>>>> Check the documentation of the realSync method.
>>>>>>
>>>>> realSync is an internal method. It is not a part of the 
>>>>> specification.
>>>>> It has this description of the timeout parameter:
>>>>>       * @param timeout the maximum time to wait in milliseconds,
>>>>> negative means "forever".
>>>>>
>>>>> That is as it used in the fix. No?
>>>>>>
>>>>>>>>
>>>>>>>>> Why to wait short wait period doesn't work
>>>>>>>>> if we are in DD and there are no callbacks?
>>>>>>>>
>>>>>>>> "No callbacks" during what period? My point: this period should be
>>>>>>>> passed to this method.
>>>>>>> Period is passed to the method.
>>>>>>> Java_sun_awt_windows_WToolkit_syncNativeQueue(JNIEnv *env, jobject
>>>>>>> self,
>>>>>>> jlong timeout)
>>>>>>> It's meaning is a _maximum_ time the method should wait. Before the
>>>>>>> fix
>>>>>>> it was ignored and the method never returned.
>>>>>>> With the fix the timeout is taken into account but this aspect 
>>>>>>> of the
>>>>>>> fix is not related to the original issue and it cannot fix the 
>>>>>>> issue
>>>>>>> alone.
>>>>>>> Or you meant another method? Again, please be more specific 
>>>>>>> otherwise
>>>>>>> this discussion can be infinite.
>>>>>>>>
>>>>>>>>> No callbacks means no
>>>>>>>>> activity and in analogous situations the method returns true.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> I see a clear disadvantage: even with 2-3 second wait period the
>>>>>>>>> test
>>>>>>>>> will executes much longer than on Linux platform because D&D is
>>>>>>>>> triggered every 3 mouse drags . Also other D&D tests suites will
>>>>>>>>> demand
>>>>>>>>> much more time to pass the windows platform. Users can report
>>>>>>>>> bugs if
>>>>>>>>> AWT Robot starts to produce multi-seconds delay after drag 
>>>>>>>>> events.
>>>>>>>>> Maybe you can provide an example scenario which demonstrates that
>>>>>>>>> it is
>>>>>>>>> really necessary to pay this price?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Event counter is not changed during toolkit thread
>>>>>>>>>>>>>>>>> blocking of
>>>>>>>>>>>>>>>>> cause. Not sure that we can change that. But since 
>>>>>>>>>>>>>>>>> toolkit
>>>>>>>>>>>>>>>>> queue is
>>>>>>>>>>>>>>>>> blocked we can assume that we are synced.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The timeout value is maintained on the shared level and
>>>>>>>>>>>>>>>>>> actually
>>>>>>>>>>>>>>>>>> this test will fail with timeout on osx as well
>>>>>>>>>>>>>>>>>> JDK-7185258.
>>>>>>>>>>>>>>>>>> The
>>>>>>>>>>>>>>>>>> test will fail even if the time out will be changed to
>>>>>>>>>>>>>>>>>> ±100ms,
>>>>>>>>>>>>>>>>>> because it call realsync on each pixel move, ±200 times.
>>>>>>>>>>>>>>>>>> This
>>>>>>>>>>>>>>>>>> problem can be fixed in different ways like tweak of
>>>>>>>>>>>>>>>>>> timeouts and
>>>>>>>>>>>>>>>>>> numbers of iterations, or changing the test to call 
>>>>>>>>>>>>>>>>>> w4idle
>>>>>>>>>>>>>>>>>> only
>>>>>>>>>>>>>>>>>> after the latest move(actually I think this method 
>>>>>>>>>>>>>>>>>> can be
>>>>>>>>>>>>>>>>>> moved to
>>>>>>>>>>>>>>>>>> the robot class).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So I still think that the right fix for a deadlock,
>>>>>>>>>>>>>>>>>> which is
>>>>>>>>>>>>>>>>>> subject of this bug, is simply change the
>>>>>>>>>>>>>>>>>> syncNativeQueue and
>>>>>>>>>>>>>>>>>> waits using a timeout if it is positive, and waits
>>>>>>>>>>>>>>>>>> forever if
>>>>>>>>>>>>>>>>>> timeout is negative (the same bug on osx JDK-8080504).
>>>>>>>>>>>>>>>>> I'm not sure that waiting brings any value. What do you
>>>>>>>>>>>>>>>>> propose to
>>>>>>>>>>>>>>>>> return if it timed out? The event counter will not be
>>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>>> regardless of waiting.
>>>>>>>>>>>>>>>> But it should be changed, because we get native events
>>>>>>>>>>>>>>>> from the
>>>>>>>>>>>>>>>> system during dnd and in each such callback we should 
>>>>>>>>>>>>>>>> update
>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>> counter. If callbacks were not called=>counter was not
>>>>>>>>>>>>>>>> updated
>>>>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>> sync assume that currently we do not process events. If
>>>>>>>>>>>>>>>> callbacks
>>>>>>>>>>>>>>>> were called then sync assume that we have events in the
>>>>>>>>>>>>>>>> native
>>>>>>>>>>>>>>>> queue
>>>>>>>>>>>>>>>> and should try to sync again on the next iteration.
>>>>>>>>>>>>>>> No. Events are not processed while toolkit is blocked in
>>>>>>>>>>>>>>> doDragDrop(). The application state is frozen on that 
>>>>>>>>>>>>>>> period.
>>>>>>>>>>>>>>> That is
>>>>>>>>>>>>>>> Windows approach.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> With such waiting the test will fail because of either 
>>>>>>>>>>>>>>>>> jtreg
>>>>>>>>>>>>>>>>> timeout
>>>>>>>>>>>>>>>> Default timeout is 120 seconds for everything, the test
>>>>>>>>>>>>>>>> try to
>>>>>>>>>>>>>>>> sync
>>>>>>>>>>>>>>>> the queue 200 of times after each move, so yes it can fail
>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> timeout even if spend in nativesync 200 ms, the possible
>>>>>>>>>>>>>>>> solutions
>>>>>>>>>>>>>>>> were in my previous email.
>>>>>>>>>>>>>>>>> either InfiniteLoop exception.
>>>>>>>>>>>>>>>> This exception will be disabled by default lately in jdk9
>>>>>>>>>>>>>>>> timeframe,
>>>>>>>>>>>>>>>> right now it helps to find some related issues.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 03.08.15 17:26, Semyon Sadetsky wrote:
>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review fix for JDK9:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8132664
>>>>>>>>>>>>>>>>>>>>> webrev:
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~ssadetsky/8132664/webrev.00/ 
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> DoDragDrop() is blocking, so upon drag operation is
>>>>>>>>>>>>>>>>>>>>> triggered
>>>>>>>>>>>>>>>>>>>>> the toolkit thread is blocked and the WM_AWT_WAIT
>>>>>>>>>>>>>>>>>>>>> cannot be
>>>>>>>>>>>>>>>>>>>>> processed which in its turn blocks the AWT robot.
>>>>>>>>>>>>>>>>>>>>> The solution is to escape AWT robot waiting in
>>>>>>>>>>>>>>>>>>>>> syncNativeQueue() if drag operation is in progress.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --Semyon
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>



More information about the awt-dev mailing list