<AWT Dev> [8] Review request for 7186109: Simplify lock machinery for PostEventQueue & EventQueue

Oleg Pekhovskiy oleg.pekhovskiy at oracle.com
Wed Sep 12 11:24:09 PDT 2012


Hi Peter,

I understood the case you described. That case is possible.

But... my task for now is to preserve the functionality we had before.
That is: Toolkit events posted BEFORE any call of EventQueue.postEvent() 
must go FIRST.
And we have a special test for that.

Thanks,
Oleg

9/12/2012 9:46 AM, Peter Levart wrote:
> Hi Oleg,
>
> The reasoning in my previous message, that I made just before going to bed,
> was not entirely correct. I made a mistake which was obvious in the morning.
> But even if .notifyAll() is correctly interpreted, the results are similar.
> So, once again:
>
> Imagine two threads (t1, t2) calling concurrently EventQueue.postEvent() -
> each with it's own event (e1, e2):
>
> It can happen that t1:
>
> - acquires the monitor in PostEventQueue 1st,
> - establishes the flushThread variable,
> - takes 1st snapshot of events,
> - releases the monitor,
> - flushes 1st snapshot of events,
> - re-acquires the monitor,
> - clears the flushThread variable,
> - broadcasts via "notifyAll()",
> - releases the monitor, ...
>
> when t2 that was waiting, wakes up:
>
> - acquires the monitor
> - establishes the flushThread variable,
> - takes a 2nd snapshot of events,
> - releases the monitor,
> - flushes 2nd snapshot of events,
> - re-acquires the monitor,
> - clears the flushThread variable,
> - broadcasts via "notifyAll()",
> - releases the monitor,
> - posts e2 to the EventQueue, ...
>
> and only after that t1 gets a chance to:
>
> - post e1 to the EventQueue
>
>
> The order of events that end up in EventQueue would still be as follows:
>
> 1st snapshot (taken by t1) of toolkit events,
> 2nd snapshot (taken by t2) of toolkit events,
> e2 (posted by t2),
> e1 (posted by t1)
>
> where "2nd snapshot" are toolkit events that were "born" after e1.
>
> Probability of such a scenario is very low, but a higher probability is that
> at least some events from 2nd snapshot get inserted into EventQueue before e1.
>
> The point I was trying to make is that as it is done currently, events that
> end up in the EventQueue are only roughly ordered according to the time at
> which they are posted to the queues by different threads. Ordering among
> threads is not specified (the events don't have a timestamp or a sequence
> allocated) Ordering is only maintained among events posted from the same
> thread.
>
> Different approaches are only more or less fair as to the ordering by "real"
> time.
>
> Regards, Peter
>
> On Wednesday, September 12, 2012 12:02:17 AM Peter Levart wrote:
>> Hi Oleg,
>>
>> On Tuesday, September 11, 2012 04:03:01 PM Oleg Pekhovskiy wrote:
>>> Hi Peter,
>>>
>>> please, see my comments below...
>>>
>>> Thanks,
>>> Oleg
>>>
>>> 9/11/2012 9:42 AM, Peter Levart wrote:
>>>> On Tuesday, September 11, 2012 01:25:39 AM Oleg Pekhovskiy wrote:
>>>>> Hi Peter,
>>>>>
>>>>> your idea might work if drainTo() is used before while-cycle,
>>>>> otherwise the order of events could be broken sometimes.
>>>> I don't know how. The flush() is synchronized! No two threads can
>>>> execute
>>>> while-poll at the same time. And LinkedBlockingQueue is used as FIFO, so
>>>> it does not matter if toolkit thread posts any events concurrently - the
>>>> order of events is the same.
>>> Imagine the situation when flushing caused by EventQueue.postEvent()
>>> (called from any thread) is in progress.
>>> Toolkit thread calls PostEventQueue.postEvent() while poll() is being
>>> calling over and over again.
>>> As a result the message posted by Toolkit thread would be posted to
>>> EventQueue within current flushing cycle.
>>> That would happen before posting of the event that caused flushing (see
>>> EventQueue.postEvent()) and would violate the order of messages.
>> There's no absolute guarantee anyway since there's allways a window between
>> SunToolkit.flushPendingEvents() and acquire-ing a pushPopLock where anything
>> can happen...
>>
>> Imagine two threads (t1, t2) calling concurrently EventQueue.postEvent() -
>> each with it's own event (e1, e2):
>>
>> It can happen that t1:
>>
>> - acquires the monitor in PostEventQueue 1st,
>> - establishes the flushThread variable,
>> - takes 1st snapshot of events,
>> - releases the monitor,
>> - flushes 1st snapshot of events,
>> - re-acquires the monitor,
>> - clears the flushThread variable,
>> - broadcasts via "notifyAll()" which temporarily releases the monitor, ...
>>
>> when t2 that was waiting, wakes up:
>>
>> - acquires the monitor
>> - establishes the flushThread variable,
>> - takes a 2nd snapshot of events,
>> - releases the monitor,
>> - flushes 2nd snapshot of events,
>> - re-acquires the monitor,
>> - clears the flushThread variable,
>> - broadcasts via "notifyAll()" which temporarily releases the monitor,
>> - reacquires the monitor,
>> - releases the monitor
>> - posts e2 to the EventQueue, ...
>>
>> and only after that t1 gets a chance to:
>>
>> - re-acquire the monitor
>> - release the monitor
>> - post e1 to the EventQueue
>>
>>
>> The order of events that end up in EventQueue is then as follows:
>>
>> 1st snapshot (taken by t1) of toolkit events,
>> 2nd snapshot (taken by t2) of toolkit events,
>> e2 (posted by t2),
>> e1 (posted by t1)
>>
>> where "2nd snapshot" are toolkit events that were "born" after e1.
>>
>>
>> Nobody guarantees that the event posted by EventQueue.postEvent will be the
>> 1st event to arrive into the EventQueue after the snapshot of toolkit events
>> taken by the same thread at just the same time.
>>
>> This order could only be maintained if SunToolkit.flushPendingEvents() was
>> allways called as part of the critical section in the EventQueue guarded by
>> pushPopLock. But then this would have to be guaranteed allways or there is a
>> chance of deadlock.
>>
>> (It's true that flushing is not done while holding a monitor on
>> PostEventQueue any more, but flushingThread variable and a wait loop at the
>> begginig of flush() actualy constitue a self-made mutex lock which guards
>> the section containing flushing loop and can equally so participate in a
>> dead-lock situation.)
>>
>> Since SunToolkit.flushPendingEvents() is public, anyone can call it, so this
>> is dangerous...
>>
>> Regards, Peter
>>
>> P.S. I'm just thinking loud:
>>
>> What about a scheme where there was a special "copying" thread employed for
>> transfering events from PostEventQueue to EventQueue? No
>> SunToolkit.flushPendingEvents() method is needed any more then.
>>
>>>>> Also, usage of LinkedBlockingQueue could lead to performance decrease
>>>> Internally it uses ReentrantLock, which in flush while-poll loop is
>>>> acquired once per poll. Uncontended acquire is just a CAS. I don't think
>>>> that in this context (GUI events) it presents any difference. So any
>>>> approach is good-enough.>
>>>>
>>>>> (especially with drainTo()).
>>>>>
>>>>> The class has more complicated logic entailing pitfalls in future.
>>>> That might be true:
>>>>
>>>> http://stackoverflow.com/questions/12349881/why-linkedblockingqueuepoll-
>>>> ma
>>>> y-hang-up
>>> I'm not denying that CAS in most cases is faster than locking. But if
>>> you look at the implementation of LinkedBlockingQueue you'll see some
>>> unnecessary things (for our case) as this class is quite common.
>>>
>>>>> Thanks,
>>>>> Oleg
>>>> Regards, Peter
>>>>
>>>>> 08.09.2012 21:39, Peter Levart wrote:
>>>>>> Hi Oleg,
>>>>>>
>>>>>> I still think that there is room for simplification. Now that the
>>>>>> flush
>>>>>> can be synchronized again (because EventQueue.detatchDispatchThread is
>>>>>> not calling SunToolkit.PostEventQueue's synchronized methods while
>>>>>> holding pushPopLock), we just have to make sure that toolkit thread is
>>>>>> not blocked by flushing.
>>>>>>
>>>>>> Here's a simplified PostEventQueue that does that:
>>>>>>
>>>>>> class PostEventQueue {
>>>>>>
>>>>>>       private final Queue<AWTEvent>   queue = new
>>>>>>       LinkedBlockingQueue<>();
>>>>>>       //
>>>>>>       unbounded private final EventQueue eventQueue;
>>>>>>
>>>>>>       private boolean isFlushing;
>>>>>>
>>>>>>       PostEventQueue(EventQueue eq) {
>>>>>>
>>>>>>          eventQueue = eq;
>>>>>>
>>>>>>       }
>>>>>>
>>>>>>       public synchronized void flush() {
>>>>>>
>>>>>>          if (isFlushing)
>>>>>>
>>>>>>             return;
>>>>>>
>>>>>>          isFlushing = true;
>>>>>>          try {
>>>>>>
>>>>>>             AWTEvent event;
>>>>>>             while ((event = queue.poll()) != null)
>>>>>>
>>>>>>                eventQueue.postEvent(event);
>>>>>>
>>>>>>          }
>>>>>>          finally {
>>>>>>
>>>>>>             isFlushing = false;
>>>>>>
>>>>>>          }
>>>>>>
>>>>>>       }
>>>>>>
>>>>>>       /*
>>>>>>       * Enqueue an AWTEvent to be posted to the Java EventQueue.
>>>>>>       */
>>>>>>       void postEvent(AWTEvent event) {
>>>>>>
>>>>>>          queue.offer(event);
>>>>>>          SunToolkit.wakeupEventQueue(eventQueue, event.getSource() ==
>>>>>>          AWTAutoShutdown.getInstance());>
>>>>>>
>>>>>>       }
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>> This implementation also finishes the flush in the presence of
>>>>>> interrupts...
>>>>>>
>>>>>> Peter
>>>>>>
>>>>>> On Saturday, September 08, 2012 04:09:40 AM Oleg Pekhovskiy wrote:
>>>>>>> Artem, Anthony, David,
>>>>>>>
>>>>>>> please review the next version of fix:
>>>>>>> http://cr.openjdk.java.net/~bagiras/8/7186109.4/
>>>>>>>
>>>>>>> What's new in comparison with the previous version:
>>>>>>>
>>>>>>> 1. Removed "isThreadLocalFlushing" and "isFlashing", created
>>>>>>> "flushThread" instead
>>>>>>> (stores the thread that currently performs flushing).
>>>>>>> 2. Implemented both recursion and multi-thread block on "flushThread"
>>>>>>> only.
>>>>>>> 3. Added Thread.interrupt() to the catch block as outside we would
>>>>>>> like
>>>>>>> to know what happened in PostEventQueue.flush().
>>>>>>> We are not able to rethrow the exception because we couldn't change
>>>>>>> the
>>>>>>> signature (by adding "throwns")
>>>>>>> for all related methods (e.g. EventQueue.postEvent()).
>>>>>>>
>>>>>>> The fix successfully passed
>>>>>>> "closed/java/awt/EventQueue/PostEventOrderingTest.java" test.
>>>>>>>
>>>>>>> IMHO, the code became clearer.
>>>>>>>
>>>>>>> Looking forward to your comments!
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Oleg
>>>>>>>
>>>>>>> 8/30/2012 9:19 PM, Oleg Pekhovskiy wrote:
>>>>>>>> There are also other 2 methods that will require 'throws
>>>>>>>> InterruptedException' or try-catch:
>>>>>>>> 1. EventQueue.postEvent()
>>>>>>>> 2. EventQueue.removeSourceEvents()
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Oleg
>>>>>>>>
>>>>>>>> 8/30/2012 9:01 PM, Oleg Pekhovskiy wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I got another idea preparing the next version of fix.
>>>>>>>>>
>>>>>>>>> Previously we didn't catch InterruptedException and stack unwinding
>>>>>>>>> took place right up to
>>>>>>>>> try-catch section in EventDispatchThread.pumpOneEventForFilters().
>>>>>>>>>
>>>>>>>>> So seems like it would be correct not eating that exception in
>>>>>>>>> PostEventQueue.flush()
>>>>>>>>> but just check the state using isInterrupted() method and add
>>>>>>>>> 'throws
>>>>>>>>> InterruptedException'
>>>>>>>>> to PostEventQueue.flush() and SunToolkit.flushPendingEvents()
>>>>>>>>> methods.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Oleg
>>>>>>>>>
>>>>>>>>> 8/30/2012 5:20 PM, Anthony Petrov wrote:
>>>>>>>>>> I see. After giving it more thought I don't see an easy solution
>>>>>>>>>> for
>>>>>>>>>> this issue, too. As long as we guarantee that the EDT is recreated
>>>>>>>>>> should more events be posted, I don't see a big problem with this.
>>>>>>>>>> So let's stick with the "Minimize..." approach then.
>>>>>>>>>>
>>>>>>>>>> On 08/30/12 00:18, Oleg Pekhovskiy wrote:
>>>>>>>>>>> Hi Anthony,
>>>>>>>>>>>
>>>>>>>>>>> I see your concerns.
>>>>>>>>>>>
>>>>>>>>>>> As PostEventQueue.flush() method left unsynchronized,
>>>>>>>>>>> we potentially could return PostEventQueue.noEvents()
>>>>>>>>>>> and return check in EventQueue.detachDispatchThread()
>>>>>>>>>>> back to the condition.
>>>>>>>>>>> But is could increase the possibility of deadlock in future
>>>>>>>>>>> (with PostEventQueue&   pushPopLock).
>>>>>>>>>>>
>>>>>>>>>>> Artem, what do you think?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Oleg
>>>>>>>>>>>
>>>>>>>>>>> 29.08.2012 15:22, Anthony Petrov wrote:
>>>>>>>>>>>> On 8/29/2012 3:08 PM, Anthony Petrov wrote:
>>>>>>>>>>>>> Hi Oleg,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm still concerned about the following:
>>>>>>>>>>>>>
>>>>>>>>>>>>> detachDispatchThread()
>>>>>>>>>>>>> {
>>>>>>>>>>>>> flush();
>>>>>>>>>>>>> lock();
>>>>>>>>>>>>> // possibly detach
>>>>>>>>>>>>> unlock();
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>> at EventQueue.java. What if an even get posted to the queue
>>>>>>>>>>>>> after
>>>>>>>>>>>>> the
>>>>>>>>>>>> A typo: s/even get/event gets/.
>>>>>>>>>>>>
>>>>>>>>>>>>> flush() returns but before we even acquired the lock? We may
>>>>>>>>>>>>> still
>>>>>>>>>>>>> end up with a situation when we detach the thread while in fact
>>>>>>>>>>>>> there
>>>>>>>>>>>>> are some pending events present, which actually contradicts the
>>>>>>>>>>>>> current logic of the detach() method. I see that you say
>>>>>>>>>>>>> "Minimize
>>>>>>>>>>>>> discard possibility" in the comment instead of "Prevent ...",
>>>>>>>>>>>>> but
>>>>>>>>>>>>> I
>>>>>>>>>>>>> feel uncomfortable with this actually.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What exactly prevents us from adding some synchronization to
>>>>>>>>>>>>> ensure
>>>>>>>>>>>>> that the detaching only happens when there's really no pending
>>>>>>>>>>>>> events?
>>>>>>>>>>>>>
>>>>>>>>>>>>> SunToolkit.java:
>>>>>>>>>>>>>> 2120 Boolean b = isThreadLocalFlushing.get();
>>>>>>>>>>>>>> 2121 if (b != null&&   b) {
>>>>>>>>>>>>>> 2122 return;
>>>>>>>>>>>>>> 2123 }
>>>>>>>>>>>>>> 2124 2125 isThreadLocalFlushing.set(true);
>>>>>>>>>>>>>> 2126 try {
>>>>>>>>>>>>> How does access to the isThreadLocalFlushing synchronized? What
>>>>>>>>>>>>> happens if the flush() is being invoked from two different
>>>>>>>>>>>>> threads
>>>>>>>>>>>>> for the same post event queue? Why do we have two "isFlushing"
>>>>>>>>>>>>> flags?
>>>>>>>>>>>>> Can we collapse them into one? Why is the isFlushing set/reset
>>>>>>>>>>>>> in
>>>>>>>>>>>>> two
>>>>>>>>>>>>> disjunct synchronized(){} blocks?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Overall, I find the current synchronization scheme in flush()
>>>>>>>>>>>>> very,
>>>>>>>>>>>>> *very* (and I mean it) confusing. Can we simplify it somehow?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/28/2012 6:33 PM, Oleg Pekhovskiy wrote:
>>>>>>>>>>>>>> Hi Artem, Anthony,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thank you for your proposals!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We with Artem also had off-line discussion,
>>>>>>>>>>>>>> so as a result I prepared improved version of fix:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~bagiras/8/7186109.3/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What was done:
>>>>>>>>>>>>>> 1. EventQueue.detachDispatchThread(): moved
>>>>>>>>>>>>>> SunToolkit.flushPnedingEvents() above the comments and added a
>>>>>>>>>>>>>> separate comment to it.
>>>>>>>>>>>>>> 2. Moved SunToolkitSubclass.flushPendingEvents(AppContext)
>>>>>>>>>>>>>> method to
>>>>>>>>>>>>>> SunToolkit. Deleted SunToolkitSubclass.
>>>>>>>>>>>>>> 3. Moved isFlushingPendingEvents to PostEventQueue with the
>>>>>>>>>>>>>> new
>>>>>>>>>>>>>> name
>>>>>>>>>>>>>> - isThreadLocalFlushing and made it ThreadLocal.
>>>>>>>>>>>>>> 4. Left PostEventQueue.flush() unsynchronized and created
>>>>>>>>>>>>>> wait()-notifyAll() synchronization mechanism to avoid blocking
>>>>>>>>>>>>>> of
>>>>>>>>>>>>>> PostEventQueue.postEvent().
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Looking forward to your comments!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Oleg
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 20.08.2012 20:20, Artem Ananiev wrote:
>>>>>>>>>>>>>>> Hi, Oleg,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> here are a few comments:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. What is the reason of keeping "isFlushingPendingEvents" in
>>>>>>>>>>>>>>> SunToolkit, given that PEQ.flush() is synchronized (and
>>>>>>>>>>>>>>> therefore
>>>>>>>>>>>>>>> serialized) anyway?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. flushPendingEvents(AppContext) may be moved directly to
>>>>>>>>>>>>>>> SunToolkit, so we don't need a separate sun-class for that.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. EQ.java:1035-1040 - this comment is obsolete and must be
>>>>>>>>>>>>>>> replaced by another one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Artem
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 8/17/2012 4:49 PM, Oleg Pekhovskiy wrote:
>>>>>>>>>>>>>>>> Hi!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please review the fix for CR:
>>>>>>>>>>>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7186109
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~bagiras/8/7186109.1/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The following changes were made:
>>>>>>>>>>>>>>>> 1. Removed flushLock from SunToolkit.flushPendingEvent()
>>>>>>>>>>>>>>>> 2. Returned method PostEventQueue.flush() as 'synchronized'
>>>>>>>>>>>>>>>> back
>>>>>>>>>>>>>>>> 3. Added call of SunToolkit.flushPendingEvents() to
>>>>>>>>>>>>>>>> EventQueue.detachDispatchThread(),
>>>>>>>>>>>>>>>> right before pushPopLock.lock()
>>>>>>>>>>>>>>>> 4. Removed !SunToolkit.isPostEventQueueEmpty() check from
>>>>>>>>>>>>>>>> EventQueue.detachDispatchThread()
>>>>>>>>>>>>>>>> 5. Removed SunToolkit.isPostEventQueueEmpty()&
>>>>>>>>>>>>>>>> PostEventQueue.noEvents();
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Oleg
>>>>>>>>>>>>>>>> <http://cr.openjdk.java.net/%7Ebagiras/8/7186109.1/>




More information about the awt-dev mailing list