<AWT Dev> [8] Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop

Artem Ananiev artem.ananiev at oracle.com
Thu Jul 4 02:31:25 PDT 2013


On 7/3/2013 9:49 PM, Anthony Petrov wrote:
> On 07/03/2013 08:35 PM, Artem Ananiev wrote:
>>> src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c
>>>>  398     XSync(awt_display, False);
>>
>> OK, got it.
>>
>>> All other usages of the EXEC_WITH_XERROR_HANDLER() pass TRUE to the
>>> RESTORE_XERROR_HANDLER(), and hence call XSync(), too (in
>>> XErrorHandlerUtil.java) with or without the current fix...
>>>
>> I looked to the webrev for 8005607 once again and found XSync() was
>> there before that fix. And you're right, it's required to flush Xlib
>> quene.
>>
>> OK, so I currently don't see a way to improve the current fix. It may be
>> possible to rewrite OGL render queue synchronization, but it's out of
>> scope of this fix.
>
> I agree that this is out of the scope of the current fix. However, the
> very fact of XSync() being invoked w/o holding the AWTLock needs to be
> addressed at some point. So I suggest to file a separate P4 bug to
> investigate this later (probably in a future release).

The lock is actually hold, but by another thread (which then is waiting 
for the flusher thread). As AWT lock is used to prevent access to X 
display simultaneously from multiple threads, not to conform to Java 
memory model, such a pattern seems to be valid. It's not trivial to 
understand, though, I agree.

Thanks,

Artem

> --
> best regards,
> Anthony
>
>>
>> Thanks,
>>
>> Artem
>>
>>> --
>>> best regards,
>>> Anthony
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Artem
>>>>
>>>>> --
>>>>> best regards,
>>>>> Anthony
>>>>>
>>>>> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>>>>>> Hello Anthony,
>>>>>>
>>>>>> Thank you for the review of this fix. I would like to remark that
>>>>>> this
>>>>>> deadlock is a regression of the fix for the bug 8005607, and in the
>>>>>> code
>>>>>> of the file
>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>>>>> before 8005607 fix, where the previous XError handling mechanism not
>>>>>> involving "sun.awt.X11.XErrorHandlerUtil" class was used, native
>>>>>> "XSync()" function was called without acquiring of AWT lock. So a fix
>>>>>> for the current bug with a deadlock just reverted a part of the fix
>>>>>> 8005607 which enforced taking AWT lock from the function
>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c".
>>>>>> Answers to
>>>>>> your questions are provided below.
>>>>>>
>>>>>> 1. "AWT EventQueue" holds AWT lock and waits till "Java2D Queue
>>>>>> Flusher"
>>>>>> thread finishes its job, because in the method
>>>>>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width, final
>>>>>> int
>>>>>> height)" execution of "initSurfaceNow(int width, int height)" is
>>>>>> dispatched to "Java2D Queue Flusher". Before this dispatching in the
>>>>>> method "initSurface" AWT lock is taken by the lines
>>>>>>
>>>>>>      308        OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>>>>>      309        rq.lock();
>>>>>>
>>>>>> and then with held AWT lock "AWT EventQueue" thread starts waiting on
>>>>>> the second lock "sun.java2d.opengl.OGLRenderQueue.flusher" in the
>>>>>> method
>>>>>> "sun.java2d.opengl.OGLRenderQueue.QueueFlusher.flushNow()"
>>>>>>
>>>>>>      181                    wait();
>>>>>>
>>>>>> 2. Yes, I investigated the option of waiting on AWT lock instead of
>>>>>> "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>>>>>> "sun.java2d.opengl.OGLRenderQueue", but this is impossible, because
>>>>>> access to the always running thread "Java2D Queue Flusher" should be
>>>>>> synchronized on some lock other than AWT lock, otherwise there will
>>>>>> be a
>>>>>> performance degradation, because it will be trying to get AWT lock
>>>>>> each
>>>>>> 100 milliseconds. As I understood a possible solution for this
>>>>>> problem
>>>>>> can be not locking on AWT lock before dispatching execution of any
>>>>>> code
>>>>>> to "Java2D Queue Flusher" or complete refactoring of locking
>>>>>> mechanism
>>>>>> in the class "sun.java2d.opengl.OGLRenderQueue". Since the current
>>>>>> bug
>>>>>> blocks SQE from running any tests involving OpenGL and does not
>>>>>> allow to
>>>>>> run any Java GUI application with enabled OpenGL rendering on Linux
>>>>>> OS,
>>>>>> I suppose the variant of refactoring is not acceptable. That is
>>>>>> why as
>>>>>> the most secure solution I decided just to call XSync() from
>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as it was
>>>>>> before the fix for 8005607.
>>>>>>
>>>>>> Thank you,
>>>>>> Anton
>>>>>>
>>>>>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>>>>>> Hi Anton,
>>>>>>>
>>>>>>> I'm not sure if this a good fix since it enabled the GL thread to
>>>>>>> call
>>>>>>> Xlib APIs w/o acquiring the AWTLock. This may not present a problem
>>>>>>> currently since we know exactly when this method is called and that
>>>>>>> another thread is holding the lock and isn't calling other X11
>>>>>>> functions at the moment. But I doubt this knowledge will be widely
>>>>>>> known and remembered in the future, and if another thread starts
>>>>>>> calling X11 routines, we'll get into trouble...
>>>>>>>
>>>>>>> Why would another thread (the AWT EventQueue if I got the problem
>>>>>>> right) hold the AWTLock and wait till the GL thread finishes its
>>>>>>> job?
>>>>>>> I'd assume it should release the lock for the period of waiting.
>>>>>>> This
>>>>>>> would allow the GL thread to acquire the lock and perform the
>>>>>>> XSync()
>>>>>>> call w/o any potential issues. Have you investigated this option?
>>>>>>>
>>>>>>> --
>>>>>>> best regards,
>>>>>>> Anthony
>>>>>>>
>>>>>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Could you please review the following fix for a bug, which consists
>>>>>>>> in a
>>>>>>>> deadlock provoked by concurrency between AWT-EventQueue and Java2D
>>>>>>>> Queue
>>>>>>>> Flusher for taking AWT lock, when OpenGL rendering is enabled.
>>>>>>>>
>>>>>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>>>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>>>>>
>>>>>>>> The fix allows the code from the native function
>>>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" execute
>>>>>>>> all
>>>>>>>> XError handling procedures using "sun.awt.X11.XErrorHandlerUtil"
>>>>>>>> class
>>>>>>>> without acquiring AWT lock. It is the only available solution for
>>>>>>>> this
>>>>>>>> problem, because the current design of
>>>>>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow to take AWT
>>>>>>>> lock
>>>>>>>> in Java2D Queue Flusher thread without reaching a deadlock, since
>>>>>>>> all
>>>>>>>> calls to the method
>>>>>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)"
>>>>>>>> are
>>>>>>>> guarded by AWT lock.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Anton
>>>>>>


More information about the awt-dev mailing list