<AWT Dev> [8] Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop
Anton Litvinov
anton.litvinov at oracle.com
Thu Jul 4 02:47:56 PDT 2013
Hello Artem and Anthony,
Thank you for review of this fix. I would like to inform you that by
request of Anthony a separate new issue about the problem with
synchronization between "sun.java2d.opengl.OGLRenderQueue" instance and
Java2D Queue Flusher thread, whose priority is P4, has been created. The
ID of the new CR is 8019923, and it is linked to the current bug 8015730.
Thank you,
Anton
On 7/4/2013 1:31 PM, Artem Ananiev wrote:
>
> On 7/3/2013 9:49 PM, Anthony Petrov wrote:
>> On 07/03/2013 08:35 PM, Artem Ananiev wrote:
>>>> src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c
>>>>> 398 XSync(awt_display, False);
>>>
>>> OK, got it.
>>>
>>>> All other usages of the EXEC_WITH_XERROR_HANDLER() pass TRUE to the
>>>> RESTORE_XERROR_HANDLER(), and hence call XSync(), too (in
>>>> XErrorHandlerUtil.java) with or without the current fix...
>>>>
>>> I looked to the webrev for 8005607 once again and found XSync() was
>>> there before that fix. And you're right, it's required to flush Xlib
>>> quene.
>>>
>>> OK, so I currently don't see a way to improve the current fix. It
>>> may be
>>> possible to rewrite OGL render queue synchronization, but it's out of
>>> scope of this fix.
>>
>> I agree that this is out of the scope of the current fix. However, the
>> very fact of XSync() being invoked w/o holding the AWTLock needs to be
>> addressed at some point. So I suggest to file a separate P4 bug to
>> investigate this later (probably in a future release).
>
> The lock is actually hold, but by another thread (which then is
> waiting for the flusher thread). As AWT lock is used to prevent access
> to X display simultaneously from multiple threads, not to conform to
> Java memory model, such a pattern seems to be valid. It's not trivial
> to understand, though, I agree.
>
> Thanks,
>
> Artem
>
>> --
>> best regards,
>> Anthony
>>
>>>
>>> Thanks,
>>>
>>> Artem
>>>
>>>> --
>>>> best regards,
>>>> Anthony
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Artem
>>>>>
>>>>>> --
>>>>>> best regards,
>>>>>> Anthony
>>>>>>
>>>>>> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>>>>>>> Hello Anthony,
>>>>>>>
>>>>>>> Thank you for the review of this fix. I would like to remark that
>>>>>>> this
>>>>>>> deadlock is a regression of the fix for the bug 8005607, and in the
>>>>>>> code
>>>>>>> of the file
>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>>>>>> before 8005607 fix, where the previous XError handling mechanism
>>>>>>> not
>>>>>>> involving "sun.awt.X11.XErrorHandlerUtil" class was used, native
>>>>>>> "XSync()" function was called without acquiring of AWT lock. So
>>>>>>> a fix
>>>>>>> for the current bug with a deadlock just reverted a part of the fix
>>>>>>> 8005607 which enforced taking AWT lock from the function
>>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c".
>>>>>>> Answers to
>>>>>>> your questions are provided below.
>>>>>>>
>>>>>>> 1. "AWT EventQueue" holds AWT lock and waits till "Java2D Queue
>>>>>>> Flusher"
>>>>>>> thread finishes its job, because in the method
>>>>>>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width,
>>>>>>> final
>>>>>>> int
>>>>>>> height)" execution of "initSurfaceNow(int width, int height)" is
>>>>>>> dispatched to "Java2D Queue Flusher". Before this dispatching in
>>>>>>> the
>>>>>>> method "initSurface" AWT lock is taken by the lines
>>>>>>>
>>>>>>> 308 OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>>>>>> 309 rq.lock();
>>>>>>>
>>>>>>> and then with held AWT lock "AWT EventQueue" thread starts
>>>>>>> waiting on
>>>>>>> the second lock "sun.java2d.opengl.OGLRenderQueue.flusher" in the
>>>>>>> method
>>>>>>> "sun.java2d.opengl.OGLRenderQueue.QueueFlusher.flushNow()"
>>>>>>>
>>>>>>> 181 wait();
>>>>>>>
>>>>>>> 2. Yes, I investigated the option of waiting on AWT lock instead of
>>>>>>> "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>>>>>>> "sun.java2d.opengl.OGLRenderQueue", but this is impossible, because
>>>>>>> access to the always running thread "Java2D Queue Flusher"
>>>>>>> should be
>>>>>>> synchronized on some lock other than AWT lock, otherwise there will
>>>>>>> be a
>>>>>>> performance degradation, because it will be trying to get AWT lock
>>>>>>> each
>>>>>>> 100 milliseconds. As I understood a possible solution for this
>>>>>>> problem
>>>>>>> can be not locking on AWT lock before dispatching execution of any
>>>>>>> code
>>>>>>> to "Java2D Queue Flusher" or complete refactoring of locking
>>>>>>> mechanism
>>>>>>> in the class "sun.java2d.opengl.OGLRenderQueue". Since the current
>>>>>>> bug
>>>>>>> blocks SQE from running any tests involving OpenGL and does not
>>>>>>> allow to
>>>>>>> run any Java GUI application with enabled OpenGL rendering on Linux
>>>>>>> OS,
>>>>>>> I suppose the variant of refactoring is not acceptable. That is
>>>>>>> why as
>>>>>>> the most secure solution I decided just to call XSync() from
>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as
>>>>>>> it was
>>>>>>> before the fix for 8005607.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Anton
>>>>>>>
>>>>>>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>>>>>>> Hi Anton,
>>>>>>>>
>>>>>>>> I'm not sure if this a good fix since it enabled the GL thread to
>>>>>>>> call
>>>>>>>> Xlib APIs w/o acquiring the AWTLock. This may not present a
>>>>>>>> problem
>>>>>>>> currently since we know exactly when this method is called and
>>>>>>>> that
>>>>>>>> another thread is holding the lock and isn't calling other X11
>>>>>>>> functions at the moment. But I doubt this knowledge will be widely
>>>>>>>> known and remembered in the future, and if another thread starts
>>>>>>>> calling X11 routines, we'll get into trouble...
>>>>>>>>
>>>>>>>> Why would another thread (the AWT EventQueue if I got the problem
>>>>>>>> right) hold the AWTLock and wait till the GL thread finishes its
>>>>>>>> job?
>>>>>>>> I'd assume it should release the lock for the period of waiting.
>>>>>>>> This
>>>>>>>> would allow the GL thread to acquire the lock and perform the
>>>>>>>> XSync()
>>>>>>>> call w/o any potential issues. Have you investigated this option?
>>>>>>>>
>>>>>>>> --
>>>>>>>> best regards,
>>>>>>>> Anthony
>>>>>>>>
>>>>>>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Could you please review the following fix for a bug, which
>>>>>>>>> consists
>>>>>>>>> in a
>>>>>>>>> deadlock provoked by concurrency between AWT-EventQueue and
>>>>>>>>> Java2D
>>>>>>>>> Queue
>>>>>>>>> Flusher for taking AWT lock, when OpenGL rendering is enabled.
>>>>>>>>>
>>>>>>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>>>>>>
>>>>>>>>> The fix allows the code from the native function
>>>>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>>>>>>>> execute
>>>>>>>>> all
>>>>>>>>> XError handling procedures using "sun.awt.X11.XErrorHandlerUtil"
>>>>>>>>> class
>>>>>>>>> without acquiring AWT lock. It is the only available solution for
>>>>>>>>> this
>>>>>>>>> problem, because the current design of
>>>>>>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow to
>>>>>>>>> take AWT
>>>>>>>>> lock
>>>>>>>>> in Java2D Queue Flusher thread without reaching a deadlock, since
>>>>>>>>> all
>>>>>>>>> calls to the method
>>>>>>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)"
>>>>>>>>> are
>>>>>>>>> guarded by AWT lock.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Anton
>>>>>>>
More information about the awt-dev
mailing list