<AWT Dev> [8] Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop

Artem Ananiev artem.ananiev at oracle.com
Thu Jul 4 03:53:43 PDT 2013


On 7/4/2013 1:47 PM, Anton Litvinov wrote:
> Hello Artem and Anthony,
>
> Thank you for review of this fix. I would like to inform you that by
> request of Anthony a separate new issue about the problem with
> synchronization between "sun.java2d.opengl.OGLRenderQueue" instance and
> Java2D Queue Flusher thread, whose priority is P4, has been created. The
> ID of the new CR is 8019923, and it is linked to the current bug 8015730.

OK, I'm fine with the current fix then.

Thanks,

Artem

> Thank you,
> Anton
>
> On 7/4/2013 1:31 PM, Artem Ananiev wrote:
>>
>> On 7/3/2013 9:49 PM, Anthony Petrov wrote:
>>> On 07/03/2013 08:35 PM, Artem Ananiev wrote:
>>>>> src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c
>>>>>>  398     XSync(awt_display, False);
>>>>
>>>> OK, got it.
>>>>
>>>>> All other usages of the EXEC_WITH_XERROR_HANDLER() pass TRUE to the
>>>>> RESTORE_XERROR_HANDLER(), and hence call XSync(), too (in
>>>>> XErrorHandlerUtil.java) with or without the current fix...
>>>>>
>>>> I looked to the webrev for 8005607 once again and found XSync() was
>>>> there before that fix. And you're right, it's required to flush Xlib
>>>> quene.
>>>>
>>>> OK, so I currently don't see a way to improve the current fix. It
>>>> may be
>>>> possible to rewrite OGL render queue synchronization, but it's out of
>>>> scope of this fix.
>>>
>>> I agree that this is out of the scope of the current fix. However, the
>>> very fact of XSync() being invoked w/o holding the AWTLock needs to be
>>> addressed at some point. So I suggest to file a separate P4 bug to
>>> investigate this later (probably in a future release).
>>
>> The lock is actually hold, but by another thread (which then is
>> waiting for the flusher thread). As AWT lock is used to prevent access
>> to X display simultaneously from multiple threads, not to conform to
>> Java memory model, such a pattern seems to be valid. It's not trivial
>> to understand, though, I agree.
>>
>> Thanks,
>>
>> Artem
>>
>>> --
>>> best regards,
>>> Anthony
>>>
>>>>
>>>> Thanks,
>>>>
>>>> Artem
>>>>
>>>>> --
>>>>> best regards,
>>>>> Anthony
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Artem
>>>>>>
>>>>>>> --
>>>>>>> best regards,
>>>>>>> Anthony
>>>>>>>
>>>>>>> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>>>>>>>> Hello Anthony,
>>>>>>>>
>>>>>>>> Thank you for the review of this fix. I would like to remark that
>>>>>>>> this
>>>>>>>> deadlock is a regression of the fix for the bug 8005607, and in the
>>>>>>>> code
>>>>>>>> of the file
>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>>>>>>> before 8005607 fix, where the previous XError handling mechanism
>>>>>>>> not
>>>>>>>> involving "sun.awt.X11.XErrorHandlerUtil" class was used, native
>>>>>>>> "XSync()" function was called without acquiring of AWT lock. So
>>>>>>>> a fix
>>>>>>>> for the current bug with a deadlock just reverted a part of the fix
>>>>>>>> 8005607 which enforced taking AWT lock from the function
>>>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c".
>>>>>>>> Answers to
>>>>>>>> your questions are provided below.
>>>>>>>>
>>>>>>>> 1. "AWT EventQueue" holds AWT lock and waits till "Java2D Queue
>>>>>>>> Flusher"
>>>>>>>> thread finishes its job, because in the method
>>>>>>>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width,
>>>>>>>> final
>>>>>>>> int
>>>>>>>> height)" execution of "initSurfaceNow(int width, int height)" is
>>>>>>>> dispatched to "Java2D Queue Flusher". Before this dispatching in
>>>>>>>> the
>>>>>>>> method "initSurface" AWT lock is taken by the lines
>>>>>>>>
>>>>>>>>      308        OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>>>>>>>      309        rq.lock();
>>>>>>>>
>>>>>>>> and then with held AWT lock "AWT EventQueue" thread starts
>>>>>>>> waiting on
>>>>>>>> the second lock "sun.java2d.opengl.OGLRenderQueue.flusher" in the
>>>>>>>> method
>>>>>>>> "sun.java2d.opengl.OGLRenderQueue.QueueFlusher.flushNow()"
>>>>>>>>
>>>>>>>>      181                    wait();
>>>>>>>>
>>>>>>>> 2. Yes, I investigated the option of waiting on AWT lock instead of
>>>>>>>> "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>>>>>>>> "sun.java2d.opengl.OGLRenderQueue", but this is impossible, because
>>>>>>>> access to the always running thread "Java2D Queue Flusher"
>>>>>>>> should be
>>>>>>>> synchronized on some lock other than AWT lock, otherwise there will
>>>>>>>> be a
>>>>>>>> performance degradation, because it will be trying to get AWT lock
>>>>>>>> each
>>>>>>>> 100 milliseconds. As I understood a possible solution for this
>>>>>>>> problem
>>>>>>>> can be not locking on AWT lock before dispatching execution of any
>>>>>>>> code
>>>>>>>> to "Java2D Queue Flusher" or complete refactoring of locking
>>>>>>>> mechanism
>>>>>>>> in the class "sun.java2d.opengl.OGLRenderQueue". Since the current
>>>>>>>> bug
>>>>>>>> blocks SQE from running any tests involving OpenGL and does not
>>>>>>>> allow to
>>>>>>>> run any Java GUI application with enabled OpenGL rendering on Linux
>>>>>>>> OS,
>>>>>>>> I suppose the variant of refactoring is not acceptable. That is
>>>>>>>> why as
>>>>>>>> the most secure solution I decided just to call XSync() from
>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" as
>>>>>>>> it was
>>>>>>>> before the fix for 8005607.
>>>>>>>>
>>>>>>>> Thank you,
>>>>>>>> Anton
>>>>>>>>
>>>>>>>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>>>>>>>> Hi Anton,
>>>>>>>>>
>>>>>>>>> I'm not sure if this a good fix since it enabled the GL thread to
>>>>>>>>> call
>>>>>>>>> Xlib APIs w/o acquiring the AWTLock. This may not present a
>>>>>>>>> problem
>>>>>>>>> currently since we know exactly when this method is called and
>>>>>>>>> that
>>>>>>>>> another thread is holding the lock and isn't calling other X11
>>>>>>>>> functions at the moment. But I doubt this knowledge will be widely
>>>>>>>>> known and remembered in the future, and if another thread starts
>>>>>>>>> calling X11 routines, we'll get into trouble...
>>>>>>>>>
>>>>>>>>> Why would another thread (the AWT EventQueue if I got the problem
>>>>>>>>> right) hold the AWTLock and wait till the GL thread finishes its
>>>>>>>>> job?
>>>>>>>>> I'd assume it should release the lock for the period of waiting.
>>>>>>>>> This
>>>>>>>>> would allow the GL thread to acquire the lock and perform the
>>>>>>>>> XSync()
>>>>>>>>> call w/o any potential issues. Have you investigated this option?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> best regards,
>>>>>>>>> Anthony
>>>>>>>>>
>>>>>>>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Could you please review the following fix for a bug, which
>>>>>>>>>> consists
>>>>>>>>>> in a
>>>>>>>>>> deadlock provoked by concurrency between AWT-EventQueue and
>>>>>>>>>> Java2D
>>>>>>>>>> Queue
>>>>>>>>>> Flusher for taking AWT lock, when OpenGL rendering is enabled.
>>>>>>>>>>
>>>>>>>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>>>>>>>
>>>>>>>>>> The fix allows the code from the native function
>>>>>>>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>>>>>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c"
>>>>>>>>>> execute
>>>>>>>>>> all
>>>>>>>>>> XError handling procedures using "sun.awt.X11.XErrorHandlerUtil"
>>>>>>>>>> class
>>>>>>>>>> without acquiring AWT lock. It is the only available solution for
>>>>>>>>>> this
>>>>>>>>>> problem, because the current design of
>>>>>>>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow to
>>>>>>>>>> take AWT
>>>>>>>>>> lock
>>>>>>>>>> in Java2D Queue Flusher thread without reaching a deadlock, since
>>>>>>>>>> all
>>>>>>>>>> calls to the method
>>>>>>>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)"
>>>>>>>>>> are
>>>>>>>>>> guarded by AWT lock.
>>>>>>>>>>
>>>>>>>>>> Thank you,
>>>>>>>>>> Anton
>>>>>>>>
>


More information about the awt-dev mailing list