<AWT Dev> [8] Review request for 8015730: PIT: On Linux, OGL=true and fbobject=false leads to deadlock or infinite loop

Anton Litvinov anton.litvinov at oracle.com
Tue Jul 2 02:32:24 PDT 2013


Hello Anthony,

Yes, sure, let's wait for Artem's evaluation of the fix. And yes, a new
bug should probably be filed. It would cover the inability to take the
AWT lock without a deadlock from a "Runnable" instance which is executed
by the method
"sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)".
Currently, however, I am trying to find an existing issue with a similar
cause in our internal bug-tracking system.
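
To make the problem concrete, here is a minimal, self-contained sketch of
the pattern. It is not the JDK code: the class "FlusherDeadlockSketch",
the field "awtLock" and the method "flusherCanTakeLock" are hypothetical
names, a plain "ReentrantLock" stands in for the AWT lock, and a
single-thread executor stands in for "Java2D Queue Flusher". A timed
"tryLock" is used instead of "lock()" so that the sketch terminates
instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class FlusherDeadlockSketch {
    // Hypothetical stand-in for the AWT lock.
    static final ReentrantLock awtLock = new ReentrantLock();

    /**
     * Returns true if a task running on the stand-in "flusher" thread
     * manages to take the lock within timeoutMs while the calling thread
     * holds the lock and waits for that task to finish.
     */
    static boolean flusherCanTakeLock(long timeoutMs) throws Exception {
        // Stand-in for the "Java2D Queue Flusher" thread.
        ExecutorService flusher = Executors.newSingleThreadExecutor();
        awtLock.lock(); // plays the role of rq.lock() in initSurface()
        try {
            Future<Boolean> result = flusher.submit(() -> {
                // Plays the role of the Runnable given to flushAndInvokeNow().
                if (awtLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    awtLock.unlock();
                    return true;
                }
                return false;
            });
            // The caller waits for the task while still holding the lock.
            return result.get();
        } finally {
            awtLock.unlock();
            flusher.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flusher got the lock: " + flusherCanTakeLock(200));
    }
}
```

In this sketch the task can never obtain the lock, because the only
thread able to release it is blocked waiting for the task. With an
untimed "lock()" call in the task, both threads would wait forever.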

Thank you,
Anton

On 7/1/2013 8:50 PM, Anthony Petrov wrote:
> Thanks for the additional information, Anton. Since this fix simply 
> reverts the behavior in GLXSurfaceData.c back to the pre-8005607 era, 
> it could probably be considered a good interim solution for the problem.
>
> I'd like to hear Artem's opinion on this, though. Should we file a P4 
> bug to investigate the issue further so that in the future we could 
> avoid calling XSync() w/o the AWTLock?
>
> -- 
> best regards,
> Anthony
>
> On 07/01/2013 07:24 PM, Anton Litvinov wrote:
>> Hello Anthony,
>>
>> Thank you for the review of this fix. I would like to remark that this
>> deadlock is a regression caused by the fix for bug 8005607. In the file
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" before the
>> 8005607 fix, when the previous XError handling mechanism, which did not
>> involve the "sun.awt.X11.XErrorHandlerUtil" class, was in use, the native
>> "XSync()" function was called without acquiring the AWT lock. So the fix
>> for the current deadlock simply reverts the part of the 8005607 fix
>> which enforced taking the AWT lock in the function
>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" in the file
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c". Answers to
>> your questions are provided below.
>>
>> 1. "AWT EventQueue" holds the AWT lock and waits until the "Java2D Queue
>> Flusher" thread finishes its job, because in the method
>> "sun.java2d.opengl.OGLSurfaceData.initSurface(final int width, final int
>> height)" the execution of "initSurfaceNow(int width, int height)" is
>> dispatched to "Java2D Queue Flusher". Before this dispatch, the method
>> "initSurface" takes the AWT lock in the lines
>>
>>      308        OGLRenderQueue rq = OGLRenderQueue.getInstance();
>>      309        rq.lock();
>>
>> and then, still holding the AWT lock, the "AWT EventQueue" thread starts
>> waiting on a second lock, "sun.java2d.opengl.OGLRenderQueue.flusher", in
>> the method "sun.java2d.opengl.OGLRenderQueue.QueueFlusher.flushNow()"
>>
>>      181                    wait();
>>
>> 2. Yes, I investigated the option of waiting on the AWT lock instead of
>> the "sun.java2d.opengl.OGLRenderQueue.flusher" lock in the class
>> "sun.java2d.opengl.OGLRenderQueue", but this is not feasible, because
>> access to the always running thread "Java2D Queue Flusher" should be
>> synchronized on some lock other than the AWT lock. Otherwise there will
>> be a performance degradation, because the thread will be trying to get
>> the AWT lock every 100 milliseconds. As I understand it, possible
>> solutions for this problem are either not taking the AWT lock before
>> dispatching execution of any code to "Java2D Queue Flusher", or a
>> complete refactoring of the locking mechanism in the class
>> "sun.java2d.opengl.OGLRenderQueue". Since the current bug blocks SQE
>> from running any tests involving OpenGL and makes it impossible to run
>> any Java GUI application with OpenGL rendering enabled on Linux, I
>> suppose the refactoring variant is not acceptable. That is why, as the
>> safest solution, I decided just to call XSync() from
>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" without the
>> AWT lock, as it was before the fix for 8005607.
>>
>> Thank you,
>> Anton
>>
>> On 7/1/2013 3:11 PM, Anthony Petrov wrote:
>>> Hi Anton,
>>>
>>> I'm not sure if this is a good fix since it enables the GL thread to call
>>> Xlib APIs w/o acquiring the AWTLock. This may not present a problem
>>> currently since we know exactly when this method is called and that
>>> another thread is holding the lock and isn't calling other X11
>>> functions at the moment. But I doubt this knowledge will be widely
>>> known and remembered in the future, and if another thread starts
>>> calling X11 routines, we'll get into trouble...
>>>
>>> Why would another thread (the AWT EventQueue if I got the problem
>>> right) hold the AWTLock and wait till the GL thread finishes its job?
>>> I'd assume it should release the lock for the period of waiting. This
>>> would allow the GL thread to acquire the lock and perform the XSync()
>>> call w/o any potential issues. Have you investigated this option?
>>>
>>> -- 
>>> best regards,
>>> Anthony
>>>
>>> On 06/28/2013 09:16 PM, Anton Litvinov wrote:
>>>> Hello,
>>>>
>>>> Could you please review the following fix for a bug which consists in
>>>> a deadlock caused by contention between the AWT-EventQueue and Java2D
>>>> Queue Flusher threads for the AWT lock when OpenGL rendering is
>>>> enabled.
>>>>
>>>> Bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8015730
>>>> Webrev: http://cr.openjdk.java.net/~alitvinov/8015730/webrev.00
>>>>
>>>> The fix allows the code in the native function
>>>> "Java_sun_java2d_opengl_GLXSurfaceData_initPbuffer" of the file
>>>> "jdk/src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c" to execute
>>>> all XError handling procedures using the "sun.awt.X11.XErrorHandlerUtil"
>>>> class without acquiring the AWT lock. It is the only available solution
>>>> for this problem, because the current design of the
>>>> "sun.java2d.opengl.OGLRenderQueue" class does not allow the AWT lock to
>>>> be taken in the Java2D Queue Flusher thread without reaching a deadlock,
>>>> since all calls to the method
>>>> "sun.java2d.opengl.OGLRenderQueue.flushAndInvokeNow(Runnable r)" are
>>>> guarded by the AWT lock.
>>>>
>>>> Thank you,
>>>> Anton
>>


