RFR: 8280468: Crashes in getConfigColormap, getConfigVisualId, XVisualIDFromVisual on Linux [v3]

Sergey Bylokhov serb at openjdk.java.net
Thu Apr 7 23:08:11 UTC 2022


On Tue, 5 Apr 2022 05:42:53 GMT, Maxim Kartashev <duke at openjdk.java.net> wrote:

>> These crashes were not reproducible, so the fix is based on a hypothesis that there are two possible reasons for them:
>> 1. `makeDefaultConfig()` returning `NULL`.
>> 2. A race condition when the number of screens changes.
>> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called on EDT so the call can race with `X11GraphisDevice.invalidate()` that re-sets the screen number of the device; the latter is invoked on the `AWT-XAWT` thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the time `makeDefaultConfiguration()` makes a native call with the device's current screen number, the `x11Screens` array maintained by the native code could have become shorter. And the native methods like `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number passed to them is always current and valid. The AWT lock only protects against the changes during the native methods invocation and does not protect against them being called with an outdated screen number. With a larger screen number, those methods read past the end of the `x11Screens` array.
>> 
>> The fix for (1) is to die gracefully instead of crashing in an attempt to de-reference a `NULL` pointer, which might happen upon returning from `makeDefaultConfig()` in `getAllConfigs()`.
>> 
>> The fix for (2) is to eliminate the race by protecting `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when the native methods working with it are active.
>> 
>> We've been shipping JetBrains Runtime with this fix for a few months now and there were no crash reports with those specific patterns against the versions with the fix.
>
> Maxim Kartashev has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
> 
>  - Merge branch 'master' into JDK-8280468
>  - Addressed PR comments
>    
>    1. Got rid of X11GraphicsDevice.configLock in favour of the AWT lock.
>    2. Reduced the use of the screen number (and, consequently, AWT lock
>    contention) in XCanvasPeer.
>  - 8280468: Crashes in getConfigColormap, getConfigVisualId, XVisualIDFromVisual on Linux
>    
>    This fix addresses two possible causes of crashes:
>    1. makeDefaultConfig() returning NULL. The fix is to die gracefully
>    instead of crashing in an attempt to de-reference a NULL pointer.
>    
>    2. A race condition when the number of screens change (this is only an
>    issue with the number of xinerama screens; the number of X11 screens is
>    static for the duration of the X session).
>    
>    The race scenario: X11GraphisDevice.makeDefaultConfiguration() is called
>    on EDT so the call can race with X11GraphisDevice.invalidate() that
>    re-sets the screen number of the device; the latter is invoked on the
>    "AWT-XAWT" thread from X11GraphicsEnvironment.rebuildDevices(). So by
>    the time makeDefaultConfiguration() makes a native call with the
>    device's current screen number, the x11Screens array maintained by the
>    native code could have become shorter. And the native methods like
>    Java_sun_awt_X11GraphicsDevice_getConfigColormap() assume that the
>    screen number passed to them is always current and valid. The AWT lock
>    only protects against the changes during the native methods invocation,
>    but does not protect against them being called with an outdated screen
>    number.
>    
>    The fix is to eliminate the race by protecting X11GraphisDevice.screen
>    with the AWT lock.

src/java.desktop/unix/classes/sun/awt/X11/XCanvasPeer.java line 90:

> 88:             XToolkit.awtUnlock();
> 89:         }
> 90:     }

I am not sure I understand the purpose of this method, it requests the current device for the GC, then requests the screen index of the device, and then request device by the screen index, what is the reason to do that? Why we cannot just use the device from the GC?

src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java line 161:

> 159: 
> 160:     private void makeConfigurations() {
> 161:         if (configs == null) {

To make the diff less distruptive you can move the ssyncronisation to the getConfigurations and getDefaultConfiguration where we have kind of DCL.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7182



More information about the client-libs-dev mailing list