RFR: 8280468: Crashes in getConfigColormap, getConfigVisualId, XVisualIDFromVisual on Linux

Phil Race prr at openjdk.java.net
Sun Feb 6 21:13:04 UTC 2022


On Fri, 21 Jan 2022 17:02:38 GMT, Maxim Kartashev <duke at openjdk.java.net> wrote:

> These crashes were not reproducible, so the fix is based on a hypothesis that there are two possible reasons for them:
> 1. `makeDefaultConfig()` returning `NULL`.
> 2. A race condition when the number of screens changes.
> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called on EDT so the call can race with `X11GraphisDevice.invalidate()` that re-sets the screen number of the device; the latter is invoked on the `AWT-XAWT` thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the time `makeDefaultConfiguration()` makes a native call with the device's current screen number, the `x11Screens` array maintained by the native code could have become shorter. And the native methods like `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number passed to them is always current and valid. The AWT lock only protects against the changes during the native methods invocation and does not protect against them being called with an outdated screen number. With a larger screen number, those methods read past the end of the `x11Screens` array.
> 
> The fix for (1) is to die gracefully instead of crashing in an attempt to de-reference a `NULL` pointer, which might happen upon returning from `makeDefaultConfig()` in `getAllConfigs()`.
> 
> The fix for (2) is to eliminate the race by protecting `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when the native methods working with it are active.
> 
> We've been shipping JetBrains Runtime with this fix for a few months now and there were no crash reports with those specific patterns against the versions with the fix.

"probably we should follow this [suggestion](https://github.com/openjdk/jdk/blob/6765f902505fbdd02f25b599f942437cd805cad1/src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java#L58) and everywhere the "screen" is used under the awt lock (1)check that it is less than the number of screens and use 0(main screen as a fallback),"

Even if "everywhere" is too hard, can't we just do this in the native methods where the crashes are being seen ?
Then no locks have to change .. and we can be more targeted to these known, but not reproducible, crashes.

If you can force a crash in one of these then prove you can avoid it that would help too.
As it is this fix is not really "reviewable" in the sense that no amount of staring at the code is likely to be enough.

-------------

PR: https://git.openjdk.java.net/jdk/pull/7182



More information about the client-libs-dev mailing list