RFR: 8280468: Crashes in getConfigColormap, getConfigVisualId, XVisualIDFromVisual on Linux
Maxim Kartashev
duke at openjdk.java.net
Mon Jan 24 07:52:07 UTC 2022
On Fri, 21 Jan 2022 17:02:38 GMT, Maxim Kartashev <duke at openjdk.java.net> wrote:
> These crashes were not reproducible, so the fix is based on a hypothesis that there are two possible reasons for them:
> 1. `makeDefaultConfig()` returning `NULL`.
> 2. A race condition when the number of screens changes.
> The race scenario: `X11GraphisDevice.makeDefaultConfiguration()` is called on EDT so the call can race with `X11GraphisDevice.invalidate()` that re-sets the screen number of the device; the latter is invoked on the `AWT-XAWT` thread from `X11GraphicsEnvironment.rebuildDevices()`. So by the time `makeDefaultConfiguration()` makes a native call with the device's current screen number, the `x11Screens` array maintained by the native code could have become shorter. And the native methods like `Java_sun_awt_X11GraphicsDevice_getConfigColormap()` assume that the number passed to them is always current and valid. The AWT lock only protects against the changes during the native methods invocation and does not protect against them being called with an outdated screen number. With a larger screen number, those methods read past the end of the `x11Screens` array.
>
> The fix for (1) is to die gracefully instead of crashing in an attempt to de-reference a `NULL` pointer, which might happen upon returning from `makeDefaultConfig()` in `getAllConfigs()`.
>
> The fix for (2) is to eliminate the race by protecting `X11GraphisDevice.screen` with the AWT lock such that it doesn't change when the native methods working with it are active.
>
> We've been shipping JetBrains Runtime with this fix for a few months now and there were no crash reports with those specific patterns against the versions with the fix.
Thanks for looking at this!
> You can try to reproduce the bug by creating the custom JDK and emulating the native upcalls by the test. It will help to prve that the fix will work.
Can you elaborate, please? Do you mean that this custom JDK would synthetically change `X11GraphicsDevice.screen` at the "right" time?
>
> I am not sure that it is possible to synchronize access to the screen number across the code w/o introducing deadlocks, the current change does not cover all cases where the X11GraphicsDevice.screen is passed around. probably we should follow this [suggestion](https://github.com/openjdk/jdk/blob/6765f902505fbdd02f25b599f942437cd805cad1/src/java.desktop/unix/classes/sun/awt/X11GraphicsDevice.java#L58) and everywhere the "screen" is used under the awt lock (1)check that it is less than the number of screens and use 0(main screen as a fallback), or (2) some "default value" if the number of screens is zero at that time - something similar was implemented on macOS by the [JDK-8211992](http://hg.openjdk.java.net/jdk/jdk/rev/814c49afb1a7)
I agree, returning some default as a fallback should also work. However, that's so radically different from the approach I've taken that a separate PR would be required.
-------------
PR: https://git.openjdk.java.net/jdk/pull/7182
More information about the client-libs-dev
mailing list