RFR: 8266310: deadlock while loading the JNI code

Chris Hegarty chegar at openjdk.java.net
Thu May 13 11:08:15 UTC 2021


On Tue, 11 May 2021 13:10:30 GMT, Aleksei Voitylov <avoitylov at openjdk.org> wrote:

> Please review this PR which fixes the deadlock in ClassLoader between the two lock objects - a lock object associated with the class being loaded, and the ClassLoader.loadedLibraryNames hash map, locked during the native library load operation.
> 
> Problem being fixed:
> 
> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile and the hash map. That deadlock exists even when the ZipFile/JarFile lock is removed because there's another lock object in the class loader, associated with the name of the class being loaded. Such objects are stored in ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads exactly the same class, whose signature is being verified in another thread.
> 
> Proposed fix:
> 
> The proposed patch removes the locking on the loadedLibraryNames hash map and instead synchronizes on each entry name, as is done for class names in the ClassLoader.getClassLoadingLock(name) method.
> 
> The patch introduces nativeLibraryLockMap, which holds a lock object for each library name; the private getNativeLibraryLock() method lazily initializes the corresponding lock object. nativeLibraryContext was changed to a ThreadLocal, so that each thread has on top of its own stack the NativeLibrary object that it is currently loading or unloading. nativeLibraryLockMap accumulates the names of all native libraries loaded; in line with the class loading code, it is not explicitly cleared.
> 
> Testing:  jtreg and jck testing with no regressions. A new regression test was developed.
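
For readers following along, the per-name locking idea described in the quoted text can be sketched roughly as below. This is only an illustration, not the code from the PR: the class name PerNameLocks and the methods lockFor() and size() are hypothetical, and the sketch assumes a ConcurrentHashMap is used for the lock map, in the style of ClassLoader.getClassLoadingLock(name).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; these names are not taken from the patch.
final class PerNameLocks {
    // One lock object per library name. Entries are never removed,
    // mirroring the "not explicitly cleared" behaviour described above.
    private final ConcurrentMap<String, Object> lockMap = new ConcurrentHashMap<>();

    // Lazily creates the lock for a name; concurrent callers asking for
    // the same name always receive the same object.
    Object lockFor(String name) {
        return lockMap.computeIfAbsent(name, k -> new Object());
    }

    // Number of distinct names seen so far.
    int size() {
        return lockMap.size();
    }
}
```

A caller would then write synchronized (locks.lockFor(name)) { ... } around the load of a single library, so that loads of different libraries no longer contend on one shared map lock.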

Hi Aleksei, 

As you may know, I looked into a similar issue recently and put together a reproducer [1] (which is probably similar to what you have). The reproducer, run at the time against Oracle 11u, demonstrates the issue on the mainline too, but the deadlock is slightly different. The reason I mention it here is that the reproducer encounters the issue whether there is an attempt to load the same class or another class (from the same jar file). In fact, the issue is even more general: the problem is with trying to load a class, not already loaded, from a jar further down the class path (the class may not even exist; it only needs to cause the loader to walk the class path up to the jar being verified).

I filed an issue for this, which may need to be closed as a duplicate depending on the outcome of this PR [2].

[1] https://github.com/ChrisHegarty/deadlock
[2] https://bugs.openjdk.java.net/browse/JDK-8266350

-------------

PR: https://git.openjdk.java.net/jdk/pull/3976

