RFR: 8266310: deadlock while loading the JNI code [v3]
Jorn Vernee
jvernee at openjdk.java.net
Wed May 19 18:50:40 UTC 2021
On Wed, 19 May 2021 16:29:33 GMT, Aleksei Voitylov <avoitylov at openjdk.org> wrote:
>> Please review this PR which fixes the deadlock in ClassLoader between the two lock objects - a lock object associated with the class being loaded, and the ClassLoader.loadedLibraryNames hash map, locked during the native library load operation.
>>
>> Problem being fixed:
>>
>> The initial reproducer demonstrated a deadlock between the JarFile/ZipFile and the hash map. That deadlock exists even when the ZipFile/JarFile lock is removed because there's another lock object in the class loader, associated with the name of the class being loaded. Such objects are stored in ClassLoader.parallelLockMap. The deadlock occurs when JNI_OnLoad() loads exactly the same class, whose signature is being verified in another thread.
>>
>> Proposed fix:
>>
>> The proposed patch suggests to get rid of locking loadedLibraryNames hash map and synchronize on each entry name, as it's done with class names in see ClassLoader.getClassLoadingLock(name) method.
>>
>> The patch introduces nativeLibraryLockMap which holds the lock objects for each library name, and the getNativeLibraryLock() private method is used to lazily initialize the corresponding lock object. nativeLibraryContext was changed to ThreadLocal, so that in any concurrent thread it would have a NativeLibrary object on top of the stack, that's being currently loaded/unloaded in that thread. nativeLibraryLockMap accumulates the names of all native libraries loaded - in line with class loading code, it is not explicitly cleared.
>>
>> Testing: jtreg and jck testing with no regressions. A new regression test was developed.
>
> Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision:
>
> fix trailing whitespace
src/java.base/share/classes/jdk/internal/loader/NativeLibraries.java line 262:
> 260: } finally {
> 261: releaseNativeLibraryLock(name);
> 262: }
The new locking scheme looks incorrect to me. It now seems possible for 2 different class loaders in 2 different threads to load the same library (which should not be possible).
Thread 1:
- acquires name lock
- checks library name is already in `loadedLibraryNames` (it's not)
- release name lock
- start loading library
Then thread 2 comes in and does the same
Then again thread 1 finishes loading and:
- acquires name lock
- puts library name in `loadedLibraryNames`
- puts library name in it's own `libraries`
- release lock
Then thread 2 comes in and does the same again (although adding the name to `loadedLibraryNames` will silently return `false`).
It seems that this scenario is possible, and the library will be loaded by 2 different class loaders, each with their own `lib` object. (If there's a race, there has to be a loser as well, which would need to discard their work when finding out they lost)
You might be able to stress this by checking if `loadedLibraryNames.add(name);` returns `true`, and throwing an exception if not.
I think a possible solution would be to claim the library name up front for a particular loader. Then 2 threads can still race to load the same library for the same class loader, but 2 threads with 2 different class loaders racing to load the library should not be possible.
Actually, you might be able to prevent a race on JNI_OnLoad altogether by claiming the library name for a particular thread upfront as well (e.g. using a `ConcurrentHashMap<String, Thread>`).
-------------
PR: https://git.openjdk.java.net/jdk/pull/3976
More information about the core-libs-dev
mailing list