RFR: 8231269: CompileTask::is_unloaded is slow due to JNIHandles type checks [v11]
Aleksey Shipilev
shade at openjdk.org
Fri May 9 17:08:42 UTC 2025
On Wed, 30 Apr 2025 07:23:39 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> [JDK-8163511](https://bugs.openjdk.org/browse/JDK-8163511) made the `CompileTask` improvement to avoid blocking class unloading if a relevant compile task is in queue. Current code does a sleight-of-hand to make sure the the `method*` in `CompileTask` are still valid before using them. Still a noble goal, so we keep trying to do this.
>>
>> The code tries to switch weak JNI handle with a strong one when it wants to capture the holder to block unloading. Since we are reusing the same field, we have to do type checks like `JNIHandles::is_weak_global_handle(_method_holder)`. Unfortunately, that type-check goes all the way to `OopStorage` allocation code to verify the handle is really allocated in the relevant `OopStorage`. This takes internal `OopStorage` locks, and thus is slow.
>>
>> This issue is clearly visible in Leyden, when there are lots of `CompileTask`-s in the queue, dumped by AOT code loader. It also does not help that `CompileTask::select_task` is effectively quadratic in number of methods in queue, so we end up calling `CompileTask::is_unloaded` very often.
>>
>> It is possible to mitigate this issue by splitting the related fields into weak and strong ones. But as Kim mentions in the bug, we should not be using JNI handles here at all, and instead go directly for relevant `OopStorage`-s. This is what this PR does, among other things that should hopefully make the whole mechanics clearer.
>>
>> Additional testing:
>> - [x] Linux x86_64 server fastdebug, `compiler/classUnloading`, 100x still passes; these tests are sensitive to bugs in this code
>> - [x] Linux x86_64 server fastdebug, `all`
>> - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Move to oops
So... Following up on one forgotten `methodHandle` removal (https://github.com/openjdk/jdk/pull/24018#discussion_r2081467353) got me into a rabbit hole of making new utility class thread-safe. Otherwise, there are TOCTOU issues checking `(Weak)Handle` status, which gets us in trouble real quick. This is normally happens in current tests when external thread goes into `CompilerBroker::wait_for_compilation()` and compiler thread starts moving the `UMH` state for compilation. Relying on un-synchronized `Weak(Handle)` state is not nice either.
The answer to all these problems is to track the `UMH` state more accurately, and thus trust `WeakHandle` only sporadically. This is now done in new commit. This also allows for more explicit state checks. And, this allows clearly catching when we try to access `method()` after `release()` -- that is surprisingly happens for `hot_method()` that is not re-initialized always.
Chasing this bug also made my head hurt a bit about double-negating `!is_unloaded` checks. It is technically a safety check, so I renamed methods to reflect that: `is_safe`, `make_always_safe`.
I will schedule weekend tests for this PR on various machines to see if more bugs fall out once I shake that particular tree.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24018#issuecomment-2867309949
More information about the hotspot-gc-dev
mailing list