RFR: 8358343: [leyden] Drop notify_all in CompilationPolicyUtils::Queue::pop [v2]
Igor Veresov
iveresov at openjdk.org
Fri Jun 6 08:37:04 UTC 2025
On Fri, 6 Jun 2025 07:34:46 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> Found this when reading premain-vs-mainline webrev. Mainline does not have `notify_all` in this method:
>> https://github.com/openjdk/jdk/blob/c382da579884c28f2765b2c6ba68c0ad4fdcb2ce/src/hotspot/share/compiler/compilationPolicy.hpp#L85-L92
>>
>> But if you remove `notify_all()` in `premain`, then tests start to deadlock, see bug for a sample. The culprit is `CompilationPolicy::flush_replay_training_at_init`, which is only present in premain. I fixed it by using timed waits, which obviates the need for extra notifications. We only enter this method with `-XX:+AOTVerifyTrainingData`, so we don't care much about its performance. This is IMO better than doing a questionable `notify_all` followed by `wait` in load-bearing code.
>>
>> Additional testing:
>> - [x] Linux x86_64 server fastdebug, `runtime/cds` (5x, no timeouts yet; still running more iterations)
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
>
> - Merge branch 'premain' into JDK-8358343-leyden-training-notify-all
> - Fix
Actually, neither the current approach (even with the spin-wait) nor my solution is really correct. An empty queue doesn't mean that every last item has been processed: the last item may have been popped but still be in progress. So however you look at it, we should set up some kind of handshake to make sure the replay thread is done processing, not just done popping.
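
To illustrate the kind of handshake I mean, here is a minimal standalone sketch in standard C++. It is not HotSpot code: `TrackedQueue`, `done()`, `wait_until_drained()` and `run_demo()` are hypothetical names, and it uses `std::mutex`/`std::condition_variable` rather than HotSpot monitors. The idea is just that the queue counts items that were popped but not yet finished, and the flusher waits for "empty and nothing in flight" instead of "empty".

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>

// Hypothetical stand-in for the training queue; an assumption for illustration.
template <typename T>
class TrackedQueue {
  std::mutex _lock;
  std::condition_variable _cv;
  std::deque<T> _items;
  int _in_flight = 0;    // items popped but not yet reported as done
  bool _closed = false;

public:
  void push(T item) {
    {
      std::lock_guard<std::mutex> g(_lock);
      _items.push_back(std::move(item));
    }
    _cv.notify_all();
  }

  // Blocks until an item is available or the queue is closed.
  // A popped item counts as in flight until done() is called for it.
  std::optional<T> pop() {
    std::unique_lock<std::mutex> g(_lock);
    _cv.wait(g, [this] { return !_items.empty() || _closed; });
    if (_items.empty()) return std::nullopt;
    T item = std::move(_items.front());
    _items.pop_front();
    ++_in_flight;
    return item;
  }

  // The consumer calls this after it has finished processing a popped item.
  void done() {
    {
      std::lock_guard<std::mutex> g(_lock);
      --_in_flight;
    }
    _cv.notify_all();
  }

  // The flusher waits for "empty AND nothing in flight", not just "empty".
  void wait_until_drained() {
    std::unique_lock<std::mutex> g(_lock);
    _cv.wait(g, [this] { return _items.empty() && _in_flight == 0; });
  }

  void close() {
    {
      std::lock_guard<std::mutex> g(_lock);
      _closed = true;
    }
    _cv.notify_all();
  }
};

// Returns how many items were fully processed at the moment the flush returned.
int run_demo() {
  TrackedQueue<int> q;
  std::atomic<int> processed{0};
  std::thread replay([&] {
    while (auto item = q.pop()) {
      std::this_thread::sleep_for(std::chrono::milliseconds(10));  // simulate work
      ++processed;
      q.done();
    }
  });
  for (int i = 0; i < 3; i++) q.push(i);
  q.wait_until_drained();
  int seen = processed.load();
  assert(seen == 3);  // nothing is still being worked on at this point
  q.close();
  replay.join();
  return seen;
}
```

With a plain "is the queue empty?" check, the flusher in `run_demo()` could return while the last item is still inside the simulated work; with the in-flight count it cannot.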
-------------
PR Comment: https://git.openjdk.org/leyden/pull/74#issuecomment-2948506858
More information about the leyden-dev mailing list