RFR: 8361752: Double free in CompileQueue::delete_all after JDK-8357473 [v2]
Vladimir Kozlov
kvn at openjdk.org
Fri Jul 18 17:58:50 UTC 2025
On Tue, 15 Jul 2025 08:59:17 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> See the bug for more analysis.
>>
>> The short summary is that `CompileQueue::delete_all` walks the entire compile queue and deletes the tasks. It normally goes smoothly, unless there are blocking tasks. Then, the actual waiters have to delete the task, lest we delete the task under waiter's feet. Full deletion and blocking waits coordinate with `waiting_for_completion_count` counter. This mechanism -- added by [JDK-8343938](https://bugs.openjdk.org/browse/JDK-8343938) in JDK 25 to solve a similar problem -- almost works. _Almost_.
>>
>> There is a subtle race window, where blocking waiter could have already unparked, dropped `waiting_for_completion_count` to `0` and proceeded to delete the task, see `CompileBroker::wait_for_completion()`. Then the queue deletion code could assume there are _no actual waiters_ on the blocking task, and proceed to delete the task _again_. Before [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) this race was fairly innocuous, as second attempt at insertion into the free list was benign. But now, `CompileTask`-s are `delete`-d, and the second attempt leads to double free.
>>
>> I suspect we can fix that by complicating the coordination protocol even further, e.g. by tracking the counters more thoroughly. But, recognizing `CompileQueue::delete_all()` is basically only called from the compiler shutdown code (things are already bad), and it looks completely opportunistic (it does not delete the whole compiler _threads_, so skipping synchronous deletes on a few compile tasks are not a big deal), we should strive to simplify it.
>>
>> This PR summarily delegates _all_ blocking task deletes to waiters. I think it stands to reason (and can be seen in `CompilerBroker` code) that if a blocking task is in queue, then there _is_ a waiter that would call `CompileBroker::wait_for_completion()` on it.
>>
>> Additional testing:
>> - [x] Linux AArch64 server fastdebug, `tier1`
>> - [x] Linux x86_64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
>
> Also handle the corner case when compiler threads might be using the task
@iwanowww please look?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26294#issuecomment-3090241741
More information about the hotspot-compiler-dev
mailing list