RFR: 8361752: Double free in CompileQueue::delete_all after JDK-8357473 [v2]

Aleksey Shipilev shade at openjdk.org
Tue Jul 15 08:59:17 UTC 2025


> See the bug for more analysis. 
> 
> The short summary is that `CompileQueue::delete_all` walks the entire compile queue and deletes the tasks. It normally goes smoothly, unless there are blocking tasks. Then, the actual waiters have to delete the task, lest we delete the task from under the waiter's feet. Full deletion and blocking waits coordinate through the `waiting_for_completion_count` counter. This mechanism -- added by [JDK-8343938](https://bugs.openjdk.org/browse/JDK-8343938) in JDK 25 to solve a similar problem -- almost works. _Almost_.
> 
> There is a subtle race window, where a blocking waiter could have already unparked, dropped `waiting_for_completion_count` to `0` and proceeded to delete the task, see `CompileBroker::wait_for_completion()`. Then the queue deletion code could assume there are _no actual waiters_ on the blocking task, and proceed to delete the task _again_. Before [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) this race was fairly innocuous, as the second attempt at insertion into the free list was benign. But now, `CompileTask`-s are `delete`-d, and the second attempt leads to a double free.
> 
> I suspect we can fix that by complicating the coordination protocol even further, e.g. by tracking the counters more thoroughly. But, recognizing that `CompileQueue::delete_all()` is basically only called from the compiler shutdown code (things are already bad), and that it looks completely opportunistic (it does not delete the whole compiler _threads_, so skipping synchronous deletes on a few compile tasks is not a big deal), we should strive to simplify it.
> 
> This PR summarily delegates _all_ blocking task deletes to waiters. I think it stands to reason (and can be seen in `CompileBroker` code) that if a blocking task is in the queue, then there _is_ a waiter that would call `CompileBroker::wait_for_completion()` on it.
> 
> Additional testing:
>  - [x] Linux AArch64 server fastdebug, `tier1`
>  - [ ] Linux AArch64 server fastdebug, `all`
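
The ownership rule described in the quoted summary (queue teardown deletes only non-blocking tasks, and every blocking task is deleted by its waiter) can be illustrated with a minimal single-threaded C++ sketch. This is not the HotSpot code: `Task`, `DeleteLog`, `waiter_finish` and the free function `delete_all` are hypothetical names invented for illustration, and the sketch omits all locking, parking and the compiler-thread corner case mentioned in the incremental commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a compile task; not HotSpot's CompileTask.
struct Task {
    std::size_t id;
    bool is_blocking;  // true if some thread waits for this task to complete
};

// Records how many times each task id was deleted, so that a double
// delete would be visible as a count of 2 (assumed helper for the sketch).
struct DeleteLog {
    std::vector<int> deletions;
    explicit DeleteLog(std::size_t n) : deletions(n, 0) {}
    void destroy(Task* t) {
        deletions[t->id]++;
        delete t;
    }
};

// Sketch of the simplified teardown: non-blocking tasks are deleted here,
// blocking tasks are always skipped. No waiter counter is consulted, so
// there is no "counter already dropped to 0" state to misread.
// Returns the blocking tasks that were left for their waiters.
std::vector<Task*> delete_all(std::vector<Task*>& queue, DeleteLog& log) {
    std::vector<Task*> left_for_waiters;
    for (Task* t : queue) {
        if (t->is_blocking) {
            left_for_waiters.push_back(t);  // the waiter owns this delete
        } else {
            log.destroy(t);
        }
    }
    queue.clear();
    return left_for_waiters;
}

// Sketch of the waiter side: once its wait is over, the waiter deletes
// the blocking task it was waiting on.
void waiter_finish(Task* t, DeleteLog& log) {
    assert(t->is_blocking);
    log.destroy(t);
}
```

With a queue holding two non-blocking tasks and one blocking task, calling `delete_all` and then `waiter_finish` on the skipped task deletes each task exactly once. The design trade-off matches the one argued above: the teardown path may leave a few blocking tasks alive for a while, in exchange for there being exactly one party responsible for each delete.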

Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:

  Also handle the corner case when compiler threads might be using the task

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26294/files
  - new: https://git.openjdk.org/jdk/pull/26294/files/13625998..76bfa8d1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26294&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26294&range=00-01

  Stats: 18 lines in 2 files changed: 7 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/26294.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26294/head:pull/26294

PR: https://git.openjdk.org/jdk/pull/26294


More information about the hotspot-compiler-dev mailing list