RFR: 8361752: Double free in CompileQueue::delete_all after JDK-8357473

Aleksey Shipilev shade at openjdk.org
Mon Jul 14 17:31:40 UTC 2025


On Mon, 14 Jul 2025 17:16:07 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> See the bug for more analysis. 
>> 
>> The short summary is that `CompileQueue::delete_all` walks the entire compile queue and deletes the tasks. It normally goes smoothly, unless there are blocking tasks. Then, the actual waiters have to delete the task, lest we delete the task under waiter's feet. Full deletion and blocking waits coordinate with `waiting_for_completion_count` counter. This mechanism -- added by [JDK-8343938](https://bugs.openjdk.org/browse/JDK-8343938) in JDK 25 to solve a similar problem -- almost works. _Almost_.
>> 
>> There is a subtle race window, where blocking waiter could have already unparked, dropped `waiting_for_completion_count` to `0` and proceeded to delete the task, see `CompileBroker::wait_for_completion()`. Then the queue deletion code could assume there are _no actual waiters_ on the blocking task, and proceed to delete the task _again_. Before [JDK-8357473](https://bugs.openjdk.org/browse/JDK-8357473) this race was fairly innocuous, as second attempt at insertion into the free list was benign. But now, `CompileTask`-s are `delete`-d, and the second attempt leads to double free.
>> 
>> I suspect we can fix that by complicating the coordination protocol even further, e.g. by tracking the counters more thoroughly. But, recognizing `CompileQueue::delete_all()` is basically only called from the compiler shutdown code (things are already bad), and it looks completely opportunistic (it does not delete the whole compiler _threads_, so skipping synchronous deletes on a few compile tasks are not a big deal), we should strive to simplify it. 
>> 
>> This PR summarily delegates _all_ blocking task deletes to waiters. I think it stands to reason (and can be seen in `CompilerBroker` code) that if a blocking task is in queue, then there _is_ a waiter that would call `CompileBroker::wait_for_completion()` on it.
>> 
>> Additional testing:
>>  - [x] Linux AArch64 server fastdebug, `tier1`
>>  - [ ] Linux AArch64 server fastdebug, `all`
>
> src/hotspot/share/compiler/compileBroker.cpp line 394:
> 
>> 392:     ml.notify_all();
>> 393:   }
>> 394: 
> 
> What about other compiler threads which still in process of compiling for blocking tasks? They still need it CompileTask object.
> `delete_all()` is called by one compiler thread which finished compilation but other threads may not.
> 
> I don't see any compiler thread checks `shut_down` state to stop compilation.

AFAIU, that's the point of the existing protocol to force _waiters_ to delete the task: the blocking waiter would wait for compiler thread to complete the task one way or the other. This PR makes that protocol even stronger: _only_ blocking waiters are allowed to delete the blocking task.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26294#discussion_r2205449314


More information about the hotspot-compiler-dev mailing list