"Memory leak" caused by WorkQueue#topLevelExec

Sat Dec 6 15:38:55 UTC 2025

On 12/6/25 09:30, Olivier Peyrusse wrote:
> Hello community,
>
> Sorry if this is the wrong place to discuss internal classes such as 
> the ForkJoinPool. If so, please, excuse me and point me in the right 
> direction.
>
> At my company, we have experienced an unfortunate memory leak because 
> one of our CountedCompleter was retaining a large object and the task 
> was not released to the GC (I will give more details below but will 
> first focus on the FJP code causing the issue).
>
> When running tasks, the FJP ends up calling WorkQueue#topLevelExec 
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1448-L1453>, 
> which is implemented as follows:
>
>       final void topLevelExec(ForkJoinTask<?> task, int fifo) {
>           while (task != null) {
>               task.doExec();
>               task = nextLocalTask(fifo);
>           }
>       }
>
> We can see that it starts from a top-level task |task|, executes it, 
> and looks for the next task to execute before repeating this loop. 
> This means that, as long as we find a task through 
> |nextLocalTask|, we do not exit this method and the caller of 
> |topLevelExec| retains in its stack a reference to the first executed 
> task - like here 
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1992-L2019>. 
> This acts as a path from the GC root, preventing the garbage 
> collection of the task.

The issue is not in that code, but in the calling sequence: a ref is 
retained mainly for the sake of a stack trace. The only way to (only 
sometimes) avoid this would be to manually inline the method, which 
leads to different compilation/execution issues and, in turn, to other 
tradeoffs impacting other usages. But it's worth considering. Thanks for 
the report.

-Doug
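
A rough sketch of the trade-off described above, using a plain
Deque<Runnable> as a stand-in for a worker's local queue. This is not the
ForkJoinPool source; DrainSketch, drain, runWithHelper and runInlined are
hypothetical names made up for illustration. Whether the caller's local is
actually treated as live by the GC depends on the JVM and on whether the
code is interpreted or compiled, which is presumably why inlining would
help "only sometimes".

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a worker's run loop; NOT the ForkJoinPool source.
final class DrainSketch {

    // Helper with the same shape as the quoted topLevelExec: the callee
    // keeps looping until the local queue is empty.
    static int drain(Runnable task, Deque<Runnable> queue) {
        int executed = 0;
        while (task != null) {
            task.run();
            executed++;
            task = queue.poll(); // only the callee's own variable is overwritten
        }
        return executed;
    }

    // Caller of the helper: `first` stays in this frame for the whole
    // duration of drain(), so the first task may remain reachable.
    static int runWithHelper(Deque<Runnable> queue) {
        Runnable first = queue.poll();
        return drain(first, queue);
    }

    // Manually inlined variant: a single local is overwritten on every
    // iteration, so this method holds no separate reference to the first task.
    static int runInlined(Deque<Runnable> queue) {
        int executed = 0;
        for (Runnable task = queue.poll(); task != null; task = queue.poll()) {
            task.run();
            executed++;
        }
        return executed;
    }
}
```

Both variants execute the same tasks in the same order; they differ only
in which stack slots still refer to the first task while later tasks run.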

> So even if a CountedCompleter did complete its exec / tryComplete / 
> etc, the framework will keep the object alive.
> Could the code be changed to avoid this issue? I am willing to do the 
> work, as well as come up with a test case reproducing the issue if it 
> is deemed needed.
>
> In our case, we were in the unfortunate situation where our counted 
> completer was holding an element that was effectively the head of a 
> dynamically growing linked queue. By retaining it, the rest of the 
> growing queue was also retained in memory, leading to the 
> memory leak.
> Obvious fixes are possible in our code, such as ensuring that we null 
> out such elements when our operations complete. But this 
> means that we have to be constantly careful about the fields we pass 
> to the task, what is captured if we give lambdas, etc. If the whole 
> ForkJoinPool could also be improved to avoid such problems, it would 
> be an additional safety.
>
> Thank you for reading the mail
> Cheers
>
> Olivier
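
For reference, a minimal sketch of the user-side workaround mentioned in
the original message (clearing fields once the operation completes). The
class ReleasingCompleter, its long[] payload and the sum it computes are
hypothetical, chosen only to keep the example small; the long[] stands in
for the retained queue head. Only documented CountedCompleter and
ForkJoinPool methods are used.

```java
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;

// Hypothetical task: drops its reference to the payload before completing,
// so a lingering reference to the task itself no longer pins the payload.
final class ReleasingCompleter extends CountedCompleter<Void> {
    private long[] payload; // stand-in for the large retained object
    private long sum;

    ReleasingCompleter(long[] payload) {
        super(null); // root task, no parent completer
        this.payload = payload;
    }

    @Override
    public void compute() {
        long s = 0;
        for (long v : payload) {
            s += v;
        }
        sum = s;
        payload = null; // release the payload once the work is done
        tryComplete();
    }

    long sum() { return sum; }

    boolean holdsPayload() { return payload != null; }

    // Runs the task in the common pool and returns it for inspection.
    static ReleasingCompleter runOn(long[] data) {
        ReleasingCompleter task = new ReleasingCompleter(data);
        ForkJoinPool.commonPool().invoke(task);
        return task;
    }
}
```

After ReleasingCompleter.runOn(data) returns, the task object may still be
referenced from a worker's stack for a while, but it no longer keeps the
payload reachable. The cost is the one already noted in the message: every
task has to be written with this in mind.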