"Memory leak" caused by WorkQueue#topLevelExec
Robert Engels
robaho at me.com
Sat Nov 29 22:10:53 UTC 2025
Hi. I am not really sure what you expect to happen here. When you fork/join, the system needs to maintain a reference to the top-level task in order to determine when it has completed and thus when the executor is available for the next task. You can see that in the ForkJoin JDK code you linked.
So, the easiest solution is to make the task payload clearable - you can’t clear the task but you can clear the heavy payload - allow it to be GC’d.
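A minimal sketch of the clearable-payload idea, assuming a hypothetical CountedCompleter subclass named SumTask whose heavy payload is a long[] (the class, field, and method names are illustrative and not taken from your code). The task copies the field into a local variable and nulls the field before doing its work, so a task object that the pool still references no longer pins the array once compute() returns:

```java
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicLong;

final class ClearablePayloadSketch {

    /** Hypothetical task that references its heavy payload only while it runs. */
    static final class SumTask extends CountedCompleter<Void> {
        private long[] payload;            // heavy reference, cleared in compute()
        private final AtomicLong result;   // small, cheap to keep alive

        SumTask(long[] payload, AtomicLong result) {
            super(null);                   // no parent completer
            this.payload = payload;
            this.result = result;
        }

        @Override
        public void compute() {
            long[] data = payload;
            payload = null;                // the task object no longer pins the array
            long sum = 0;
            for (long v : data) {
                sum += v;
            }
            result.addAndGet(sum);
            tryComplete();                 // pending count is 0, so this completes the task
        }

        boolean payloadCleared() {
            return payload == null;
        }
    }

    public static void main(String[] args) {
        AtomicLong result = new AtomicLong();
        SumTask task = new SumTask(new long[] {1, 2, 3}, result);
        ForkJoinPool.commonPool().invoke(task);
        System.out.println("sum=" + result.get() + " cleared=" + task.payloadCleared());
    }
}
```

The task itself can still be retained by the worker for a while, but what it keeps alive is then only a few small fields.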
Or, make the running of the subtasks async without a reference to the heavy payload so the top-task can return and be garbage collected.
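One possible reading of this, as a rough sketch only: the top-level task keeps a light Supplier instead of the data, and only the forked subtasks reference the heavy array. Parent, Chunk, and the Supplier-based constructor are hypothetical names for illustration. It assumes the Supplier does not itself capture the heavy object:

```java
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class LightweightParentSketch {

    /** Hypothetical subtask: the only task type that references the heavy data. */
    static final class Chunk extends CountedCompleter<Void> {
        private long[] data;               // cleared as soon as the chunk has run
        private final int from, to;
        private final AtomicLong total;

        Chunk(CountedCompleter<?> parent, long[] data, int from, int to, AtomicLong total) {
            super(parent);
            this.data = data;
            this.from = from;
            this.to = to;
            this.total = total;
        }

        @Override
        public void compute() {
            long[] d = data;
            data = null;
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += d[i];
            }
            total.addAndGet(sum);
            tryComplete();                 // propagates completion to the parent
        }
    }

    /** Hypothetical top-level task: has no field that references the heavy data. */
    static final class Parent extends CountedCompleter<Void> {
        private final Supplier<long[]> source;   // assumed not to capture the array
        private final AtomicLong total;

        Parent(Supplier<long[]> source, AtomicLong total) {
            super(null);
            this.source = source;
            this.total = total;
        }

        @Override
        public void compute() {
            long[] data = source.get();    // local variable only, gone when compute() returns
            int mid = data.length >>> 1;
            addToPendingCount(2);          // two subtasks must finish before this task completes
            new Chunk(this, data, 0, mid, total).fork();
            new Chunk(this, data, mid, data.length, total).fork();
            tryComplete();
        }
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        ForkJoinPool.commonPool().invoke(new Parent(() -> new long[] {1, 2, 3, 4, 5}, total));
        System.out.println("total=" + total.get());
    }
}
```

Even if the worker keeps the Parent reachable while it drains its local queue, the Parent then holds only the Supplier and the counter.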
It sounds like a misuse of FJ, as if you are running your own FJ infra inside of ForkJoin. If so, you might as well take ForkJoin out and write a custom executor that doesn’t hold the task reference.
> On Nov 29, 2025, at 7:48 AM, Olivier Peyrusse <kineolyan at protonmail.com> wrote:
>
> Hello community,
>
> Sorry if this is the wrong place to discuss internal classes such as the ForkJoinPool. If so, please, excuse me and point me in the right direction.
>
> At my company, we have experienced an unfortunate memory leak because one of our CountedCompleter was retaining a large object and the task was not released to the GC (I will give more details below but will first focus on the FJP code causing the issue).
>
> When running tasks, the FJP ends up calling WorkQueue#topLevelExec <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1448-L1453>, which is implemented as follows:
>
> final void topLevelExec(ForkJoinTask<?> task, int fifo) {
>     while (task != null) {
>         task.doExec();
>         task = nextLocalTask(fifo);
>     }
> }
>
> We can see that it starts from a top-level task (task), executes it, and looks for the next task to execute before repeating this loop. This means that, as long as we find a task through nextLocalTask, we do not exit this method, and the caller of topLevelExec retains on its stack a reference to the first executed task - like here <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1992-L2019>. This acts as a path from a GC root, preventing the garbage collection of the task.
> So even if a CountedCompleter did complete its exec / tryComplete / etc, the framework will keep the object alive.
> Could the code be changed to avoid this issue? I am willing to do the work, as well as come up with a test case reproducing the issue if it is deemed needed.
>
> In our case, we were in the unfortunate situation where our counted completer was holding an element that was effectively the head of a dynamically growing linked queue. By retaining it, the rest of the growing linked queue was also retained in memory, leading to the memory leak.
> Obvious fixes are possible in our code, such as ensuring that we nullify such elements when our operations complete, among other ideas. But this means that we have to be constantly careful about the fields we pass to the task, what gets captured when we pass lambdas, etc. If the whole ForkJoinPool could also be improved to avoid such problems, it would be an additional safety net.
>
> Thank you for reading the mail
> Cheers
>
> Olivier