"Memory leak" caused by WorkQueue#topLevelExec
David Holmes
david.holmes at oracle.com
Sun Nov 30 21:27:13 UTC 2025
On 29/11/2025 11:48 pm, Olivier Peyrusse wrote:
> Hello community,
>
> Sorry if this is the wrong place to discuss internal classes such as the
> ForkJoinPool. If so, please, excuse me and point me in the right direction.
This is better suited to concurrency-discuss at openjdk.org.
David
> At my company, we have experienced an unfortunate memory leak because
> one of our CountedCompleter was retaining a large object and the task
> was not released to the GC (I will give more details below but will
> first focus on the FJP code causing the issue).
>
> When running tasks, the FJP ends up calling WorkQueue#topLevelExec
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1448-L1453>,
> which is implemented as follows:
>
>     final void topLevelExec(ForkJoinTask<?> task, int fifo) {
>         while (task != null) {
>             task.doExec();
>             task = nextLocalTask(fifo);
>         }
>     }
>
> We can see that it starts from a top-level task |task|, executes it,
> and looks for the next task to execute before repeating this loop. This
> means that, as long as we find a task through |nextLocalTask|, we do
> not exit this method and the caller of |topLevelExec| retains in its
> stack a reference to the first executed task - like here
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1992-L2019>.
> This acts as a path from a GC root, preventing the garbage collection
> of the task.
> So even if a CountedCompleter has completed (exec, tryComplete, etc.),
> the framework will keep the object alive.
> Could the code be changed to avoid this issue? I am willing to do the
> work, as well as come up with a test case reproducing the issue if it is
> deemed needed.
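
The reference chain described above can be modelled with a small sketch. This is not the ForkJoinPool code: RetentionSketch, Task, push and runOnce are hypothetical names, and a plain ArrayDeque stands in for the worker's local queue. It only shows that the callee's parameter is reassigned on every iteration while the caller's variable keeps pointing at the first task until the loop exits. Whether a real JVM treats such a stack slot as live also depends on interpreter versus JIT liveness analysis, so this is a model of the reachability, not a reproduction of the leak.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RetentionSketch {
    /** Hypothetical stand-in for a ForkJoinTask that captures some payload. */
    static final class Task {
        final Object payload;
        Task(Object payload) { this.payload = payload; }
        void doExec() { /* work elided */ }
    }

    private final Deque<Task> localQueue = new ArrayDeque<>();
    int executed;

    void push(Task t) { localQueue.addLast(t); }

    // Same shape as the quoted loop: 'task' is overwritten each iteration,
    // so this frame only ever references the task currently running.
    void topLevelExec(Task task) {
        while (task != null) {
            task.doExec();
            executed++;
            task = localQueue.pollLast(); // stand-in for nextLocalTask(fifo)
        }
    }

    // Caller frame: 'first' is not overwritten while topLevelExec drains the
    // queue, so the first task and its payload stay reachable from here.
    Object runOnce(Task first) {
        topLevelExec(first);
        return first.payload;
    }

    public static void main(String[] args) {
        RetentionSketch w = new RetentionSketch();
        w.push(new Task("b"));
        w.push(new Task("c"));
        Object kept = w.runOnce(new Task(new byte[1024]));
        System.out.println(w.executed + " tasks ran; first payload reachable: " + (kept != null));
    }
}
```

In this model, the longer the local queue keeps producing work, the longer runOnce holds on to the first task.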
>
> In our case, we were in the unfortunate situation where our counted
> completer was holding an element that was effectively the head of a
> dynamically growing linked queue. By retaining it, the rest of the
> growing linked queue was also retained in memory, leading to the memory
> leak.
> Obvious fixes are possible in our code, such as nulling out these
> elements when our operations complete, among other ideas. But this means
> that we have to be constantly careful about the fields we pass to the
> task, what is captured if we give lambdas, etc. If the whole
> ForkJoinPool could also be improved to avoid such problems, it would be
> an additional safety net.
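
The user-side workaround mentioned above (nulling out fields once the operation completes) might look like the following minimal sketch. ClearingCompleterSketch, SumTask, sink and holdsData are hypothetical names chosen for illustration; only CountedCompleter, tryComplete and ForkJoinPool are the real java.util.concurrent API. The task copies its large field into a local, clears the field, and only then does its work, so a lingering reference to the task object no longer pins the data.

```java
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.atomic.AtomicLong;

final class ClearingCompleterSketch {
    /** Hypothetical task: sums an array, then no longer references it. */
    static final class SumTask extends CountedCompleter<Void> {
        private long[] data;            // large object we do not want retained
        private final AtomicLong sink;  // where the result is published

        SumTask(long[] data, AtomicLong sink) {
            this.data = data;
            this.sink = sink;
        }

        @Override
        public void compute() {
            long[] d = data;
            data = null;                // drop the field before completing
            long sum = 0;
            for (long v : d) sum += v;
            sink.addAndGet(sum);
            tryComplete();              // no pending count, no completer: task completes
        }

        boolean holdsData() { return data != null; }
    }

    public static void main(String[] args) {
        AtomicLong sink = new AtomicLong();
        SumTask task = new SumTask(new long[] {1, 2, 3}, sink);
        ForkJoinPool.commonPool().invoke(task);
        System.out.println(sink.get() + " " + task.holdsData()); // prints "6 false"
    }
}
```

This only addresses fields the task declares itself; anything captured by a lambda passed into the task would need the same care.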
>
> Thank you for reading the mail
> Cheers
>
> Olivier