"Memory leak" caused by WorkQueue#topLevelExec
Doug Lea
dl at cs.oswego.edu
Sat Dec 6 15:38:55 UTC 2025
On 12/6/25 09:30, Olivier Peyrusse wrote:
> Hello community,
>
> Sorry if this is the wrong place to discuss internal classes such as
> the ForkJoinPool. If so, please, excuse me and point me in the right
> direction.
>
> At my company, we have experienced an unfortunate memory leak because
> one of our CountedCompleters was retaining a large object and the task
> was not released to the GC (I will give more details below but will
> first focus on the FJP code causing the issue).
>
> When running tasks, the FJP ends up calling WorkQueue#topLevelExec
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1448-L1453>,
> which is implemented as follows:
>
> final void topLevelExec(ForkJoinTask<?> task, int fifo) {
>     while (task != null) {
>         task.doExec();
>         task = nextLocalTask(fifo);
>     }
> }
>
> We can see that it starts from a top-level task |task|, executes it,
> and looks for the next task to execute before repeating this loop.
> This means that, as long as we find a task through
> |nextLocalTask|, we do not exit this method and the caller of
> |topLevelExec| retains in its stack a reference to the first executed
> task - like here
> <https://github.com/openjdk/jdk/blob/c419dda4e99c3b72fbee95b93159db2e23b994b6/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L1992-L2019>.
> This acts as a path from the GC root, preventing the garbage
> collection of the task.
The issue is not in that code but in the calling sequence: a ref is
retained mainly for the sake of a stack trace. The only way to avoid
this (and only sometimes) would be to manually inline the method, which
leads to different compilation/execution issues and, in turn, to other
tradeoffs impacting other usages. But it's worth considering. Thanks for
the report.
-Doug
> So even if a CountedCompleter did complete its exec / tryComplete /
> etc, the framework will keep the object alive.
> Could the code be changed to avoid this issue? I am willing to do the
> work, as well as come up with a test case reproducing the issue if it
> is deemed needed.
>
> In our case, we were in the unfortunate situation where our counted
> completer was holding an element that happened to be the head of a
> dynamically growing linked queue. By retaining it, the rest of the
> growing linked queue was also retained in memory, leading to the
> memory leak.
> Obvious fixes are possible in our code, by ensuring that we nullify
> such elements when our operations complete, and more ideas. But this
> means that we have to be constantly careful about the fields we pass
> to the task, what is captured if we give lambdas, etc. If the whole
> ForkJoinPool could also be improved to avoid such problems, it would
> be an additional safety.
>
> Thank you for reading the mail
> Cheers
>
> Olivier
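
For readers of the archive, here is a minimal sketch of the user-side
workaround mentioned above (clearing fields once the task's work is
done). The class name DrainTask, its head field, and the placeholder
work are hypothetical and not taken from the original report; the
sketch only assumes the public CountedCompleter and ForkJoinPool APIs.

```java
import java.util.concurrent.CountedCompleter;
import java.util.concurrent.ForkJoinPool;

// Hypothetical task, for illustration only (not from the original report).
final class DrainTask extends CountedCompleter<Void> {
    // Stand-in for the large object, e.g. the head of a growing linked queue.
    private Object head;

    DrainTask(Object head) {
        super(null); // no parent completer: this is a root task
        this.head = head;
    }

    @Override
    public void compute() {
        try {
            // Placeholder for the real work that reads from 'head'.
            if (head != null) {
                head.hashCode();
            }
        } finally {
            // Drop the reference even if the work throws, so a lingering
            // reference to this task no longer keeps 'head' reachable.
            head = null;
        }
        // Only reached on normal completion; an exception propagates instead.
        tryComplete();
    }

    // Visible for the usage example below.
    boolean holdsHead() {
        return head != null;
    }
}

final class DrainTaskDemo {
    public static void main(String[] args) {
        DrainTask task = new DrainTask(new byte[1 << 20]);
        ForkJoinPool.commonPool().invoke(task);
        System.out.println("still holds head: " + task.holdsHead());
    }
}
```

This does not make the task object itself collectable while a worker's
stack still refers to it; it only limits what the retained task keeps
alive.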