[External] : Re: Virtual thread memory leak because TrackingRootContainer keeps threads

Thu Jul 25 13:05:52 UTC 2024

This is what I’ve been trying to state, albeit obviously not very well.

A VT must be a de facto “root”; otherwise it could technically disappear as soon as it is created, if the creator doesn’t keep a reference to it. That seems nonsensical to me.

I believe section 12.6.1 of the language spec covers this, since it states that an object is reachable if it is reachable from any live thread. Thread “liveness” is not specified in terms of reachability. Since the spec doesn’t say anything about Thread methods, a Thread must be “live” independent of its reachability (observability from Thread.getAllStackTraces()); otherwise, as soon as a Thread is created it would be unreachable, and any objects it created would be unreachable as well.

Since a virtual thread inherits from Thread, it must be a Thread, and thus should be governed by the same rules.
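
To make the scenario concrete, here is a minimal sketch of a “fire and forget” virtual thread whose creator never keeps the Thread reference. The class and method names (FireAndForgetDemo, fireAndForget, runAndAwait) are made up for illustration, and it assumes JDK 21 or later. It only shows what can be observed from the outside, namely that the task runs to completion. It does not settle what the spec guarantees, and note that a sleeping thread that holds a latch the caller is waiting on is an easy case compared with a thread blocked on something nobody else references.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public final class FireAndForgetDemo {

    /**
     * Starts a named virtual thread and deliberately discards the Thread
     * reference. Only the latch (the task's state) is handed back.
     */
    static CountDownLatch fireAndForget(String name) {
        CountDownLatch done = new CountDownLatch(1);
        Thread.ofVirtual().name(name).start(() -> {
            try {
                Thread.sleep(50); // stand-in for async cleanup work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();
            }
        });
        return done; // the Thread object itself is not retained by the caller
    }

    /** Returns true if the forgotten thread completed within five seconds. */
    static boolean runAndAwait() {
        CountDownLatch done = fireAndForget("cleanup-task");
        System.gc(); // only a hint; the JVM is free to ignore it
        try {
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("task finished: " + runAndAwait());
    }
}
```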

> On Jul 25, 2024, at 7:22 AM, Matthew Swift <matthew.swift at gmail.com> wrote:
> 
> 
> Hmm, I'm starting to think I may have fallen into the same trap here as Michal.
> 
> I've been using virtual threads similar to platform threads for performing IO tasks asynchronously.
> 
> Background: originally, I was using Executors#newVirtualThreadPerTaskExecutor to run these tasks, which was fine because the Executor tracks the virtual threads it creates in order to support shutdown. However, from an observability point of view I've found the executor to be a bit frustrating because it is impossible to set the thread name before the thread is scheduled to run[1]. This means that under heavy load, where the FJ pool is busy, a thread dump shows many unnamed threads that are waiting to be scheduled for the first time. Admittedly, this is only a minor annoyance because I'm more interested in the threads that are clogging up the FJ pool than the ones that are waiting to use it. Even so, it'd be nice to have an overall picture of what's active and what's queued (note also that thread names are included in JFR events, which is super helpful).
> 
> To remedy this, I've switched away from using an Executor and now I just use "Thread.ofVirtual().name(initialName).start(task)". However, I don't think all of the tasks are strongly reachable: some are "fire and forget" tasks (e.g. async resource cleanup), so I may be inadvertently relying on the JVM's observability support to keep these tasks alive until they complete, which seems a bit brittle. In fact, now that I think of it, it may not even be limited to fire and forget tasks. A VT that is reading messages from a Socket could be GC'd, IIUC, since I'm not maintaining references to the reader VT itself:
> 
> * application creates/accepts Socket
> * application creates reader VT referencing the socket and starts the thread but doesn't keep a reference to the thread
> * reader VT terminates when either the client closes the socket (EOS) or the application closes the socket, triggering an exception in the reader VT.
> 
> In other words, there are strong references to the task and its state (e.g. Socket), but there is no strong reference to the task itself AFAICS, nor do I need one really, since the thread lifecycle is managed indirectly. I fear I may have a similar problem for VTs that are executing client requests - I maintain strong references to the task's state (e.g. request), but not the VT itself.
> 
> Am I misunderstanding?
> 
> Thanks :-)
> 
> [1] VTs make thread naming very lightweight, which is great for debugging. For example, I've included connection information, a summary of request parameters and even updated the thread name to include internal routing information, DB index names, records and keys (making it easy to identify contended DB keys).
> 
> On Tue, 23 Jul 2024 at 16:57, Ron Pressler <ron.pressler at oracle.com> wrote:
> 
> 
> > On 22 Jul 2024, at 22:51, Michal Domagala <outsider404 at gmail.com> wrote:
> > 
> > Thanks for the explanation.
> > 
> > I understand that paragraph
> > 
> > "Unlike platform thread stacks, virtual thread stacks are not GC roots. Thus the references they contain are not traversed in a stop-the-world pause by garbage collectors, such as G1, that perform concurrent heap scanning"
> > 
> > can be rewritten as
> > 
> > "Some GC, such as G1, marks GC roots in stop-the-world pause. Unlike platform thread stacks, virtual thread stacks are not GC roots, therefore they do not impact stop-the-world pause."
> > 
> > In my opinion, the current paragraph in JEP 444 requires readers to have a deep GC background. Usually, developers are not aware of GC root cost (at least I was not aware). Developers could tune the number of GC roots by changing the number of platform threads. Others, like static variables, are rather not tunable.
> > But usually, the OS limits the number of platform threads much more strictly than GC performance.
> > 
> > To sum up, JEP 444's message is: "Do not be afraid of the G1 initial mark phase when using virtual threads". But I think most developers, like me, have never heard about it. Others, more advanced, may also never care about it, because the Oracle docs say about the initial mark phase: "This phase is piggybacked on a normal (STW) young garbage collection." I understand this sentence to mean that the phase is "for free".
> > 
> > To sum up again: when a developer like me reads that a VT is not a GC root, he does not see the G1 benefit behind it. He reads: a VT is GC'able. And the current state, where the behavior is different, is misleading.
> 
> A virtual thread *is* GCable, just like a String, when it is not strongly referenced. However, by default virtual threads will have a strong reference for observability, but you can turn that off.
> 
> But, yes, the very notion of GC roots requires an advanced understanding of how the Java platform’s GCs work. I *think* that the very notion of GC roots is not part of the spec, but an implementation detail. No inference can be made, for any object, from whether or not it is a root, to when it will be collected. In fact, the platform’s GCs make no guarantees as to when objects are collected, regardless of whether they’re roots or not. The only guarantee is that an object will not be collected if it is strongly reachable.
> 
> — Ron
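
For anyone who would rather not depend on the runtime's observability tracking (JEP 444 describes turning it off with -Djdk.trackAllThreads=false), one option is to keep the strong references yourself while still naming the thread before it starts. The following is only a sketch under that assumption: TrackedVirtualThreads, startTracked, liveCount and awaitAll are hypothetical names, not JDK APIs, and it needs JDK 21 or later.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public final class TrackedVirtualThreads {

    // Strong references to every started-but-unfinished virtual thread.
    private static final Set<Thread> LIVE = ConcurrentHashMap.newKeySet();

    /** Starts a named virtual thread that stays strongly reachable until its task returns. */
    public static Thread startTracked(String name, Runnable task) {
        Thread thread = Thread.ofVirtual().name(name).unstarted(() -> {
            try {
                task.run();
            } finally {
                // Deregister on the way out, whether the task returned or threw.
                LIVE.remove(Thread.currentThread());
            }
        });
        LIVE.add(thread); // register before start so the task cannot finish first
        thread.start();
        return thread;
    }

    /** Number of tracked threads that have not yet finished. */
    public static int liveCount() {
        return LIVE.size();
    }

    /** Waits up to timeoutMillis for tracked threads; returns true if none remain. */
    public static boolean awaitAll(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            for (Thread t : LIVE) {
                long remaining = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remaining <= 0) {
                    break; // join(0) would wait forever, so stop here
                }
                t.join(remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return LIVE.isEmpty();
    }

    public static void main(String[] args) {
        startTracked("cleanup-1", () -> System.out.println(Thread.currentThread().getName()));
        System.out.println("all done: " + awaitAll(5_000));
    }
}
```

This is roughly what the per-task executor already does to support shutdown, except that the name is set before the thread is first scheduled. The cost is one concurrent set insert and removal per thread.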