Performance Issues with Virtual Threads + ThreadLocal Caching in Third-Party Libraries (JDK 25)
Jianbin Chen
jianbin at apache.org
Fri Jan 23 07:30:52 UTC 2026
Hi everyone,
After upgrading to JDK 25 and switching to virtual threads, I've run into a
performance issue that I think is worth discussing.
When using the non-pooled style:
```java
Executors.newThreadPerTaskExecutor(Thread.ofVirtual().name("vt-", 1).factory());
```
I observed a **significant performance degradation** in workloads that rely
on ThreadLocal-based buffer pools in third-party libraries. A concrete
example is the Aerospike Java client:
https://github.com/aerospike/aerospike-client-java/blob/master/client/src/com/aerospike/client/util/ThreadLocalData.java
Because each virtual thread gets its own copy of every ThreadLocal value, and
virtual threads are created per task (no pooling), these libraries end up
allocating a fresh multi-KB byte buffer in every new virtual thread, and that
buffer becomes garbage as soon as the task finishes. This causes:
- A massive surge in short-lived object allocations
- Dramatically increased GC pressure
- Noticeably higher CPU usage
Even though virtual threads are cheap, the GC overhead ends up eating most
of the benefits we expect from virtual threads.
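To make the allocation pattern concrete, here is a minimal sketch. The class `ThreadLocalPerTaskDemo`, the 8 KB buffer size, and the counter are hypothetical stand-ins for a library-side cache; they are not taken from the Aerospike client. The sketch only counts how often the ThreadLocal initializer runs under the unpooled executor:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalPerTaskDemo {
    // Counts how many times the initializer below has run.
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    // Hypothetical stand-in for a library's per-thread buffer cache.
    static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> {
        ALLOCATIONS.incrementAndGet();
        return new byte[8192]; // assumed buffer size, for illustration only
    });

    // Runs taskCount trivial tasks, one new virtual thread per task,
    // and returns the number of buffers that were allocated.
    static int runTasks(int taskCount) {
        ALLOCATIONS.set(0);
        try (ExecutorService executor = Executors.newThreadPerTaskExecutor(
                Thread.ofVirtual().name("vt-", 1).factory())) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    byte[] buf = BUFFER.get(); // first access in this thread allocates
                    buf[0] = 1;
                });
            }
        } // close() waits for all submitted tasks to finish
        return ALLOCATIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("buffers allocated: " + runTasks(1_000));
    }
}
```

Since every task runs in a brand-new thread, the initializer runs once per task, so `runTasks(1_000)` allocates 1,000 buffers and none of them is ever reused.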
I then switched to a **pooled virtual thread executor** like this:
```java
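new ThreadPoolExecutor(
    200,                      // corePoolSize
    Integer.MAX_VALUE,        // maximumPoolSize
    60, TimeUnit.SECONDS,     // keepAliveTime for threads beyond the core
    new SynchronousQueue<>(), // direct hand-off, tasks are never queued
    Thread.ofVirtual().name("vt-pool-", 1).factory()
);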
```
The performance improved **dramatically**:
- At least 200 virtual threads are reused → most ThreadLocal caches are hit
and reused
- GC pressure drops significantly
- CPU usage decreases accordingly
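The reuse effect can be sketched the same way. Note that this is a simplified variant, not the exact configuration above: it uses a fixed-size pool with a `LinkedBlockingQueue` so that the thread count is strictly bounded and the result is deterministic. With a `SynchronousQueue` and an unbounded maximum, extra threads can still be created whenever no idle worker is waiting. The class name, buffer size, and counter are again hypothetical:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalPooledDemo {
    // Counts how many times the initializer below has run.
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    // Hypothetical stand-in for a library's per-thread buffer cache.
    static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> {
        ALLOCATIONS.incrementAndGet();
        return new byte[8192]; // assumed buffer size, for illustration only
    });

    // Runs taskCount trivial tasks on at most poolSize reused virtual threads
    // and returns the number of buffers that were allocated.
    static int runTasks(int poolSize, int taskCount) {
        ALLOCATIONS.set(0);
        try (ExecutorService executor = new ThreadPoolExecutor(
                poolSize, poolSize,          // fixed size: core == max
                0L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), // excess tasks wait in the queue
                Thread.ofVirtual().name("vt-pool-", 1).factory())) {
            for (int i = 0; i < taskCount; i++) {
                executor.execute(() -> {
                    byte[] buf = BUFFER.get(); // allocates only on a thread's first access
                    buf[0] = 1;
                });
            }
        } // close() waits for all submitted tasks to finish
        return ALLOCATIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("buffers allocated: " + runTasks(4, 1_000));
    }
}
```

Here the number of allocations is bounded by the pool size rather than by the number of tasks: `runTasks(4, 1_000)` allocates at most 4 buffers.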
I ran benchmarks comparing:
1. Pure newThreadPerTaskExecutor (unpooled virtual threads)
2. The above ThreadPoolExecutor style (pooled virtual threads)
Even when I set `keepAliveTime = 0L` (so only the 200 core threads are kept
alive long-term and extras are immediately reclaimed), **the pooled version
still clearly outperformed the unpooled one**.
So my question is:
**In scenarios where third-party libraries heavily rely on ThreadLocal for
caching / buffering (and we cannot change those libraries to use object
pools instead), is explicitly pooling virtual threads (using a
ThreadPoolExecutor with virtual thread factory) considered a recommended /
acceptable workaround?**
Or are there better / more idiomatic ways to handle this kind of
compatibility issue with legacy ThreadLocal-based libraries when migrating
to virtual threads?
I have already opened a related discussion in the Dubbo project (since
Dubbo is one of the libraries affected in our stack):
https://github.com/apache/dubbo/issues/16042
Would love to hear your thoughts — especially from people who have
experience running large-scale virtual-thread-based services with mixed
third-party dependencies.
Thanks in advance!
Best Regards.
Jianbin Chen, github-id: funky-eyes
More information about the loom-dev mailing list