Performance Issues with Virtual Threads + ThreadLocal Caching in Third-Party Libraries (JDK 25)

Jianbin Chen jianbin at apache.org
Fri Jan 23 07:30:52 UTC 2026


Hi everyone,

After upgrading to JDK 25 and switching to virtual threads, I've run into a
performance issue that I think is worth discussing.

When using the non-pooled style:

```java
Executors.newThreadPerTaskExecutor(Thread.ofVirtual().name("vt-", 1).factory());
```

I observed a **significant performance degradation** in workloads that rely
on ThreadLocal-based buffer pools in third-party libraries. A concrete
example is the Aerospike Java client:

https://github.com/aerospike/aerospike-client-java/blob/master/client/src/com/aerospike/client/util/ThreadLocalData.java

Because each virtual thread gets its own copy of every ThreadLocal value,
and virtual threads are created per task (no pooling), a cached buffer never
outlives a single task. These libraries end up allocating a fresh multi-KB
byte buffer in every new virtual thread. This causes:

- A massive surge in short-lived object allocations
- Dramatically increased GC pressure
- Noticeably higher CPU usage

Even though virtual threads themselves are cheap to create, the GC overhead
ends up eating most of the benefit we expect from them.
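
To make the allocation pattern concrete, here is a minimal sketch. It is not
the Aerospike code: `BUFFER`, the 8 KB size, and the class name are
hypothetical stand-ins for a library's per-thread buffer cache. It counts how
often the ThreadLocal initializer runs when every task gets a fresh virtual
thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalAllocDemo {
    // Counts how many times the ThreadLocal initializer runs,
    // i.e. how many buffers are actually allocated.
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    // Hypothetical stand-in for a library's per-thread buffer cache.
    // The 8 KB size is an arbitrary assumption.
    static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> {
        ALLOCATIONS.incrementAndGet();
        return new byte[8192];
    });

    // Runs `tasks` tasks, each on its own new virtual thread,
    // and returns the number of buffers allocated.
    static int countAllocationsPerTask(int tasks) {
        ALLOCATIONS.set(0);
        try (ExecutorService executor =
                 Executors.newThreadPerTaskExecutor(Thread.ofVirtual().factory())) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    BUFFER.get()[0] = 1; // first access on this thread triggers the initializer
                });
            }
        } // close() waits for all submitted tasks to finish
        return ALLOCATIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("buffers allocated: " + countAllocationsPerTask(1_000));
    }
}
```

Since no thread ever runs a second task, the count equals the number of
tasks: the "cache" is populated once per task and then discarded.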

However, when I switched to a **pooled virtual thread executor** like this:

```java
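new ThreadPoolExecutor(
    200,                      // corePoolSize: threads kept alive even when idle
    Integer.MAX_VALUE,        // maximumPoolSize: effectively unbounded
    60, TimeUnit.SECONDS,     // keepAliveTime for idle threads beyond the core
    new SynchronousQueue<>(), // direct hand-off, tasks are never queued
    Thread.ofVirtual().name("vt-pool-", 1).factory()
);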
```

The performance improved **dramatically**:

- The 200 core virtual threads (plus any extras still inside the 60-second
keep-alive window) are reused → most ThreadLocal caches are hit instead of
being re-created
- GC pressure drops significantly
- CPU usage decreases accordingly
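
The reuse effect can be sketched the same way. This is an illustration under
assumptions, not a benchmark: `BUFFER`, the 8 KB size, the 1 ms sleep standing
in for blocking I/O, and the class and method names are all hypothetical. The
executor is configured like the one above, with a parameterized core size.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PooledVirtualThreadDemo {
    static final AtomicInteger ALLOCATIONS = new AtomicInteger();

    // Hypothetical stand-in for a library's per-thread buffer cache.
    static final ThreadLocal<byte[]> BUFFER = ThreadLocal.withInitial(() -> {
        ALLOCATIONS.incrementAndGet();
        return new byte[8192];
    });

    // Same shape as the pooled executor in the post, with a configurable core size.
    static ExecutorService pooledVirtualExecutor(int coreThreads) {
        return new ThreadPoolExecutor(
            coreThreads, Integer.MAX_VALUE,
            60, TimeUnit.SECONDS,
            new SynchronousQueue<>(),
            Thread.ofVirtual().name("vt-pool-", 1).factory());
    }

    // Returns how many buffers were allocated while running `tasks` tasks.
    static int countAllocations(int coreThreads, int tasks) {
        ALLOCATIONS.set(0);
        try (ExecutorService executor = pooledVirtualExecutor(coreThreads)) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    BUFFER.get()[0] = 1; // allocates only on a thread's first task
                    try {
                        Thread.sleep(1); // assumed stand-in for blocking I/O
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        } // close() shuts the pool down and waits for running tasks
        return ALLOCATIONS.get();
    }

    public static void main(String[] args) {
        System.out.println("buffers allocated: " + countAllocations(200, 10_000));
    }
}
```

The count is at least the core size (once that many tasks have been submitted)
and at most the task count. Where it lands depends on timing: with a
`SynchronousQueue`, a task is handed to an existing worker only if one is idle
at that moment, otherwise a new thread (and a new buffer) is created. A tight
submission burst like the loop above therefore shows less reuse than a steady
request stream would.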

I ran benchmarks comparing:

1. Pure newThreadPerTaskExecutor (unpooled virtual threads)
2. The above ThreadPoolExecutor style (pooled virtual threads)

Even when I set `keepAliveTime = 0L` (so only the 200 core threads are kept
alive long-term and extras are immediately reclaimed), **the pooled version
still clearly outperformed the unpooled one**.

So my question is:

**In scenarios where third-party libraries heavily rely on ThreadLocal for
caching / buffering (and we cannot change those libraries to use object
pools instead), is explicitly pooling virtual threads (using a
ThreadPoolExecutor with virtual thread factory) considered a recommended /
acceptable workaround?**

Or are there better / more idiomatic ways to handle this kind of
compatibility issue with legacy ThreadLocal-based libraries when migrating
to virtual threads?
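
For reference, the object-pool approach mentioned in the question (the one
that would require changing the library itself) could look roughly like the
sketch below. `BufferPool` and its methods are hypothetical names, not part of
any of the libraries discussed here; the point is only that buffers are shared
across threads rather than tied to one thread's lifetime.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bounded pool of fixed-size byte buffers, independent of thread identity.
public class BufferPool {
    private final BlockingQueue<byte[]> free;
    private final int bufferSize;

    public BufferPool(int capacity, int bufferSize) {
        this.free = new ArrayBlockingQueue<>(capacity);
        this.bufferSize = bufferSize;
    }

    // Returns a pooled buffer if one is available, otherwise allocates a new one.
    public byte[] acquire() {
        byte[] buf = free.poll();
        return buf != null ? buf : new byte[bufferSize];
    }

    // Returns a buffer to the pool; it is silently dropped if the pool is full
    // or the buffer has the wrong size.
    public void release(byte[] buf) {
        if (buf.length == bufferSize) {
            free.offer(buf);
        }
    }
}
```

A caller would wrap each use in `acquire()` / `release()` (typically with
`try`/`finally`), so the number of live buffers tracks concurrency rather than
the number of threads ever created.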

I have already opened a related discussion in the Dubbo project (since
Dubbo is one of the libraries affected in our stack):

https://github.com/apache/dubbo/issues/16042

I would love to hear your thoughts, especially from people who have
experience running large-scale virtual-thread-based services with mixed
third-party dependencies.

Thanks in advance!


Best Regards.
Jianbin Chen, github-id: funky-eyes

