Concurrency issue with virtual threads
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Sun Jul 28 12:59:48 UTC 2024
Hello,
I've run into a concurrency issue that manifests as the application freezing due to pinned virtual threads.
I've described the reproduction steps in detail here: https://stackoverflow.com/questions/78790376/spring-boot-application-gets-stuck-when-virtual-threads-are-used-on-java-21
The problem shows up when you run a Spring Boot application with virtual threads enabled. Under the hood, my demo application has a Feign client that uses a connection pool with up to 20 connections (the threshold cannot be increased due to a configuration bug). As soon as you make more than 20 simultaneous requests (even 21), the pool is exhausted, so subsequent requests have to wait for a connection to be released, and the application gets stuck. It does not get stuck when platform threads are used, i.e. when the configuration property spring.threads.virtual.enabled is false.
Running the code with -Djdk.tracePinnedThreads=full, I've narrowed the cause down more precisely: it is located within AbstractConnPool.getPoolEntryBlocking(). Here's the link to the pinned threads stack trace: https://github.com/stsypanov/concurrency-demo/blob/master/pinned-threads.txt
In that file, pay attention to these lines:
12 org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:319)
92 org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:391)
Now let's examine the source code of o.a.h.p.AbstractConnPool.getPoolEntryBlocking over here: https://github.com/apache/httpcomponents-core/blob/4.4.x/httpcore/src/main/java/org/apache/http/pool/AbstractConnPool.java
In this class we have a ReentrantLock (encouraged to be used with virtual threads instead of synchronized blocks) and its Condition:
private final Lock lock;
private final Condition condition;

public AbstractConnPool() {
    this.lock = new ReentrantLock();
    this.condition = this.lock.newCondition();
}
Later, in the method AbstractConnPool.getPoolEntryBlocking(), we have this logic:
private E getPoolEntryBlocking() {
    this.lock.lock(); // line 319
    try {
        for (;;) {
            try {
                if (deadline != null) {
                    success = this.condition.awaitUntil(deadline);
                } else {
                    this.condition.await(); // line 391
                    success = true;
                }
            }
        }
    } finally {
        this.lock.unlock();
    }
}
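To make the pattern easier to experiment with outside of Spring and Feign, here is a minimal sketch of a bounded pool guarded by a ReentrantLock and a Condition. It is only an illustration of the lease/await/release shape quoted above, not the httpcore implementation: the class BoundedPoolSketch and its methods lease, release and runDemo are hypothetical names, and the pool entries are plain integers rather than connections. It uses platform threads, so it shows the behaviour I expect from the pool; I am not claiming it reproduces the hang by itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical, simplified stand-in for a connection pool (not AbstractConnPool).
class BoundedPoolSketch {
    private final Lock lock = new ReentrantLock();
    private final Condition condition = lock.newCondition();
    private final Deque<Integer> available = new ArrayDeque<>();

    BoundedPoolSketch(int capacity) {
        for (int i = 0; i < capacity; i++) {
            available.add(i);
        }
    }

    // Blocks until an entry is free, similar to the await() branch quoted above.
    int lease() throws InterruptedException {
        lock.lock();
        try {
            while (available.isEmpty()) {
                condition.await();
            }
            return available.pop();
        } finally {
            lock.unlock();
        }
    }

    // Returns an entry to the pool and wakes up the waiting threads.
    void release(int entry) {
        lock.lock();
        try {
            available.push(entry);
            condition.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Runs `tasks` concurrent lease/release cycles and returns how many completed.
    static int runDemo(int capacity, int tasks) {
        BoundedPoolSketch pool = new BoundedPoolSketch(capacity);
        AtomicInteger completed = new AtomicInteger();
        // Platform threads here; on Java 21+ one could swap in
        // Executors.newVirtualThreadPerTaskExecutor() to compare.
        ExecutorService executor = Executors.newFixedThreadPool(tasks);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    try {
                        int entry = pool.lease();
                        try {
                            Thread.sleep(50); // simulate a request holding the entry
                        } finally {
                            pool.release(entry);
                        }
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            executor.shutdown();
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                executor.shutdownNow(); // give up on tasks that are still blocked
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            executor.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // 21 tasks against 20 entries, mirroring the numbers in the report.
        System.out.println("completed: " + runDemo(20, 21));
    }
}
```

With 20 entries and 21 tasks, the extra task waits in lease() until one of the others calls release(), which is the behaviour I would expect from the real pool as well.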
This code works with platform threads but gets stuck with virtual ones. If you take a thread dump of the stuck application, there will be 20 workers in the ForkJoinPool, and each will have the same stack trace (with different ids, of course):
"ForkJoinPool-1-worker-1" prio=0 tid=0x0 nid=0x0 waiting on condition
     java.lang.Thread.State: WAITING
 on java.lang.VirtualThread@121c8328 owned by "tomcat-handler-123" Id=214
    at java.base@22.0.2/jdk.internal.vm.Continuation.run(Continuation.java:248)
    at java.base@22.0.2/java.lang.VirtualThread.runContinuation(VirtualThread.java:245)
    at java.base@22.0.2/java.lang.VirtualThread$$Lambda/0x000001579b475d08.run(Unknown Source)
    at java.base@22.0.2/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1726)
    at java.base@22.0.2/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(ForkJoinTask.java:1717)
    at java.base@22.0.2/java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(ForkJoinTask.java:1641)
    at java.base@22.0.2/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:507)
    at java.base@22.0.2/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1489)
    at java.base@22.0.2/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:2071)
    at java.base@22.0.2/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2033)
    at java.base@22.0.2/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
As the stack trace above shows (note the 22.0.2 frames), the issue is still present on Java 22. It is also reproducible with other JDK distributions (e.g. Liberica JDK).
I think this is a bug somewhere in the JVM, since the stack trace ends in the native Continuation.enterSpecial(); otherwise the behavior would be the same for platform and virtual threads.
Regards,
Sergey Tsypanov