Re: Re: Virtual thread hang and all threads stop running on JDK21

何品(虎鸣) hepin.p at alibaba-inc.com
Wed Jun 5 10:18:19 UTC 2024


Yes, even jcmd hangs. We currently want to switch back to the ForkJoinPool scheduler and raise
jdk.virtualThreadScheduler.maxPoolSize and jdk.virtualThreadScheduler.parallelism above our total maximum concurrency.
I have several systems using virtual threads, but only this one has the problem.
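For reference, both properties are passed as JVM flags, for example `java -Djdk.virtualThreadScheduler.parallelism=4096 -Djdk.virtualThreadScheduler.maxPoolSize=4096 -jar app.jar` (the value 4096 and `app.jar` are placeholders, not our real settings). The sketch below shows how I understand the JDK 21 default scheduler derives its sizes from the two properties. The class `SchedulerSizing` and its `resolve` method are hypothetical helpers, and the defaulting rules are an assumption that should be checked against the `java.lang.Thread` documentation.

```java
// Hypothetical helper: mirrors how the default virtual thread scheduler is
// assumed to derive its sizes from the two system properties on JDK 21.
final class SchedulerSizing {
    record Sizes(int parallelism, int maxPoolSize) {}

    static Sizes resolve(String parallelismProp, String maxPoolSizeProp, int availableProcessors) {
        // Assumption: parallelism defaults to the number of available processors.
        int parallelism = (parallelismProp != null)
                ? Integer.parseInt(parallelismProp)
                : availableProcessors;
        int maxPoolSize;
        if (maxPoolSizeProp != null) {
            maxPoolSize = Integer.parseInt(maxPoolSizeProp);
            // Assumption: parallelism is capped by an explicit maxPoolSize.
            parallelism = Math.min(parallelism, maxPoolSize);
        } else {
            // Assumption: maxPoolSize defaults to at least 256.
            maxPoolSize = Math.max(parallelism, 256);
        }
        return new Sizes(parallelism, maxPoolSize);
    }

    public static void main(String[] args) {
        int maxConcurrency = 3000; // semaphore limit mentioned for module A
        Sizes sizes = resolve(
                System.getProperty("jdk.virtualThreadScheduler.parallelism"),
                System.getProperty("jdk.virtualThreadScheduler.maxPoolSize"),
                Runtime.getRuntime().availableProcessors());
        System.out.println(sizes + ", parallelism above max concurrency: "
                + (sizes.parallelism() > maxConcurrency));
    }
}
```

If these rules hold, raising only parallelism while leaving an explicit, smaller maxPoolSize in place has no effect, so both values need to be raised together.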
------------------------------------------------------------------
From: Alan Bateman <Alan.Bateman at oracle.com>
Sent: Wednesday, June 5, 2024 18:00
To: "何品(虎鸣)"<hepin.p at alibaba-inc.com>; "loom-dev"<loom-dev at openjdk.org>
Subject: Re: Re: Virtual thread hang and all threads stop running on JDK21
On 05/06/2024 10:37, 何品(虎鸣) wrote:
Thanks,
1. It hung when two modules shared the common virtual thread setup (the default scheduler).
2. After that, I tried hacking the virtual thread builder to use a separate ThreadPoolExecutor.
3. It still hung.
When it hangs, `jcmd Thread.print` prints nothing, and a programmatic thread dump prints nothing either.
After switching back to normal threads, it never hangs.
Some information:
Module A uses `Object.notifyAll` and `Object.wait`, and module B uses `CompletableFuture.get` (possibly more than 100 times in one run).
I was wondering whether this could be a missed notification: in module A the concurrency is 3000, guarded by a semaphore, but there are only 128 underlying carrier threads, and 3000 > 128.
If it's using synchronized/Object.wait then this may be related to pinning. When both object monitors and j.u.concurrent locks are in play, it's possible to create deadlock scenarios due to starvation, or due to selecting a successor or a thread to wake up that can't continue because there are no carriers available. Object.wait will temporarily increase parallelism, which helps smooth over some cases, but it may not help you here, and it does nothing when the scheduler has been changed to something other than a ForkJoinPool instance.
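As an illustration of the pinning point, a commonly suggested mitigation on JDK 21 (not something proposed in this thread) is to move a wait/notifyAll handshake onto a j.u.concurrent `ReentrantLock` and `Condition`, so that a waiting virtual thread can unmount instead of holding its carrier. A minimal sketch follows, assuming the waiting code is not itself called from inside a synchronized block or native frame. `Gate` and `GateDemo` are hypothetical names, not code from module A.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a synchronized/wait/notifyAll handshake.
final class Gate {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition opened = lock.newCondition();
    private boolean open; // guarded by lock

    // Blocks until open() has been called. Waiting on a Condition parks the
    // virtual thread rather than blocking inside an object monitor.
    void await() throws InterruptedException {
        lock.lock();
        try {
            while (!open) {
                opened.await();
            }
        } finally {
            lock.unlock();
        }
    }

    // Releases all current and future waiters.
    void open() {
        lock.lock();
        try {
            open = true;
            opened.signalAll();
        } finally {
            lock.unlock();
        }
    }
}

final class GateDemo {
    // Starts `waiters` virtual threads that wait on one gate, opens the gate,
    // joins them all, and returns how many got through.
    static int run(int waiters) {
        Gate gate = new Gate();
        AtomicInteger released = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < waiters; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    gate.await();
                    released.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        gate.open();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return released.get();
    }

    public static void main(String[] args) {
        // 3000 waiters, matching the concurrency quoted for module A.
        System.out.println(run(3000));
    }
}
```

This only removes monitor-related pinning from the waiting path. It would not by itself rule out other starvation scenarios, such as a lock held by a virtual thread that cannot get a carrier.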
When you say "prints nothing", do you mean this literally, or do you mean that jcmd is hung too? If so, that's a hint that the lock for standard output may be held by a virtual thread that can't continue because there are no carriers available.
It would be interesting to try the latest Loom EA builds [1], which have changes to the object monitor implementation so that it doesn't pin. Would you have time to try these builds out?
-Alan
[1] https://jdk.java.net/loom/