<div dir="ltr">In class loading/initialization cases, the underlying issue is that pinning happens because there are <span class="gmail-il">native</span> <span class="gmail-il">frames</span> on the stack, so replacing synchronized with a j.u.c lock would still lead to the same issue.<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Jun 22, 2024 at 9:27 AM 何品(虎鸣) <<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="clear:both">Would it be possible to make class loading use a ReentrantLock instead of <a href="https://docs.oracle.com/javase/specs/jls/se9/html/jls-12.html#jls-12.4:~:text=Synchronize%20on%20the%20initialization%20lock%2C%20LC%2C%20for%20C.%20This%20involves%20waiting%20until%20the%20current%20thread%20can%20acquire%20LC" style="font-family:Tahoma,Arial,STHeiti,SimSun;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;background-color:rgb(255,255,255)" target="_blank">synchronized</a> in the JDK 23 release? That would help a lot. 
</div><div style="clear:both"><span><a href="https://docs.oracle.com/javase/specs/jls/se9/html/jls-12.html#jls-12.4:~:text=Synchronize%20on%20the%20initialization%20lock%2C%20LC%2C%20for%20C.%20This%20involves%20waiting%20until%20the%20current%20thread%20can%20acquire%20LC" target="_blank">https://docs.oracle.com/javase/specs/jls/se9/html/jls-12.html#jls-12.4:~:text=Synchronize%20on%20the%20initialization%20lock%2C%20LC%2C%20for%20C.%20This%20involves%20waiting%20until%20the%20current%20thread%20can%20acquire%20LC</a></span></div><blockquote style="margin-right:0px;margin-top:0px;margin-bottom:0px;font-family:Tahoma,Arial,STHeiti,SimSun;font-size:14px;color:rgb(0,0,0)"><div style="clear:both">------------------------------------------------------------------</div><div style="clear:both">From: 何品(虎鸣) <<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>></div><div style="clear:both">Sent: Friday, June 21, 2024 19:33</div><div style="clear:both">To: "何品(虎鸣)"<<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>>; Alan Bateman<<a href="mailto:Alan.Bateman@oracle.com" target="_blank">Alan.Bateman@oracle.com</a>>; "loom-dev"<<a href="mailto:loom-dev@openjdk.org" target="_blank">loom-dev@openjdk.org</a>></div><div style="clear:both">Subject: Re: Re: Virtual thread hang and all threads stop running on JDK21</div><div style="clear:both"><br></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">Hi team, we have found the root cause.</span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">The root cause is class loading in both Netty's EventLoop and our taskRunner (virtual threads): the threads become blocked when there are many virtual threads.</span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">Limiting the number 
of virtual threads and increasing the parallelism so that it is greater than the maximum number of virtual threads fixes the problem.</span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div style="clear:both">We successfully took the jcmd dump after switching to OpenJDK. Thanks for the help.</div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div><div style="clear:both">------------------------------------------------------------------</div><div style="clear:both">From: 何品(虎鸣) <<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>></div><div style="clear:both">Sent: Wednesday, June 5, 2024 18:18</div><div style="clear:both">To: Alan Bateman<<a href="mailto:Alan.Bateman@oracle.com" target="_blank">Alan.Bateman@oracle.com</a>>; "何品(虎鸣)"<<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>>; "loom-dev"<<a href="mailto:loom-dev@openjdk.org" target="_blank">loom-dev@openjdk.org</a>></div><div style="clear:both">Subject: Re: Re: Virtual thread hang and all threads stop running on JDK21</div><div style="clear:both"><br></div><div style="clear:both">Yes, even jcmd hangs. We currently want to switch back to the ForkJoinPool and increase <span>jdk.virtualThreadScheduler.maxPoolSize and </span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">jdk.virtualThreadScheduler.parallelism to be greater than our total maximum concurrency.</span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">I have several systems using virtual threads, but only this one has the problem. 
</span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br></span></div><div style="margin:14px 40px"><div style="clear:both">------------------------------------------------------------------</div><div style="clear:both">From: Alan Bateman <<a href="mailto:Alan.Bateman@oracle.com" target="_blank">Alan.Bateman@oracle.com</a>></div><div style="clear:both">Sent: Wednesday, June 5, 2024 18:00</div><div style="clear:both">To: "何品(虎鸣)"<<a href="mailto:hepin.p@alibaba-inc.com" target="_blank">hepin.p@alibaba-inc.com</a>>; "loom-dev"<<a href="mailto:loom-dev@openjdk.org" target="_blank">loom-dev@openjdk.org</a>></div><div style="clear:both">Subject: Re: Re: Virtual thread hang and all threads stop running on JDK21</div><div style="clear:both"><br></div>
On 05/06/2024 10:37, 何品(虎鸣) wrote:<br>
<div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">Thanks, </span></div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">1. It hung
when the two modules shared virtual threads on the default
scheduler.</span></div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">2. After
that, I tried hacking the virtual thread builder to use a
separate ThreadPoolExecutor.</span></div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">3. But
it still hung.</span></div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun"><br>
</span></div>
<div style="clear:both"><span style="font-family:Tahoma,Arial,STHeiti,SimSun">When it
hung, `<span style="color:rgb(0,0,0);font-family:Helvetica,Tahoma,Arial;font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">jcmd
Thread.print` printed nothing, and a programmatic dump printed
nothing either.</span></span></div>
<div style="clear:both"><span style="font-family:Helvetica,Tahoma,Arial;color:rgb(0,0,0);font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline"><br>
</span></div>
<div style="clear:both"><span style="font-family:Helvetica,Tahoma,Arial;color:rgb(0,0,0);font-size:14px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;word-spacing:0px;white-space:normal;text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">After
switching back to normal threads, it never hangs.</span></div>
<div style="clear:both"><span style="font-family:Helvetica,Tahoma,Arial">Some
information:<br>
Module A uses `Object.notifyAll` and `Object.wait`, and module
B uses `CompletableFuture.get` (possibly > 100 times in
one run).</span></div>
<div style="clear:both"><span style="font-family:Helvetica,Tahoma,Arial">I was wondering
whether this could be a missed-notification problem: in
module A the concurrency is 3000, guarded by a <span>semaphore,
but there are only 128 underlying carrier threads, and 3000
> 128. </span></span><br>
</div>
</div>
<br>
If it's using synchronized/Object.wait then this may be related to
pinning. When there are both object monitors and j.u.concurrent locks
in play, it's possible to create deadlock scenarios due to
starvation, or due to selecting a successor or thread to wake up that
can't continue because there are no carriers available.
Object.wait will temporarily increase parallelism to smooth over and help
some cases, but it may not help you here, and it does nothing when the
scheduler has been changed to be something other than a ForkJoinPool
instance.<br>
<br>
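To make the pinning case above concrete, here is a minimal sketch that is not taken from this thread. It assumes JDK 21, where a virtual thread that blocks inside a synchronized block stays pinned to its carrier; the class name PinningDemo and the method runPinnedTasks are hypothetical. On JDK 21, running it with -Djdk.tracePinnedThreads=full should print a stack trace for each pinned block.<br>
<br>
```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the application discussed in this thread.
class PinningDemo {

    // Starts `tasks` virtual threads that each sleep while holding a monitor.
    // On JDK 21 the sleep inside synchronized keeps the carrier thread occupied
    // (the virtual thread is pinned), so at most "parallelism" tasks can make
    // progress at the same time.
    static int runPinnedTasks(int tasks) {
        AtomicInteger completed = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                Object lock = new Object(); // one monitor per task, so there is no contention
                executor.submit(() -> {
                    synchronized (lock) {
                        try {
                            Thread.sleep(10); // blocking call made while holding the monitor
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                    completed.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runPinnedTasks(8));
    }
}
```
<br>
This sketch always finishes because no pinned thread waits on another virtual thread. The hang discussed here needs the extra ingredient that a pinned thread waits for something that can only be produced by a virtual thread with no carrier left to run on.<br>
<br>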
When you say "prints nothing", do you mean this literally or do you
mean that jcmd is hung too? If the latter, that's a hint that the lock
for standard output may be held by a virtual thread that can't continue
because there are no carriers available.<br>
<br>
It would be interesting to try the latest Loom EA builds [1], which have
changes to the object monitor implementation so that it doesn't pin. Would
you have time to try these builds out?<br>
<br>
-Alan<br>
<br>
[1] <a href="https://jdk.java.net/loom/" target="_blank">https://jdk.java.net/loom/</a><br>
</div></div></blockquote></div></blockquote></div>