EA builds with changes to object monitor implementation to avoid pinning with virtual threads
Alan Bateman
Alan.Bateman at oracle.com
Sat Feb 17 10:20:00 UTC 2024
On 16/02/2024 19:58, masoud parvari wrote:
> Hi Alan,
>
> About the deadlock on Java 21 while serving static content (which is
> resolved in your build), I dug a bit deeper. You are right. The
> culprit is most probably not file I/O. What Spring MVC does is cache
> the location (out of multiple available candidates) from which it
> eventually manages to resolve the static resource, and then it calls
> ClassLoader.getResourceAsStream() to get the file. The cache
> implementation is backed by ConcurrentHashMap and it calls the
> put(k,v) method on the map, which involves a synchronized block. I
> just don't understand how this can happen even with very few
> concurrent requests.
>
> Thanks for instructing me to use jcmd, and yes, it's a 12-core
> machine. I ran the test again and got 2 thread dumps, one from
> jvisualvm and one from jcmd, so you can correlate them. Please
> find them attached.
> It's a deadlock on the class loader. 11 out of 12 carrier threads are
> blocked on a synchronized block at
> java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:651)
> and the other one, virtual thread #120 (forkjoinpool-1-worker-14), is
> stuck in a synchronized block at
> java.base/java.util.zip.ZipFile.getEntry(ZipFile.java:339).
>
Thanks for sharing the thread dumps. We can see 300 virtual threads. 12
are blocked trying to enter a monitor but are pinned due to native
frames on the stack. No other threads can run as a result. You won't see
these native frames in the stack traces but essentially all 12 are in
nl.trifork.qti.model.processing.expression.general.BaseValue's
constructor and triggering a class load, which goes through the VM. Of
the 12, 11 are blocked at BuiltinClassLoader.loadClassOrNull as you
pointed out. The built-in class loaders are "parallel capable" but they
do contend when several threads are attempting to load the same class at
the same time. As you found, one of the 12, #120, has got further but it
blocks at a later point due to other threads (#119 and #123) trying to
locate resources in the same JAR file. I think we can assume that one of
these two has been unblocked, meaning scheduled to continue, but can't
continue as there are no carriers available. If you run
`jcmd <pid> Thread.vthread_scheduler` a few times when this happens then
you'll see the counters stall.
I agree this is unfortunate, and not easy to avoid. It's essentially a
burst of virtual threads at startup with a mix of class loading (which
comes with pinning) and resource loading from the same JAR files. Right
now, the focus is the pain point of object monitors but class loading is
something that does need attention too.
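To make the shape of the problem concrete, here is a minimal sketch (not the
application's code) of a burst of virtual threads that all contend for one
monitor. The class name VirtualThreadBurst, the LOCK object and the runBurst
helper are made up for illustration; LOCK merely stands in for the class
loading and JAR file locks seen in the thread dumps. This simplified version is
expected to complete, since it involves no class loading and nothing blocks
while holding the monitor, so it is only a harness to run alongside the jcmd
command above. It assumes JDK 21 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadBurst {
    // Hypothetical shared monitor; a stand-in for a class loading or JAR file lock.
    private static final Object LOCK = new Object();

    /**
     * Starts `tasks` virtual threads that all contend for LOCK and returns how
     * many of them finished within the timeout. A result lower than `tasks`
     * would suggest the scheduler has stalled.
     */
    static int runBurst(int tasks, long timeoutMillis) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
        for (int i = 0; i < tasks; i++) {
            executor.submit(() -> {
                // Contended monitor entry: on Java 21 a virtual thread that
                // blocks here stays mounted on its carrier.
                synchronized (LOCK) {
                    completed.incrementAndGet();
                }
            });
        }
        executor.shutdown();
        try {
            // Wait with a bound rather than forever, so a stall is reported
            // as a low count instead of hanging the caller.
            executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int tasks = 300; // same order of magnitude as the thread dumps
        int done = runBurst(tasks, 10_000);
        System.out.println(done + " of " + tasks + " virtual threads completed");
    }
}
```

The bounded wait is deliberate: if the carriers were ever all pinned, the
program would print a short count instead of hanging.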
-Alan.