Native methods and virtual threads

Fri Jul 14 18:51:58 UTC 2023

Thank you for getting back to me.

To answer your question: I think it would be great to have a way to know
how many virtual threads are waiting for a platform thread because of
pinning, perhaps through ThreadMXBean or an equivalent. An application
could monitor that metric and take action when it goes outside the
expected range.
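
For illustration only: a rough proxy for such a metric can be built today
on JFR event streaming. The sketch below assumes JDK 21 and counts
`jdk.VirtualThreadPinned` events; the class name `PinnedCounter` and the
threshold parameter are hypothetical, made up for this example. It counts
pinning events, not the number of virtual threads waiting for a carrier,
so it is not the metric itself.

```java
import java.time.Duration;
import java.util.concurrent.atomic.LongAdder;
import jdk.jfr.consumer.RecordingStream;

/** Hypothetical helper: counts jdk.VirtualThreadPinned events via JFR streaming. */
final class PinnedCounter implements AutoCloseable {
    private final RecordingStream stream = new RecordingStream();
    private final LongAdder pinned = new LongAdder();

    PinnedCounter(Duration threshold) {
        // Only report pinning that lasts at least `threshold`.
        stream.enable("jdk.VirtualThreadPinned").withThreshold(threshold);
        // Each delivered event bumps the counter.
        stream.onEvent("jdk.VirtualThreadPinned", event -> pinned.increment());
        // Events are delivered asynchronously on a background thread.
        stream.startAsync();
    }

    /** Number of pinning events seen so far. */
    long count() {
        return pinned.sum();
    }

    @Override
    public void close() {
        stream.close();
    }
}
```

An application could poll `count()` periodically and raise an alarm when
the value grows faster than expected. JFR delivers streamed events with
some delay, so the number lags behind what is actually happening.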

Regarding my initial proposal, I overlooked the
jdk.virtualThreadScheduler.maxPoolSize property, which could help in my
scenario. However, it seems to be capped at 256 unless parallelism is also
set. I'm apprehensive about meddling with these internal ergonomics without
the requisite expertise, as it may lead me into a tricky situation. These
sound like the kind of properties you change only if you know what you're
doing, which unfortunately is not my case :)
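
For reference, JEP 444 documents two scheduler system properties:
jdk.virtualThreadScheduler.parallelism (defaults to the number of
available processors) and jdk.virtualThreadScheduler.maxPoolSize (defaults
to 256). Both have to be set on the command line. The small program below
is only a sketch to show which values were passed explicitly; the class
name `SchedulerSettings` is hypothetical, and it does not read the
scheduler's effective configuration.

```java
/** Hypothetical helper: reports which scheduler properties were set explicitly. */
final class SchedulerSettings {

    /** Returns the configured value, or a note that the JDK default applies. */
    static String describe(String property) {
        String value = System.getProperty(property);
        return property + " = " + (value != null ? value : "(not set, JDK default applies)");
    }

    public static void main(String[] args) {
        System.out.println(describe("jdk.virtualThreadScheduler.parallelism"));
        System.out.println(describe("jdk.virtualThreadScheduler.maxPoolSize"));
        System.out.println("available processors = "
                + Runtime.getRuntime().availableProcessors());
    }
}
```

It could be run as, for example,
`java -Djdk.virtualThreadScheduler.parallelism=8 -Djdk.virtualThreadScheduler.maxPoolSize=512 SchedulerSettings.java`,
where 8 and 512 are arbitrary illustrative values, not recommendations.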

--
@apr <http://twitter.com/apr>

On Fri, Jul 14, 2023 at 1:51 PM Ron Pressler <ron.pressler at oracle.com>
wrote:

> That’s something we actually struggled with (and reached neither unanimity
> nor strong confidence in yet). The problem with compensating by default is
> that it is unlikely to work for the actual bread-and-butter of virtual
> threads. If your common IO happens to occur when threads are pinned then at
> best you’ll quickly get an OOME, and at worst you’ll get strange
> performance behaviour that doesn’t make detecting a problem easier. And
> that’s the point: if your most common operations block OS threads then
> there’s nothing we can do to raise your throughput beyond what platform
> thread pools do. You have an incompatibility with virtual threads that you
> must resolve somehow.
>
> The question was, therefore, what would be the behaviour that would make
> detecting such problems easier? I don’t know if we have the right answer,
> nor am I certain that we won’t change it, but that was the question that
> guided us.
>
> — Ron
>
> > On 14 Jul 2023, at 17:31, Alejandro Revilla <apr at jpos.org> wrote:
> >
> > I'm tempted to chime in with a comment/idea that I'm not sure you have
> > considered.
> >
> > This is not specific to native methods, but it is somewhat related to
> > the problem you are addressing here.
> >
> > TL;DR go to the last paragraph.
> >
> > While working with Loom in our project (jPOS, payments-related stuff),
> > we immediately struggled with a few virtual threads consuming all
> > available platform threads. This issue had us perplexed for a couple of
> > days until we found that Java Flight Recorder's `jdk.VirtualThreadPinned`
> > event could assist us, after which resolving the remaining code issues
> > became significantly easier.
> >
> > Loom is an absolute game-changer for our specific needs, allowing us to
> > move away from manually crafted continuations and reactive programming
> > (we deal with a large number of in-flight transactions, sometimes in the
> > tens of thousands, usually waiting for remote issuers, HSMs, databases,
> > etc.).
> >
> > While we are fixing all library-related synchronization issues, a
> > typical application has user code that may not behave as well, so what
> > we have in mind is to detect whether we are struggling in terms of
> > response time so that we can beef up our TransactionManager's sessions
> > with more platform threads (instead of virtual ones). In our case, when
> > we receive a request, we handle it in a thread, and it doesn't matter
> > whether it's a platform or a virtual one. The idea is that once we get a
> > request, if we find the system is "OK" (so to speak), we offload it to a
> > virtual thread; otherwise, to a platform one (raising alarms so that we
> > can investigate why we had to create them).
> >
> > After this lengthy introduction (my apologies), I have a suggestion to
> > make. During this initial transition period, as many libraries adapt to
> > Loom, wouldn't it be beneficial for the pool of platform threads used by
> > Loom to be optionally dynamic? This would allow it to expand to
> > thousands of platform threads if needed, which would reassure early
> > adopters that, in a worst-case scenario, the system would operate as it
> > did in the past using platform threads.
> >
> > My 2c. Thank you for reading.
> >
> > --
> > @apr
> >
> > On Fri, Jul 14, 2023 at 12:23 PM Maurizio Cimadamore
> > <maurizio.cimadamore at oracle.com> wrote:
> >
> > On 14/07/2023 16:19, Brian S O'Neill wrote:
> > > This sounds reasonable, but the current "wait and see" model is
> > > inconsistent. JEP 444 says, use JFR and we'll get back to you. Was
> > > this process followed when the decision was made to add the Blocker
> > > class, or is it a premature optimization that should be removed?
> > > Likewise, the FFM API has the isTrivial option which adds even more
> > > risks. What was the process for deciding that this optimization was
> > > necessary?
> >
> > I can speak to the latter, which was added to mitigate cases where users
> > migrate away from critical JNI to do low-latency native calls.
> >
> > (We might also look into hints to enable pinning of heap objects for
> > the very same reasons).
> >
> > Maurizio
> >
> >
>
>