Native methods and virtual threads

Fri Jul 14 16:31:14 UTC 2023

I'm tempted to chime in with a comment/idea that you may or may not have
already considered.

This is not specific to native methods, but it is somewhat related to the
problem you are addressing here.

TL;DR go to the last paragraph.

While working with Loom in our project (jPOS, payments-related stuff), we
immediately struggled with a few virtual threads consuming all available
platform threads. This issue had us perplexed for a couple of days, until we
identified that Java Flight Recorder's `jdk.VirtualThreadPinned` event could
assist us. After that, resolving the remaining code issues became
significantly easier.
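
For readers who want to watch for the same event in-process, here is a
minimal sketch using the JFR streaming API. `PinnedWatcher` is a
hypothetical helper name (not jPOS code), and the 20 ms threshold is an
arbitrary choice for illustration. Whether a given code path actually
emits the event depends on the JDK version in use.

```java
import java.time.Duration;
import java.util.function.Consumer;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingStream;

// Hypothetical helper, for illustration only.
final class PinnedWatcher {

    // Starts an in-process JFR stream that forwards every
    // jdk.VirtualThreadPinned event to the given callback.
    // The caller is responsible for closing the returned stream.
    static RecordingStream start(Consumer<RecordedEvent> onPinned) {
        RecordingStream rs = new RecordingStream();
        rs.enable("jdk.VirtualThreadPinned")
          .withThreshold(Duration.ofMillis(20)) // assumed threshold
          .withStackTrace();                    // shows where pinning happened
        rs.onEvent("jdk.VirtualThreadPinned", onPinned);
        rs.startAsync();                        // does not block the caller
        return rs;
    }

    // Formats an event as a short, log-friendly message.
    static String describe(RecordedEvent e) {
        return "virtual thread pinned for " + e.getDuration().toMillis()
                + " ms\n" + e.getStackTrace();
    }
}
```

Typical use would be something like
`PinnedWatcher.start(e -> System.err.println(PinnedWatcher.describe(e)))`
at application start-up, with the stream closed on shutdown.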

Loom is an absolute game-changer for our specific needs, allowing us to move
away from manually crafted continuations and reactive programming (we deal
with a large number of in-flight transactions, sometimes in the tens of
thousands, usually waiting for remote issuers, HSMs, databases, etc.).

While we are fixing all library-related synchronization issues, a typical
application has user code that may not behave as well, so what we have in
mind is to detect whether we are struggling in terms of response time, so
that we can beef up our TransactionManager's sessions with more platform
threads (instead of virtual ones). In our case, when we receive a request,
we handle it in a thread, and it doesn't matter whether that thread is
platform or virtual. So the idea is that once we get a request, if we find
the system is "OK" (so to speak), we offload it to a virtual thread;
otherwise, we offload it to a platform one (raising alarms so that we can
investigate why we had to create it).
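
The dispatch idea described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: `AdaptiveDispatcher`
and the `systemOk` health check are hypothetical names, not jPOS's
TransactionManager API, and how "OK" is measured (for example, from recent
response times) is left to the caller. It needs JDK 21 or later.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, for illustration only. Requires JDK 21+.
final class AdaptiveDispatcher {

    // Assumed health check supplied by the application,
    // e.g. "recent response times are within limits".
    private final BooleanSupplier systemOk;

    // Counts how often we had to fall back to a platform thread.
    private final AtomicLong platformFallbacks = new AtomicLong();

    AdaptiveDispatcher(BooleanSupplier systemOk) {
        this.systemOk = systemOk;
    }

    // Runs the request on a virtual thread while the system looks healthy,
    // otherwise on a platform thread, logging a warning as the "alarm".
    Thread dispatch(Runnable request) {
        if (systemOk.getAsBoolean()) {
            return Thread.ofVirtual().name("txn-virtual").start(request);
        }
        long n = platformFallbacks.incrementAndGet();
        System.err.println("WARN: using platform thread, fallback #" + n);
        return Thread.ofPlatform().name("txn-platform-" + n).start(request);
    }

    long platformFallbacks() {
        return platformFallbacks.get();
    }
}
```

Keeping the health check behind a `BooleanSupplier` means the fallback
policy can change without touching the dispatch code, and the fallback
counter gives the alarm something concrete to report.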

After this lengthy introduction (my apologies), I have a suggestion to
make. During this initial transition period, as many libraries adapt to
Loom, wouldn't it be beneficial for the pool of platform threads used by
Loom to be optionally dynamic? This would allow it to expand to thousands
of platform threads if needed, which would reassure early adopters that, in
a worst-case scenario, the system would operate as it did in the past using
platform threads.

My 2c. Thank you for reading.

--
@apr <http://twitter.com/apr>

On Fri, Jul 14, 2023 at 12:23 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

>
> On 14/07/2023 16:19, Brian S O'Neill wrote:
> > This sounds reasonable, but the current "wait and see" model is
> > inconsistent. JEP 444 says, use JFR and we'll get back to you. Was
> > this process followed when the decision was made to add the Blocker
> > class, or is it a premature optimization that should be removed?
> > Likewise, the FFM API has the isTrivial option which adds even more
> > risks. What was the process for deciding that this optimization was
> > necessary?
>
> I can speak to the latter, which was added to mitigate cases where users
> migrate away from critical JNI to do low-latency native calls.
>
> (We might also look into hints to enable pinning of heap objects for
> very same reasons).
>
> Maurizio
>
>
>