Jetty and Loom

Greg Wilkins gregw at webtide.com
Tue Jan 5 08:23:19 UTC 2021


Alan,

On Mon, 4 Jan 2021 at 20:44, Alan Bateman <Alan.Bateman at oracle.com> wrote:

> I haven't seen any memes go by on these topics.
>

Meme as in "a concept, belief, or practice that spreads from person to
person", not as in pictures with text.  I'm seeing contagion in the wild
of thoughts like "X is no longer a problem because Loom", which I believe
have been encouraged by statements like "just spawn another" and "forget
thread pools".   I see these overclaims about Loom as harmful, as they make
people like me want to test those claims, and thus the conversation becomes
about what Loom can't do rather than about what it can do.   It encourages
a focus on the negative, when there are lots of positives that could be the
focus if current limitations were more clearly acknowledged.

Perhaps these claims will ultimately prove to be true, but even if so, it
will be some time until widespread availability.  Better to under-promise
and over-deliver than to over-promise and under-deliver.

There have been a few examples with people trying out the builds that
> created thread pools of virtual threads, usually by replacing a
> ThreadFactory for a fixed thread pool when they meant to replace the thread
> pool.
>

I agree that pooling virtual threads is often not going to be a sensible
thing to do, but perhaps for different reasons.

Sure, a semaphore can be used to limit the concurrency of virtual threads,
but if that limit is below the system's capacity for kernel threads, then
just use a pool of kernel threads rather than a pool of virtual threads, so
you get other advantages:

   - Maximum stack usage is preallocated, so fewer GC and OOM issues
   - ThreadLocals work as lazy resource pools
   - No cost of a second layer of scheduling
   - CPU bound tasks can be handled
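
The ThreadLocal point above can be sketched in a few lines. This is a
minimal illustration, not Jetty code: the class and the Resource type are
hypothetical stand-ins for any expensive, non-thread-safe object. With a
fixed pool of N kernel threads, a lazily initialised ThreadLocal creates at
most N instances no matter how many tasks run:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalCache {
    static final AtomicInteger CREATED = new AtomicInteger();

    // Hypothetical expensive-to-create, non-thread-safe resource
    // (think: a parser, a buffer, a crypto context).
    static class Resource {
        Resource() { CREATED.incrementAndGet(); }
        int use(int x) { return x * 2; }
    }

    // One Resource per pool thread, created lazily on first use.
    static final ThreadLocal<Resource> CACHE =
            ThreadLocal.withInitial(Resource::new);

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 10_000; i++) {
            int task = i;
            pool.submit(() -> CACHE.get().use(task));
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        // At most 4 Resources exist, because the pool has 4 threads.
        System.out.println("resources created = " + CREATED.get());
    }
}
```

With an unbounded supply of virtual threads the same ThreadLocal would
create one Resource per task, which is exactly the "poor choice with large
concurrency" noted below.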

If, however, the limit on concurrency is very large, then you can't use an
Executor that contains a pool of kernel threads, but you can use one that
uses a semaphore to limit the concurrency of an infinite supply of virtual
threads.  You just have to be aware of the differences:

   - You still have a maximum bound on stack usage; it is large and
   dynamically allocated from the heap, and thus probably needs to be
   explicitly tested and GC tuned.
   - Using ThreadLocals as lazy resource pools won't work, but that is a
   poor choice anyway with large concurrency.  ThreadLocals will work fine
   for passing calling context... in fact they may be simpler to use than
   in async APIs.
   - There is the cost of a second layer of scheduling.
   - CPU bound tasks can't be handled.
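
A minimal sketch of the semaphore technique, assuming the virtual-thread
API that eventually shipped in JDK 21 (not the 2021 builds this thread
discusses); the class name and the limit of 10 are mine. Each task acquires
a permit before doing its work, so the supply of threads is unbounded but
the concurrency is not:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreLimit {
    public static void main(String[] args) throws Exception {
        int limit = 10;                       // hypothetical concurrency limit
        Semaphore permits = new Semaphore(limit);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService vts = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1000; i++) {
                vts.submit(() -> {
                    permits.acquire();        // parks the virtual thread cheaply
                    try {
                        int now = active.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(1);      // simulate blocking I/O
                    } finally {
                        active.decrementAndGet();
                        permits.release();
                    }
                    return null;
                });
            }
        }  // close() waits for all submitted tasks

        // Never more than `limit` tasks ran at once.
        System.out.println("peak concurrency = " + peak.get());
    }
}
```

Blocking on the semaphore is cheap here because a parked virtual thread
releases its carrier; the cost that remains is the dynamic stack and the
second scheduling layer listed above.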

So I come back to: if you need to limit concurrency to 100s or 1000s, or
you need ThreadLocals as a resource cache, or you have CPU bound tasks,
then pooling kernel threads is still a valid technique and should not be
forgotten.

[aside to Ron - I'm not saying an Executor that limits concurrency is a
thread pool; I'm saying that a thread pool is one possible implementation
of an Executor that limits concurrency.  Furthermore, there are reasons
other than start time that may favour pooling implementations over
alternatives]


> Virtual threads can run existing code but I would be concerned with the
> memory footprint if they are just used to run a lot of existing bloat. I'm
> looking at stack traces every day too and they are deeper than many other
> languages/runtimes but a 1000+ frame stack trace seems a bit excessive when the
> work to do is relatively simple. Compiled frames are very efficient and
> virtual threads would be much happier with thread stacks that are a few KB.
>

I agree that 1000+ frame stacks are excessive, and the blog also shows how
good the JVM is at optimising stack frames.    However, I picked 1000 as
approx 25% of the default capacity of allocated kernel thread stacks.    If
your stacks never approach 1000+ frames, then you are only using < 25% of
the preallocated stack space, and that is something that can be tuned to
give more capacity for kernel threads, just as heap and GC can be tuned to
give more capacity for dynamic stacks.

The point is not the absolute size of the stack, but that both thread types
still need similar real space for real stacks, which can be the limiting
factor on the number of threads available.  Preallocated stacks have
pros/cons, as do dynamic stacks.    I don't think either is better; they
are just different.

There has been some exploration and prototypes of cancellation, including
> exploring how both mechanisms can co-exist but we haven't been happy with
> it. It's a topic that the project will come back to.
>

Yet a recent post
<https://www.javaadvent.com/2020/12/project-loom-and-structured-concurrency.html>
presents deadlines and cancellation as a fait accompli feature of Loom.
 I think that is more over-promising.    Perhaps something about virtual
threads makes this goal more achievable, but until it has been achieved it
is a mistake to claim it as a feature.

  For now, Object.wait works with the FJP managedBlocker mechanism to
> increase parallelism during the wait. So the pool of 16 carrier threads will
> increase when executing code where Object.wait is used.


Ah that is interesting to know.  I had written a test
<https://github.com/webtide/loom-trial/blob/main/src/main/java/org/webtide/loom/CPUBound.java>
that confirmed virtual threads are deferred by CPU bound tasks, but I had
not checked synchronised.
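
The deferral effect that test confirmed can be reproduced in a few lines.
This is a sketch under stated assumptions: it uses the JDK 21 API and the
`jdk.virtualThreadScheduler.parallelism` / `maxPoolSize` system properties
to pin the scheduler to one carrier, and it relies on virtual threads not
being time-sliced, so a spinning task monopolises the carrier; the class
name is mine:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CPUBoundDemo {
    public static void main(String[] args) throws Exception {
        // Pin the virtual-thread scheduler to a single carrier thread
        // (must be set before the scheduler is initialised).
        System.setProperty("jdk.virtualThreadScheduler.parallelism", "1");
        System.setProperty("jdk.virtualThreadScheduler.maxPoolSize", "1");

        AtomicBoolean spinning = new AtomicBoolean();
        long[] spinnerDone = new long[1];
        long[] otherStart = new long[1];

        Thread spinner = Thread.ofVirtual().start(() -> {
            spinning.set(true);
            long until = System.nanoTime() + 200_000_000L;   // spin ~200ms
            while (System.nanoTime() < until) { /* CPU bound, never parks */ }
            spinnerDone[0] = System.nanoTime();
        });
        while (!spinning.get()) Thread.onSpinWait();  // spinner owns the carrier
        Thread other = Thread.ofVirtual().start(() ->
                otherStart[0] = System.nanoTime());
        spinner.join();
        other.join();
        // With one carrier and no time slicing, the second virtual thread
        // could not run until the CPU bound one finished.
        System.out.println("deferred = " + (otherStart[0] >= spinnerDone[0]));
    }
}
```

With a full complement of carriers the deferral is diluted rather than
eliminated: each CPU bound task still occupies a carrier for its full
duration.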

So Loom can spawn new kernel threads if Object.wait is used!   What is the
limit of that?   In a pathological case of an application doing lots of
Object.wait-ing, could it eventually end up 1:1 kernel:virtual threads?   If
so, then I guess there is an issue of many spawned virtual threads suddenly
needing too many kernel threads?  I can't see how either OOM or deadlock
can be avoided generally.       Sure, apps can be rewritten, but Jetty is
not in control of the applications deployed on it.
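
For concreteness, the pathological shape in question looks like this. The
sketch below (JDK 21 API, class name mine) just parks many virtual threads
in Object.wait on one monitor and then releases them; whether, and by how
much, the carrier pool grows to compensate is exactly the open question
above, so the code asserts nothing about carrier counts:

```java
import java.util.concurrent.CountDownLatch;

public class WaitStorm {
    public static void main(String[] args) throws Exception {
        // Many virtual threads all blocked in Object.wait. On builds
        // where Object.wait drives the FJP ManagedBlocker mechanism, the
        // carrier pool may grow to compensate - in the worst case toward
        // one kernel thread per waiting virtual thread.
        final Object lock = new Object();
        final boolean[] released = {false};       // guarded by lock
        int n = 100;
        CountDownLatch entered = new CountDownLatch(n);
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = Thread.ofVirtual().start(() -> {
                synchronized (lock) {
                    entered.countDown();
                    while (!released[0]) {
                        try { lock.wait(); } catch (InterruptedException e) {}
                    }
                }
            });
        }
        entered.await();             // every thread has reached the monitor
        synchronized (lock) {        // safe: each waiter re-checks the flag
            released[0] = true;
            lock.notifyAll();
        }
        for (Thread t : threads) t.join();
        System.out.println("completed = " + n);
    }
}
```

Scaling n from 100 to 100,000 (and watching kernel thread counts) would be
one way to probe the OOM question empirically.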

> The goal is to make it possible to write code that scales as well as async
> code does but without the complexity of async. The intention is that
> debuggers, profilers, and everything else just works as people expect.
>

My sense of Loom at the moment is that that goal is not far off for a
significant range of applications....    I just don't yet see it close to a
totally general solution for all applications and containers.

Sorry to again focus on the negatives.   I hope our work in Jetty will
focus on the positives.

cheers

-- 
Greg Wilkins <gregw at webtide.com> CTO http://webtide.com


More information about the loom-dev mailing list