carrier local

John Rose john.r.rose at oracle.com
Fri Apr 3 19:45:57 UTC 2020


On Apr 3, 2020, at 11:44 AM, Dmitry Zaslavsky <dmitry.zaslavsky at gmail.com> wrote:
> 
> I see how people can shoot themselves in the foot. 
> But you would have another class something CarrierThreadLocal<>

It’s not just foot-shooting that’s an issue here, it’s reasonable
abstraction over carrier threads.  Being able to have millions
of lightweight threads means that the JVM has to be free to
make the details of carrier threads irrelevant to most layers
of code, including nearly all user code.  When the scheduler
and platform APIs conspire to reschedule a lightweight thread
onto a new carrier thread, that has to be invisible to >99% of
the code in the system, or else you start to complicate the
scheduler’s decision making to the point of making everybody
go slower.

The contract of CTL is not just a clone of TL with a warning,
“hey folks, remember, no foot-shooting!”.  It has to be very
limited in its guarantees, compared to TL, because from a user
code perspective, the carrier thread can change between almost
any two operations.  (Why?  Because we don’t color methods
as “CT-switching” and “CT-safe”, the way async systems based
on static compilation do.)  The contract of CTL has to be,
IMO, transactional, in the sense that every operation on a
CTL must happen in a context where the identity of the
CT is held stable, somehow, relative to any request that
another virtual thread might make on the same CT.

You might think, “OK so use lambdas like this”:

class CTL<T> {
   <R> R withValue(Function<T,R> fn);
}

Then it’s clear that passing the T value outside of the call
to fn is foot-shooting.  But there’s a new kind of foot-shooting
here, which is that during the call to fn (which might take a
long time and block) the CT has to stay bound to the virtual
thread that is running the call to fn, *just to uphold the contract
of the CTL and for no other reason*.  This is a recipe for gumming
up the scheduler by pinning many CTs to well-meaning calls that
want to access their own CTLs “just for a second, and then
I’ll let you guys continue”.  The risk here is that user code which uses
CTLs in an over-eager way (“hey look, I can save a buffer allocation
in my LDT library!!!”) can shoot *everybody* in the foot, by
bogging down the scheduler.  The root cause would be a bad
contract that would cost all users of the virtual thread scheduler.
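Concretely, the hazard shows up even in a toy version of that API.  This is
a hedged sketch with hypothetical names; the pinning appears only as
comments, since a real version would have to live inside the JVM:

```java
import java.util.function.Function;

// Toy version of the lambda-scoped CTL sketched above (illustrative
// names, not a proposed API).  To honor the contract, the scheduler
// would have to keep the virtual thread bound to its carrier for the
// entire call to fn, even if fn blocks for a long time.
class CTL<T> {
    private final T value;

    CTL(T value) { this.value = value; }

    <R> R withValue(Function<T, R> fn) {
        // (a real implementation would pin the carrier thread here)
        try {
            return fn.apply(value);
        } finally {
            // (...and unpin it here, possibly much later if fn blocked)
        }
    }
}
```

Every millisecond that fn spends blocked is a millisecond the carrier is
held hostage to the CTL contract and unavailable to other virtual threads.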

To avoid this, we need to avoid re-creating the very same conditions
we are trying to escape, which is the demand to pin resources and
invariants to dinosaur threads.

My advice:  Think in terms of O(1) guaranteed non-blocking operations
which a CTL can offer to “check out” and “check in” resources that are
tied to a CT.  I think that’s the best you can get.  No general lambdas,
and no other “hook” (like an interface or factory) for user code to gum
up the scheduler.  Just platform-defined O(1) state changes, like acquire
+ release, or atomic increment.
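That check-out/check-in shape could look something like the following.
This is only a sketch under assumptions: CarrierSlot, acquire, and release
are invented names, and real per-carrier storage is elided behind a single
reference.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical carrier-local resource slot (illustrative, not a proposed
// API).  Each operation is a single O(1) atomic state change, so the
// scheduler never has to keep a carrier pinned while user code runs.
class CarrierSlot<T> {
    // Stand-in for per-carrier storage; a real version would be keyed
    // by the carrier thread rather than one shared reference.
    private final AtomicReference<T> slot = new AtomicReference<>();

    // "Check out": O(1), non-blocking; null means nothing is cached,
    // so the caller allocates fresh.
    T acquire() { return slot.getAndSet(null); }

    // "Check in": O(1), non-blocking; if the slot is already occupied
    // the resource is simply dropped for GC.
    boolean release(T resource) { return slot.compareAndSet(null, resource); }
}
```

The caller’s pattern is: acquire a cached resource or allocate a new one,
use it, then release it.  Between acquire and release there is no contract
binding the virtual thread to any particular carrier.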

— John
