Scope Locals

Wed Mar 31 00:00:29 UTC 2021

On Mar 30, 2021, at 6:33 AM, Andrew Haley <aph at redhat.com> wrote:
> 
> In Project Loom, we've been looking at scope-local variables.
> 
> The idea is to support something like a "special variable" in
> Common Lisp. This is a dynamically-scoped variable which acquires a
> value on entry to a lexical scope, and when that scope terminates, its
> previous value (or none) is restored. We also intend to support thread
> inheritance for scope locals, so that parallel constructs can easily
> set a value in the outer scope before threads start.

This is very good.  I support this goal as a way to escape from
the corner we’ve painted ourselves into with ThreadLocal.

And I agree Common Lisp gives a good mental model, at
least as a starting point.  There are aspects of the CL model
I’d like to question carefully before importing.  The two
main ones are:  (a) Surface syntax for variable access.
“Naked” variable names are the access syntax for a CL
special variable, but that’s not appropriate for Java,
because it impinges on the JLS, which is managed very
conservatively.  (b) Do we need special variables to
be side-effectible?  That imports costs; it would be better
if they were “effectively final”, with the usual provision
for escaping that limitation explicitly (add an indirection
to a non-final variable).
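
To make point (b) concrete, here is a minimal sketch, not the Loom
API.  The nested Local class is a hypothetical stand-in built on
ThreadLocal; it can be bound and read but has no set() at all.
Anyone who wants mutation asks for it explicitly by binding a
mutable box (an AtomicInteger here).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class FinalBindingSketch {
    // Hypothetical stand-in for a scope local (not the Loom API).
    // It can be bound and read, but offers no set(); null means "unbound".
    static final class Local<T> {
        private final ThreadLocal<T> slot = new ThreadLocal<>();

        T get() {
            return slot.get();
        }

        <R> R bind(T value, Supplier<R> body) {
            T previous = slot.get();
            slot.set(value);
            try {
                return body.get();
            } finally {
                slot.set(previous);   // restore on every exit path
            }
        }
    }

    static final Local<AtomicInteger> COUNTER = new Local<>();

    static void callee() {
        // The binding itself is fixed; only the box it names is mutated.
        COUNTER.get().incrementAndGet();
    }

    static int run() {
        AtomicInteger box = new AtomicInteger();
        Integer seen = COUNTER.bind(box, () -> {
            callee();
            callee();
            return box.get();
        });
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 2
    }
}
```

The cost of the side effect is paid only by the code that opted
into the box; a plain binding stays a constant for its whole scope.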

> One useful way to think of scope locals is as invisible parameters
> which are passed to every method. However, we will only allow values to
> propagate from caller to called methods: assignments to a scope local
> variable in a called method will not be observed in the calling method,
> because when a scope terminates, scope locals set in that scope will
> have their previous values restored.

This isolation is true of both assignments and bindings.  I’d
rather outlaw assignments completely (see above).
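
A sketch of that isolation for bindings, under the same caveat:
Local below is a hypothetical ThreadLocal-backed stand-in, not the
proposed ScopeLocal class.  The callee rebinds the variable, and
the caller still sees its own value once the callee returns.

```java
import java.util.function.Supplier;

final class IsolationSketch {
    // Hypothetical stand-in for a scope local; null means "unbound".
    static final class Local<T> {
        private final ThreadLocal<T> slot = new ThreadLocal<>();

        T get() {
            return slot.get();
        }

        <R> R bind(T value, Supplier<R> body) {
            T previous = slot.get();
            slot.set(value);
            try {
                return body.get();
            } finally {
                slot.set(previous);   // the caller's value comes back here
            }
        }
    }

    static final Local<String> USER = new Local<>();

    static String callee() {
        // A nested binding: visible only while this lambda runs.
        return USER.bind("inner", () -> USER.get());
    }

    static String run() {
        return USER.bind("outer", () -> {
            String seenInCallee = callee();
            String seenAfterCallee = USER.get();
            return seenInCallee + "/" + seenAfterCallee;
        });
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints inner/outer
    }
}
```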

> We'd like to implement this without any changes to the Language or
> Virtual Machine specifications, so a language-centric design like
> this isn't going to happen:
> 
>    scopeLocal MyType x;
> 
>    {
>        x = value;
>        ... code that eventually calls foo()
> 
>    }
> 
>    void foo() {
>        ... code that uses x ...
>    }
> 
> I've been kicking around various ideas and implementations of scope
> locals for some time, and I've now got something to share with the
> wider world, somewhat inspired by Haskell syntax, which looks
> something like:
> 
>    let x = expr1
>        y = expr2
>    in
>        ... expression that uses x, y ...
> 
> Translated into something a bit more Java-esque, that looks like
> 
>  // Declare two scope locals x and y
>  static final ScopeLocal<MyType> x = ... ;
>  static final ScopeLocal<MyType> y = ... ;
> 
>  {
>    ScopeLocal.set(x, expr1)
>              .set(y, expr2)
>              .in(() -> ... code that uses x and y ...)
>  }

Perhaps you meant to say “code that uses x.get() and y.get()”?

> I quite like the way this looks, and it allows for some interesting
> and useful variants. In particular, the result of the "binding
> expression" (a list of key/value pairs):
> 
>    ScopeLocal.set(x, expr1)
>              .set(y, expr2)
> 
> can be passed to threads, saved for later use, and so on.

This raises the question of separation of concerns:
What linkage *does* a scoping link (term here??) have to
threads, apart from the “in” call and the “get” call?

More fundamentally, what is the type returned by
ScopeLocal.set(…)?  Is it a ScopeLocal, or is it a scoping
link (type TBD), or are those one and the same?

Can I say:

var slink = ScopeLocal.set(x, xinitval);

and then call a non-static set on it:

slink = slink.set(y, yinitval);

Can I store that “slink” in a data structure for a week
before accessing it in another thread, like this:

slink.set(z, zinitval).in(() -> …code that calls get on x,y,z…)
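
For the sake of discussion, here is one way those questions could
be answered "yes".  Everything in it is an assumption: Local and
Carrier are hypothetical names, and the carrier is modeled as an
immutable linked list of bindings with no tie to any thread until
in() is called.

```java
import java.util.function.Supplier;

final class CarrierSketch {
    // Hypothetical stand-in for a scope local; null means "unbound".
    static final class Local<T> {
        private final ThreadLocal<T> slot = new ThreadLocal<>();

        T get() {
            return slot.get();
        }

        // Static entry point, mirroring ScopeLocal.set(x, expr1).
        static <U> Carrier set(Local<U> key, U value) {
            return new Carrier(null, key, value);
        }
    }

    // Immutable "scoping link": one binding plus a link to earlier ones.
    static final class Carrier {
        private final Carrier parent;   // null at the head of the chain
        private final Local<?> key;
        private final Object value;

        private Carrier(Carrier parent, Local<?> key, Object value) {
            this.parent = parent;
            this.key = key;
            this.value = value;
        }

        // Returns a new link; this one is left unchanged.
        <U> Carrier set(Local<U> key, U value) {
            return new Carrier(this, key, value);
        }

        // Applies the whole chain (oldest binding outermost) on the
        // calling thread, runs body, then unwinds.
        <R> R in(Supplier<R> body) {
            if (parent != null) {
                return parent.in(() -> bindThenRun(body));
            }
            return bindThenRun(body);
        }

        @SuppressWarnings("unchecked")
        private <R> R bindThenRun(Supplier<R> body) {
            ThreadLocal<Object> slot = (ThreadLocal<Object>) (ThreadLocal<?>) key.slot;
            Object previous = slot.get();
            slot.set(value);
            try {
                return body.get();
            } finally {
                slot.set(previous);
            }
        }
    }

    static final Local<String> X = new Local<>();
    static final Local<Integer> Y = new Local<>();
    static final Local<String> Z = new Local<>();

    static String run() {
        Carrier slink = Local.set(X, "x0");
        slink = slink.set(Y, 7);
        final Carrier stored = slink;   // could sit in any data structure
        String[] result = new String[1];
        Thread worker = new Thread(() ->
            result[0] = stored.set(Z, "z0").in(() -> X.get() + Y.get() + Z.get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints x07z0
    }
}
```

In this reading the carrier type is distinct from the scope local
type, and the only contact with a thread is the in() call itself.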

> We'll still
> enforce the rule that bindings cannot be *used* except inside the in()
> method.

OK, that means you can’t say this:

var slink = ScopeLocal.set(x, xinitval).set(y, yinitval);
var xval = slink.get(x);
var yval = slink.get(y);

But you could say this:

var xval = slink.in(x::get);

So is there any gain from disallowing the previous API
usages?
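
To illustrate why I ask: given only in(), a direct read can be
written by anyone in one line.  The sketch below uses hypothetical
Local and Carrier classes (a single-binding carrier, for brevity)
and defines the "forbidden" get purely in terms of in().

```java
import java.util.function.Supplier;

final class DirectGetSketch {
    // Hypothetical stand-in for a scope local; null means "unbound".
    static final class Local<T> {
        private final ThreadLocal<T> slot = new ThreadLocal<>();

        T get() {
            return slot.get();
        }

        Carrier<T> bind(T value) {
            return new Carrier<>(this, value);
        }
    }

    // Hypothetical carrier holding exactly one binding.
    static final class Carrier<T> {
        private final Local<T> key;
        private final T value;

        Carrier(Local<T> key, T value) {
            this.key = key;
            this.value = value;
        }

        <R> R in(Supplier<R> body) {
            T previous = key.slot.get();
            key.slot.set(value);
            try {
                return body.get();
            } finally {
                key.slot.set(previous);
            }
        }

        // The disallowed direct read, expressed through in() alone.
        <U> U get(Local<U> other) {
            return in(other::get);
        }
    }

    static final Local<String> X = new Local<>();

    static String run() {
        Carrier<String> slink = X.bind("xinitval");
        String viaIn = slink.in(X::get);
        String viaGet = slink.get(X);
        return viaIn + "," + viaGet;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints xinitval,xinitval
    }
}
```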

> There are some disadvantages to this approach, in particular that this
> method chaining can result in the creation of a bunch of temporary
> objects. Insisting on the use of a Lambda (a Runnable or a Callable)
> is, however, necessary in order strictly to enforce the property that
> a binding acquires a value on entry to a lexical scope, and when that
> scope terminates, its previous value (or none) is restored.

Yes; it’s more reliable than using a try-with-resources (TWR)
construct, and that means the JVM can optimize more “intensely”.
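
A small sketch of the reliability difference, with hypothetical
helper names (open, in, Binding) over a bare ThreadLocal.  The
try-with-resources form works when used as intended, but nothing
stops a caller from skipping the try; the lambda form leaves no
way to forget the restore.

```java
import java.util.function.Supplier;

final class TwrVersusLambdaSketch {
    static final ThreadLocal<String> SLOT = new ThreadLocal<>();

    // Hypothetical handle for the TWR style; close() restores the old value.
    interface Binding extends AutoCloseable {
        @Override
        void close();   // narrowed: no checked exception
    }

    // TWR style: correctness depends on the caller closing the handle.
    static Binding open(String value) {
        String previous = SLOT.get();
        SLOT.set(value);
        return () -> SLOT.set(previous);
    }

    // Lambda style: the restore is built into the API.
    static <R> R in(String value, Supplier<R> body) {
        String previous = SLOT.get();
        SLOT.set(value);
        try {
            return body.get();
        } finally {
            SLOT.set(previous);
        }
    }

    static String twrCaller() {
        try (Binding b = open("twr")) {
            return SLOT.get();
        }
    }

    static String leakyCaller() {
        open("leaked");   // legal Java, but the binding is never undone
        return SLOT.get();
    }

    static String run() {
        String viaLambda = in("scoped", SLOT::get);
        String viaTwr = twrCaller();
        String afterTwr = SLOT.get();    // restored: unbound again
        leakyCaller();
        String afterLeak = SLOT.get();   // still bound after the call returned
        SLOT.remove();
        return viaLambda + "/" + viaTwr + "/" + afterTwr + "/" + afterLeak;
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints scoped/twr/null/leaked
    }
}
```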

> I'm not entirely sure about the naming here. set() is given a
> different meaning from usual, and I'm not sure anyone would let me get
> away with using let(). in() is slightly odd too, and perhaps something
> more verbose would be better.
> 
> Other possibilities are e.g.
> 
>    ScopeLocal.bind(x, expr1)
>              .bind(y, expr2)
>              .exec(() -> ... code that uses x and y ...)

For bikeshed colors: I think the functional version of “setX”
can often be usefully called “withX”.  So it’s get/set for
mutable objects, but get/with for immutable ones.
Your scope-link object seems to be immutable in its
basic structure, although a “set” method could be
applied to side-effect leaves that are already present
due to a let/bind/with call if you disagree with my
point (b) above.  (As in Common Lisp, there are two
kinds of potential changes:  Making a new binding,
and changing the value of an existing binding.
The latter, if it is allowed, deserves a “set” name
rather than a “with” name.)
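
A toy sketch of the two kinds of change, using a hypothetical
Bindings class keyed by plain strings (nothing like the real
proposal, just enough to show the naming distinction).  with()
makes a new binding and returns a new object; set() side-effects
a cell that already exists, and every holder of that cell sees it.

```java
import java.util.HashMap;
import java.util.Map;

final class WithVersusSetSketch {
    // Hypothetical immutable binding list with one mutable cell per binding.
    static final class Bindings {
        private final Map<String, int[]> cells;

        private Bindings(Map<String, int[]> cells) {
            this.cells = cells;
        }

        static Bindings empty() {
            return new Bindings(new HashMap<>());
        }

        // Functional update: new object, fresh cell, receiver unchanged.
        Bindings with(String name, int value) {
            Map<String, int[]> copy = new HashMap<>(cells);
            copy.put(name, new int[] { value });
            return new Bindings(copy);
        }

        // Side effect on an existing binding's cell.
        void set(String name, int value) {
            cell(name)[0] = value;
        }

        int get(String name) {
            return cell(name)[0];
        }

        private int[] cell(String name) {
            int[] cell = cells.get(name);
            if (cell == null) {
                throw new IllegalStateException("unbound: " + name);
            }
            return cell;
        }
    }

    static String run() {
        Bindings a = Bindings.empty().with("x", 1);
        Bindings b = a.with("y", 2);    // a still has no y; x's cell is shared
        Bindings c = b.with("x", 10);   // rebinding: c gets its own cell for x
        b.set("x", 5);                  // visible through a and b, not c
        return a.get("x") + "," + b.get("x") + "," + c.get("x");
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 5,5,10
    }
}
```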

> Thank you for reading this far; comments and questions welcome.
> 
> -- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>