Scope Locals

Tue Mar 30 13:33:56 UTC 2021

In Project Loom, we've been looking at scope-local variables.

The idea is to support something like a "special variable" in
Common Lisp. This is a dynamically-scoped variable which acquires a
value on entry to a lexical scope, and when that scope terminates, its
previous value (or none) is restored. We also intend to support thread
inheritance for scope locals, so that parallel constructs can easily
set a value in the outer scope before threads start.

One useful way to think of scope locals is as invisible parameters
which are passed to every method. However, we will only allow values to
propagate from caller to called methods: assignments to a scope local
variable in a called method will not be observed in the calling method,
because when a scope terminates, scope locals set in that scope will
have their previous values restored.
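
To make the save-and-restore behaviour concrete, here is a minimal
sketch that emulates it on top of an ordinary ThreadLocal. DynamicVar,
callWith() and DynamicVarDemo are hypothetical names invented for this
illustration. They are not the proposed API, and the sketch cannot tell
"unbound" apart from "bound to null".

```java
import java.util.function.Supplier;

// Hypothetical illustration only: emulates dynamic scoping on top of a
// ThreadLocal. This is not the proposed ScopeLocal implementation.
final class DynamicVar<T> {
    private final ThreadLocal<T> current = new ThreadLocal<>();

    // Returns the value bound by the nearest enclosing callWith(), or null.
    T get() {
        return current.get();
    }

    // Binds value while body runs, then restores the previous binding,
    // even if body throws.
    <R> R callWith(T value, Supplier<R> body) {
        T previous = current.get();
        current.set(value);
        try {
            return body.get();
        } finally {
            if (previous == null) {
                current.remove();
            } else {
                current.set(previous);
            }
        }
    }
}

final class DynamicVarDemo {
    static final DynamicVar<String> USER = new DynamicVar<>();

    static String callee() {
        // A rebinding made here is visible only inside the nested scope.
        String inner = USER.callWith("inner", () -> USER.get());
        // Back in the caller's scope, the caller's binding is visible again.
        return inner + "/" + USER.get();
    }

    static String run() {
        String inScope = USER.callWith("outer", DynamicVarDemo::callee);
        // Outside every scope the variable is unbound again.
        return inScope + "/" + USER.get();
    }
}
```

DynamicVarDemo.run() returns "inner/outer/null": the callee's rebinding
never leaks back into the caller, and nothing remains bound once the
outermost scope has terminated.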

So why not use ThreadLocals for this?  It's mostly a question of
scalability and security. Because values bound to scope locals are
immutable (more later) a set of scope-local bindings can be copied by
reference without any multi-threading concerns. When a sub-thread is
started, it is only necessary to copy a pointer to the parent thread's
scope locals, rather than deep-copy the parent's ThreadLocalMap, as has
to be done with ThreadLocals. That deep copy is simply too slow and
uses too much memory for Loom, which may have millions of threads in
flight at a time.
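
As a rough sketch of why immutability makes inheritance cheap, assume
the bindings are held in an immutable linked chain. BindingChain and
BindingChainDemo are hypothetical names used only for illustration, and
the real data structure may well look different. A child thread starts
from the parent's chain simply by holding a reference to it.

```java
// Hypothetical sketch: an immutable, singly linked chain of bindings.
final class BindingChain {
    final Object key;
    final Object value;
    final BindingChain parent;   // older bindings; never mutated

    BindingChain(Object key, Object value, BindingChain parent) {
        this.key = key;
        this.value = value;
        this.parent = parent;
    }

    // Adding a binding allocates one node and shares all the others.
    BindingChain with(Object newKey, Object newValue) {
        return new BindingChain(newKey, newValue, this);
    }

    // Walks from the newest binding to the oldest; null means unbound.
    static Object lookup(BindingChain chain, Object key) {
        for (BindingChain c = chain; c != null; c = c.parent) {
            if (c.key == key) {
                return c.value;
            }
        }
        return null;
    }
}

final class BindingChainDemo {
    static boolean run() {
        Object k1 = new Object();
        Object k2 = new Object();
        BindingChain parent = new BindingChain(k1, "a", null);
        boolean[] ok = new boolean[1];

        Thread child = new Thread(() -> {
            // "Inheritance" is just the captured reference; nothing is copied.
            BindingChain mine = parent.with(k2, "b");
            ok[0] = "a".equals(BindingChain.lookup(mine, k1))
                    && "b".equals(BindingChain.lookup(mine, k2))
                    && BindingChain.lookup(parent, k2) == null;
        });
        child.start();
        try {
            child.join();   // join() makes the child's write to ok[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok[0];
    }
}
```

The child's extra binding lives in a node of its own, so the parent's
chain is untouched and no locking is needed on either side.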

We'd like to implement this without any changes to the Language or
Virtual Machine specifications, so a language-centric design like
this isn't going to happen:

    scopeLocal MyType x;

    {
        x = value;
        ... code that eventually calls foo()

    }

    void foo() {
        ... code that uses x ...
    }

I've been kicking around various ideas and implementations of scope
locals for some time, and I've now got something to share with the
wider world. It is somewhat inspired by Haskell's let syntax, which
looks something like this:

    let x = expr1
        y = expr2
    in
        ... expression that uses x, y ...

Translated into something a bit more Java-esque, that looks like this:

  // Declare two scope locals, x and y
  static final ScopeLocal<MyType> x = ... ;
  static final ScopeLocal<MyType> y = ... ;

  {
    ScopeLocal.set(x, expr1)
              .set(y, expr2)
              .in(() -> ... code that uses x and y ...);
  }

I quite like the way this looks, and it allows for some interesting
and useful variants. In particular, the result of the "binding
expression" (a list of key/value pairs):

    ScopeLocal.set(x, expr1)
              .set(y, expr2)

can be passed to threads, saved for later use, and so on. We'll still
enforce the rule that bindings cannot be *used* except inside the in()
method.
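
To illustrate how a saved binding expression might behave, here is a
small stand-in. Key, Carrier and CarrierDemo are hypothetical names, and
the ThreadLocal-backed map is an assumption made purely to keep the
sketch runnable. It says nothing about how the real ScopeLocal would be
implemented, and it does not accept null values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed API; names and behaviour are
// assumptions made for illustration.
final class Key<T> {
    // Bindings currently in force on this thread (always an immutable map).
    private static final ThreadLocal<Map<Key<?>, Object>> ACTIVE =
            ThreadLocal.<Map<Key<?>, Object>>withInitial(() -> Map.of());

    // Returns the value bound by an enclosing in(), or null if unbound.
    @SuppressWarnings("unchecked")
    T get() {
        return (T) ACTIVE.get().get(this);
    }

    // Starts a binding expression; nothing is bound until in() runs.
    static <T> Carrier set(Key<T> key, T value) {
        return new Carrier(Map.of()).set(key, value);
    }

    // An immutable list of key/value pairs that can be saved and reused.
    static final class Carrier {
        private final Map<Key<?>, Object> bindings;

        private Carrier(Map<Key<?>, Object> bindings) {
            this.bindings = bindings;
        }

        // Each call creates a new Carrier (the temporary objects noted
        // in the discussion of method chaining).
        <T> Carrier set(Key<T> key, T value) {
            Map<Key<?>, Object> copy = new HashMap<>(bindings);
            copy.put(key, value);
            return new Carrier(Map.copyOf(copy));
        }

        // Bindings are visible only while body runs, then are undone.
        void in(Runnable body) {
            Map<Key<?>, Object> previous = ACTIVE.get();
            Map<Key<?>, Object> merged = new HashMap<>(previous);
            merged.putAll(bindings);
            ACTIVE.set(Map.copyOf(merged));
            try {
                body.run();
            } finally {
                ACTIVE.set(previous);
            }
        }
    }
}

final class CarrierDemo {
    static final Key<String> X = new Key<>();
    static final Key<Integer> Y = new Key<>();

    static String run() {
        Key.Carrier saved = Key.set(X, "hello").set(Y, 42);
        StringBuilder log = new StringBuilder();
        log.append(X.get()).append(';');           // saved, but not yet bound
        saved.in(() -> log.append(X.get()).append(Y.get()).append(';'));
        saved.in(() -> log.append(X.get()).append(';'));   // reused later
        log.append(X.get());                       // unbound again
        return log.toString();
    }
}
```

CarrierDemo.run() returns "null;hello42;hello;null": building and saving
the binding expression binds nothing, and the values are visible only
while a body passed to in() is running.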

There are some disadvantages to this approach, in particular that this
method chaining can result in the creation of a bunch of temporary
objects. Insisting on the use of a lambda (a Runnable or a Callable)
is, however, necessary to strictly enforce the property that a binding
acquires a value on entry to a lexical scope and that, when that scope
terminates, its previous value (or none) is restored.

I'm not entirely sure about the naming here. set() is given a
different meaning from usual, and I'm not sure anyone would let me get
away with using let(). in() is slightly odd too, and perhaps something
more verbose would be better.

Other possibilities include, for example:

    ScopeLocal.bind(x, expr1)
              .bind(y, expr2)
              .exec(() -> ... code that uses x and y ...);

Thank you for reading this far; comments and questions welcome.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671