Scoped variables

dean.long at oracle.com dean.long at oracle.com
Mon Dec 3 21:25:36 UTC 2018


My prototype does not allow arbitrary methods to bind a frame/scope local.
Instead, only one method is allowed to do that:

public class FrameLocal<T> {

     public void doEnclosed(Runnable r) { ... }

then inside the Runnable you can call set() to assign a value.  If we 
only wanted
to allow an intial value, we could do somethine like:

     public void doEnclosed(T newValue, Runnable r) { ... }

For simplicity each variable needs to do its own doEnclosed call, but I can
imagine also having a static method that takes a list of variables.

My prototype has both deep and shallow "binding".  The deep binding 
stack walk
looks for special StackWalkCookieHolder.doEnclosed frames that
FrameLocal.doEnclosed uses.  The shallow binding uses both a stack and a 
cache.
The per-variable stack keeps track of the most recent binding in the current
Continuation or Thread (so it is associated with a call stack). 
Currently the stack
is kept when the continuation is unmounted.  The cache keeps track of 
the stack
that has the most recent binding.  This is because the most recent 
binding could
be in a parent continuation or parent thread.  My prototype does not 
support a
global binding as the parent to a thread binding.  Without fibers, 
continuations, or
global bindings, then the most recent binding reduces to the thread 
binding, and
there shouldn't be any need for a cache.

The cache is a map and is reset every time the continuation is 
mounted/unmounted.

The stack is looked up in a WeakHashMap, making it basically equivalent to a
ThreadLocal.  If we had long-lived frame locals, then perhaps we could 
simply look
them up in an array, but then why not use a long-lived thread local 
instead that
could also be looked up in an array?

To reduce Continuation storage requirements, we could choose not to 
retain the
stacks after an unmount, but instead lazily rebuild using stack walks.  
Right now
I'm only using the deep stack walk to verify the results of the cache + 
stack
lookup.

dl


On 12/3/18 11:55 AM, John Rose wrote:
> In part we are drawing on the very old idea of a Lisp "SPECIAL" variable.
> Such a variable is bound by a "LET" construct, which after compilation will
> be assigned to a stack frame, and managed within the lifetime of that frame.
> The thing that is special about such a variable is that, given the name of the
> variable, it can be accessed like a global, from blocks of source code not
> directly nested inside the "LET" that binds it.  A non-special variable, aka
> a lexically scoped variable, is accessible only within the "LET" that binds it.
> Lexical scoping is the rule we are all accustomed to in C, Java, etc.
> The scoping of "SPECIAL" variables is sometimes called dynamic scoping.
> The variables in the Unix shells behave as if dynamically scoped, in that
> you can say "$x" inside a block of code that didn't bind "x", and still see it.
>
> In Lisp, modularity or privacy of a binding can be obtained by using a
> namespaced symbol, or an uninterned symbol.  Java would manage
> such concerns using object APIs.
>
> Since the JVM (if not the Java language) has a clear idea of stack frame,
> it makes sense to adapt the Lisp concepts to be directly linked to frames.
> Thus, a Lisp SPECIAL is the older brother of a Java frame local.
>
> Anyway, that's the background, as I see it.
>
> (Caveat:  None of these observations are intended to suggest that the Java
> source language should introduce dynamically scoped variables, any more
> than today's ThreadLocal should introduce a special kind of Java variable.
> Java's scoping rules are complicated enough as they stand, and Java's object
> model is sufficient for expressing dynamically bound constructs.  Lisp
> "SPECIALs" were originally the only kind of variable available, back in the
> days when scoping was a research topic.  Java can learn from history
> and choose the better alternative where Lisp was saddled with both.)
>
> (And why not "just do" fiber local variables like today's thread locals?
> That's an understandable starting point, but it would cause fibers with
> locals to acquire a permanent footprint cost, since there's no way for
> the JVM to prove that a fiber local goes out of scope and can be
> deleted.  Java threads swim around with an accumulation of thread
> local footprint, like a whale shark dragging a colony of lampreys.
> Those same lampreys would overwhelm a goldfish.  It's a problem
> of scaling.  So old-fashioned Lisp "SPECIALs", or some other kind
> of frame-based local variable, are where you get when you finish
> properly adapting the idea of thread local to Java fibers.)
>
> At the implementation level, such as compiled code, the three basic operations
> for dynamically scoped variables are creating a binding, destroying a binding,
> and looking up the current binding.  Lookups are inherently harder to optimize
> because the stack frame that does the lookup is not always the stack
> frame that holds the binding.  So there's a search involved.
>
> (Lookups are done for both reads and writes of a SPECIAL variable.
> For Java it's an open question whether frame locals should be writable
> at all, since existing Java uplevel variables are final, and that works just
> fine for us.  You can always add an indirection to a mutable box.)
>
> Two more basic operations usually pop up when you add threads:  When you
> park a thread your implementation might choose to somehow move the bindings
> aside out of the way, and when you unpark a thread you would then want
> to move the bindings of that thread back into the foreground, where they
> can be more easily accessed.
>
> There's an old literature of how to implement the search for a dynamically
> scoped variable binding.  Much of it precedes our modern notions of threading.
> But the basic trade-offs are still relevant, between what has been called
> "deep binding" and "shallow binding".   These are not semantic features
> but implementation tactics.  Shallow binding places the binding in a well
> known location (such as a global or static variable), which compiled code can
> easily access.  Deep binding places the binding near the stack frame that created
> the binding, requiring a search among the candidate frames starting with
> the most recently created frames.  Shallow binding is not thread safe unless the
> well known location is somehow localized or specialized to the thread.
>
> Although shallow binding seems like an easy win over deep binding,
> shallow binding can slow down thread parking and unparking, since
> all the global variables used by the relevant thread need to be edited.
> So the tradeoff is the cost of ownership of a binding, versus the cost
> of lookup.  Clever implementations will try to balance both costs, and
> pay them only when they are necessary.  A search for deep bindings
> can sometimes be made faster by indexing or hinting, or clever
> tabular organizations which approximate the characteristics of
> shallow bindings, as hash tables approximate arrays.
>
> Deep binding is a win when binding lookup is relatively infrequent
> compared to binding creation and thread switching.  Deep binding
> can be viewed as a lazy evaluation tactic, where the costs of finding
> a binding are deferred until the value is actually looked up; conversely
> shallow binding is an eager technique for lookup.  It's easy to create
> hybrid techniques, such as memoization of lookups (making the
> second one look more "shallow"), or storing hints in thread local
> and/or global tables.
>
> For fibers, since park and unpark are high frequency operations, pure
> shallow binding might well be a lose, at least for workloads with large
> numbers of dynamically scoped variables.  But fibers naturally have
> an analog of thread park/unpark operations, and these events in
> the life cycle of the fiber can be tasked with moving around table
> pointers and/or hints for optimizing lookup of bindings.
>
> Also, fibers have a small amount of control state to which might be
> added additional tables or hints which would survive threading events,
> and allow a fiber to quickly locate frame-local bindings regardless
> of its recent history of threading.  Using fiber-local control state has
> two challenges:  First, it should be kept compact in footprint, like the
> fiber itself.  Second, the cost of editing the state for bindings (which
> happens when bindings are created and destroyed) must not be
> excessive, since that would make method calls expensive, when
> methods bind frame locals.  That said, initial implementations can
> be simple and a little wasteful, as long as there is headroom for
> later optimization, as workloads may require.
>
> If we do this right, a robust and performant mechanism for frame
> locals could let us refactor existing JVM features which today require
> stack walking.  I'm thinking of doPrivileged, of course, and agreeing
> with Andrew's observation about "better thread locals".  The techniques
> could co-exist:  A frame local could define a cache point which would
> be initially empty but later filled (on demand) by a stack walk.
>
> A final observation:  The JVM probably needs to be in the loop with
> frame locals; dynamic scoping is probably not something that can be
> implemented on top of the JVM in pure standard Java.  When the
> JVM is asked to perform a lookup of a frame local, it should immediately
> know which method (out of all the methods which might possibly be
> on stack) is the method which can define that same frame local.
> This implies a registration or numbering of each frame local in the
> metadata for a method.  It also suggests (to me) that each frame
> local should be defined in tandem with a statically defined annotation
> on each method that binds it.
>
> HTH
> — John
>
> On Dec 3, 2018, at 10:24 AM, Ron Pressler <ron.pressler at oracle.com> wrote:
>> Hi Andrew.
>>
>> Yes, scope variables are indeed orthogonal to fibers (although we would like to
>> combine them with structured concurrency).
>>
>> Dean Long has started working on an early prototype, which he’ll present hopefully soon.
>> While scope locals are certainly superior to TLs from a programming perspective,
>> I’m not sure beating them on sheer performance would be so easy. But If you
>> have some ideas on how to do that, they would be most welcome.
>>
>> Ron
>>
>>
>> On December 3, 2018 at 4:49:47 PM, Andrew Haley (aph at redhat.com) wrote:
>>
>> We discussed "scoped variables" as a possible fiber-oriented
>> replacement for thread locals. I have some interest in this area, so I
>> was wondering if anyone had started to work on this. If not, I'll
>> volunteer to propose a design.
>>
>> These could also be useful even when fibers aren't being used: we'd
>> want something more efficient than ThreadLocals, but that's a very low
>> bar. So, please let me know.
>>
>> --
>> Andrew Haley
>> Java Platform Lead Engineer
>> Red Hat UK Ltd. <https://www.redhat.com>
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



More information about the loom-dev mailing list