lazy statics design notes

Thu Feb 28 10:00:01 UTC 2019

In terms of semantics, I don't think that transitioning from T.default, i.e. the @Stable semantics, is the 'right' semantics.

Here is my current mental model of the relation between indy, condy, static lazy final and Brian's forward bridges.

             |  use site       |  declaration site
             |                 |
method call  |  invokedynamic  |  Brian's forward bridges  
             |                 |
field        |  invokedynamic  |  Brian's forward bridges
             |                 |
constant     |  ldc condy      |  getstatic + DynamicValue + condy
             |                 |

static lazy final is lazy because it shares the same late-binding semantics as indy and condy, i.e. it uses the CAS of condy, so it doesn't need a specific CAS from T.default.
I think that trying to come up with a design that encompasses both lazy instance fields and lazy static fields is a trap, because in the lazy instance case there is no constant pool to store the value.
Given that we already have condy, I prefer to see the lazy static field as a way to provide a symbolic name for getting the condy value from outside the class that stores the condy.

I believe the static lazy final is a kind of virtual static field: not virtual because there is a virtual dispatch, but virtual in the sense that no memory is associated with this field in the part of the class that stores the static field values, because the value of the field is stored in the constant pool. Said differently, constant dynamic stores the value in the constant pool, and a getstatic on a static lazy final is a way to retrieve that value using a symbolic name.
In terms of implementation, getstatic + DynamicValue + condy is really close to an ldc condy, but there is first an access check and it can trigger clinit; then, instead of returning the value of current_constant_pool[condy_index] like condy, it returns owner_of_getstatic_constantpool[condy_index].

For the interpreter, a getstatic on a static lazy final field can be quickened into a kind of "long ldc condy" once the access check and class initialization are done.
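To make the model above concrete, here is a minimal sketch in plain Java. It is a toy simulation under stated assumptions, not how a JVM implements anything: ToyConstantPool, ToyClass, the dynamicValue map and the Supplier standing in for a bootstrap method are all hypothetical names invented for illustration. It only shows the shape of the idea: the lazy static has no field storage of its own, and a getstatic ends up reading a condy slot in the owner's constant pool after the field lookup and class initialization.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

// Toy model only: these classes are hypothetical and mirror nothing in a real JVM.
final class LazyStaticModel {

    /** Stand-in for a constant pool whose entries are all condy constants. */
    static final class ToyConstantPool {
        private final Supplier<?>[] bootstraps;           // one "BSM" per entry
        private final AtomicReferenceArray<Object> slots; // null = not yet resolved

        ToyConstantPool(Supplier<?>... bootstraps) {
            this.bootstraps = bootstraps;
            this.slots = new AtomicReferenceArray<>(bootstraps.length);
        }

        /** Models "ldc condy": resolve on first use, then return the cached value. */
        Object ldc(int condyIndex) {
            Object value = slots.get(condyIndex);
            if (value == null) {
                // Toy limitation: a bootstrap that returns null would simply be re-run.
                slots.compareAndSet(condyIndex, null, bootstraps[condyIndex].get());
                value = slots.get(condyIndex);
            }
            return value;
        }
    }

    /** Stand-in for the class that declares the lazy static fields. */
    static final class ToyClass {
        private final ToyConstantPool pool;
        private final Map<String, Integer> dynamicValue; // field name -> condy index
        private final Runnable clinit;
        private boolean initialized;

        ToyClass(ToyConstantPool pool, Map<String, Integer> dynamicValue, Runnable clinit) {
            this.pool = pool;
            this.dynamicValue = dynamicValue;
            this.clinit = clinit;
        }

        /** Models getstatic on a lazy static: the value comes from the owner's pool. */
        Object getstatic(String fieldName) {
            Integer condyIndex = dynamicValue.get(fieldName);
            if (condyIndex == null) {
                // Stand-in for field resolution and access checking.
                throw new NoSuchFieldError(fieldName);
            }
            ensureInitialized();         // may run <clinit>, as for any getstatic
            return pool.ldc(condyIndex); // the "long ldc condy" against the owner's pool
        }

        private synchronized void ensureInitialized() {
            if (!initialized) {
                initialized = true;
                clinit.run();
            }
        }
    }

    public static void main(String[] args) {
        ToyConstantPool pool = new ToyConstantPool(() -> "computed by the BSM");
        ToyClass owner = new ToyClass(pool, Map.of("GREETING", 0),
                () -> System.out.println("<clinit> ran"));
        System.out.println(owner.getstatic("GREETING")); // resolves the slot
        System.out.println(owner.getstatic("GREETING")); // reads the cached slot
    }
}
```

Once the lookup and the initialization check have succeeded for a given use site, the remaining work in getstatic is just pool.ldc(condyIndex), which is the part the interpreter could quicken.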

Rémi

----- Original Message -----
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Karen Kinnear" <karen.kinnear at oracle.com>
> Cc: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Wednesday, February 27, 2019 21:33:30
> Subject: lazy statics design notes

> On Feb 27, 2019, at 7:30 AM, Karen Kinnear <karen.kinnear at oracle.com> wrote:
>> Subject: Valhalla EG notes Feb 13, 2019
>> To: valhalla-spec-experts <valhalla-spec-experts at openjdk.java.net>
>> ...
>> III. [Remi Forax] DynamicValue attribute
>> Another project Remi will lead and create JEP
>>  language level: static lazy final
>>  improve startup by allowing init with Condy at first access of individual static
>> 
>> Drawbacks: opt-in at source
>>  change in semantics
>>  in static block - there is a lock
>>  condy BSM can execute multiple times
> 
> I was just talking with Vladimir Ivanov about lazy
> statics.  He is working on yet another performance
> pothole with <clinit>, generated by Clojure this time.
> (It's not their fault; the system had to clean up a problem
> with correct initialization order, and <clinit> execution
> is over-constrained already, so the JIT has to generate
> more conservative code now.)
> 
> I believe lazy statics would allow programmers
> (and even more, language implementors) to
> use much smaller <clinit>s, or none at all,
> in favor of granular lazy statics.
> 
> So, here's a brain dump, fresh from my recent
> lunch with Vladimir:
> 
> Big problem #1:  If you touch one static, you buy
> them all.  Big problem #2:  If any one static
> misbehaves (blocking, bad bootstrap), all statics
> misbehave.  Big problem #3:  If <clinit> hasn't
> run yet, you need initialization barriers on all
> use points of statics; result is that <clinit> itself,
> and anything it calls, is uniquely non-optimizable.
> Big problem #4:  After touching one static, the
> program cannot make progress until the mutex
> on the whole Class object is released.  Big problem
> #5: Setting up multiple statics is not transactional;
> you can observe erroneous intermediate states during
> the run of the <clinit>.  Big problem #6:  Statics
> are really, really hard to process in an AOT engine,
> because nearly every pre-compiled code path must
> assume that the static might not be booted up yet,
> and if boot-up happens (just once per execution)
> it invalidates many of the assumptions the AOT
> engine wants to make about nearby code.
> 
> Solutions from lazy statics:  Solution #1: If you touch
> one that's the one you buy (plus what's in the vestigial
> <clinit> if there is one at all).  Solution #2: Misbehaving
> statics don't misbehave until they are used (yes, bug
> masking, boo hoo).  Solution #3: Initialization barriers
> are trivial:  Just detect the T.default value of the variable.
> Solution #4: There is no mutex, just a CAS at the end
> of the BSM for the lazy static; no critical section.
> Solution #5:  The CAS at the end of the BSM is inherently
> transactional.  Solution #6: AOT engines can generate
> somewhat simpler fast-path code by just testing for
> T.default; the slow-path code is still hard to optimize,
> but the limits are from the complexity of the BSM
> that initializes the lazy static, not the total complexity
> of the <clinit> code.
> 
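As an illustration of solutions #3 to #5 above, here is a minimal library-level sketch of the "test for T.default, then CAS at the end of the BSM" pattern. It is an approximation under assumptions, not the proposed VM mechanism: LazyCell is a hypothetical name, a Supplier stands in for the BSM, and null plays the role of T.default, so a null result is rejected.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper for illustration; null plays the role of T.default.
final class LazyCell<T> {
    private final AtomicReference<T> cell = new AtomicReference<>();
    private final Supplier<? extends T> bootstrap; // stands in for the BSM

    LazyCell(Supplier<? extends T> bootstrap) {
        this.bootstrap = bootstrap;
    }

    T get() {
        T value = cell.get();
        if (value != null) {
            return value; // fast path: the only barrier is a test against the default
        }
        // Slow path: no lock is held, so racing threads may each run the bootstrap.
        T computed = Objects.requireNonNull(bootstrap.get(),
                "null is reserved as the uninitialized state");
        // Exactly one CAS succeeds; every caller returns the winning value.
        return cell.compareAndSet(null, computed) ? computed : cell.get();
    }

    public static void main(String[] args) {
        LazyCell<String> greeting = new LazyCell<>(() -> "hello");
        System.out.println(greeting.get());
    }
}
```

The trade-off is the one described above: there is no critical section, but the bootstrap has to tolerate being executed more than once under contention.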
> Objection: What if you *want* a mutex?  I didn't like
> the JVM blocking everything in <clinit> but I don't
> want a million racing threads computing the same
> BSM value either.  Ans: Fine, but make that an opt-in
> mechanism, by folding some kind of flow control
> into the relevant BSM, for your particular use case.
> The JVM doesn't have to know about it.
> 
> Objection:  What if I want several statics to initialize in
> one event, with or without mutex or transactions?
> Ans: Easy, just have the BSM for each touch the others,
> or run a common BSM that sets everything up (and then
> returns the same value).  (Note: At the cost of an
> idempotency requirement during lazy init.)  In the
> most demanding cases, define a private static nested
> class to serialize everything, which is today's workaround.
> 
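For reference, the nested-class workaround mentioned above is the usual initialization-on-demand holder idiom. A minimal sketch follows; the Settings class, its contents and the LOADS counter are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; only the Holder pattern itself is the point.
final class Settings {
    // Instrumentation for the example only: counts how often Holder is initialized.
    static final AtomicInteger LOADS = new AtomicInteger();

    private Settings() {}

    // Nothing in Holder runs until one of its statics is first touched.
    // Class initialization then sets up both statics in one event, under
    // the initialization lock of Holder rather than that of Settings.
    private static final class Holder {
        static final Map<String, String> VALUES;
        static final int COUNT;

        static {
            LOADS.incrementAndGet();
            VALUES = Map.of("mode", "fast", "level", "3");
            COUNT = VALUES.size();
        }
    }

    static Map<String, String> values() {
        return Holder.VALUES;
    }

    static int count() {
        return Holder.COUNT;
    }
}
```

Each group of statics that must come up together needs its own holder class, which is the per-group cost that granular lazy statics would avoid.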
> Objection:  Those aren't real statics, because you can't
> set them to their T.default values!  Ans:  They are as
> real as you are going to get without creating lots of
> side metadata to track the N+1st variable state, which
> is a cost nobody wants to pay.
> 
> Objection: But I do want to opt into the overhead and
> you aren't giving me my T.default; I need the full range
> of values for my special use case.  Ans:  Then add an
> indirection for your use case, to a wrapped copy of your
> desired value; the null wrapper value is the T.default in
> this case.  It's at least as cheap as anything the JVM would
> have done intrinsically.
> 
> Objection:  You disrespect 'boolean'.  It only has one
> state left after you filch 'false' to denote non-initialization.
> My VM hack can do much better than that.  Ans:  Let me
> introduce you to java.lang.Boolean.  It has three states.
> 
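The indirection suggested in the last two answers can be sketched as follows. This is an illustrative assumption, not part of the proposal: NullableLazy and Box are hypothetical names, and a Supplier again stands in for the BSM. The null wrapper is the uninitialized state, so the wrapped value keeps its full range, including null and false.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper: a null Box means "not initialized", so the wrapped
// value itself may be anything, including null or Boolean.FALSE.
final class NullableLazy<T> {
    private static final class Box<T> {
        final T value;

        Box(T value) {
            this.value = value;
        }
    }

    private final AtomicReference<Box<T>> cell = new AtomicReference<>();
    private final Supplier<? extends T> bootstrap; // stands in for the BSM

    NullableLazy(Supplier<? extends T> bootstrap) {
        this.bootstrap = bootstrap;
    }

    T get() {
        Box<T> box = cell.get();
        if (box == null) {
            Box<T> fresh = new Box<>(bootstrap.get());
            // First writer wins; losing threads adopt the published box.
            box = cell.compareAndSet(null, fresh) ? fresh : cell.get();
        }
        return box.value;
    }
}
```

For a boolean, a reference to java.lang.Boolean gives the same effect without a custom wrapper: null, Boolean.TRUE and Boolean.FALSE are the three states.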
> Objection:  What if someone uses bytecode to monkey
> with the state of my lazy static?  Your design is broken!
> Ans:  This is the sort of corner case that needs extra
> VM support.  In this case, it is sufficient to privatize
> write access to a variable, even though it may be public,
> to its declaring class.  You can trust the declaring class
> not to compile subverting assignments into itself,
> because javac won't let it.
> 
> Objection:  I can't imagine the language design for this;
> surely there are difficulties you haven't foreseen.  Ans:
> Neither can I, and there certainly are.  The sooner we
> start trying out prototypes the sooner we'll shake out
> the issues.  There are several things to try:
> 
> http://openjdk.java.net/jeps/8209964
> http://cr.openjdk.java.net/~jrose/draft/lazy-final.html
> 
> Bonus:  The T.default hack scales to non-static
> fields as well.  So laziness is a separable tool
> from the decision to make things static or not;
> it survives more refactorings.  The technique
> is abundantly optimizable (both static and
> non-static versions) as proven by the good
> track record of @Stable inside the JDK.  We
> should share this gem outside the JDK,
> which requires language and (more) VM
> support.  Language design issue:  It's easier
> to do the lazy static with an attribute than
> doing the lazy non-static; you need an
> instance-specific callback for the latter.  TBD.
> 
> The nice thing about this is that the OpenJDK JITs
> have been making good use of @Stable annotations
> for a long time.  So the main problem here is finding
> a language and VM framework that legitimizes this
> sort of pattern (including safety checks and rule
> enforcement on state changes).  When that is done,
> the JITs should make use of it with little extra effort.
> 
> — John