lazy statics design notes

Fri Mar 1 09:47:28 UTC 2019

Let me step back a bit.

From a technical perspective I don't think there's any issue. You get
condy semantics, which, as you say, is well defined w.r.t. cycles and 
laziness etc.

From a language perspective I see an issue if we expect, as we said,
that most people will just jump in and replace 'static' with 
'lazy-static'. That is gonna have all sorts of behavioral incompatibilities.

So, I was looking for a story for - when is it safe to replace an 
existing 'static' with 'lazy-static' ? And the answer seems messy.

Think of it from an IDE perspective: when do I offer the refactoring to 
'lazy-static' to the user? If the answer ends up being along the lines 
of "when the initializer is a literal", I think most people won't even 
bother, and the argument that everyone will opt in to 'lazy', thus 
making <clinit> disappear, won't, IMHO, hold up very well.

Maurizio

On 01/03/2019 02:14, John Rose wrote:
> On Feb 28, 2019, at 5:30 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>> Question: which category of badness does this belong to?
>>
>> class Foo {
>>     static int I = Foo.J;
>>     static int J = 2;
>>
>>     public static void main(String[] args) {
>>        System.out.println(I); //prints 0
>>        System.out.println(J); //prints 2
>>     }
>> }
>>
>> The language allows forward references to static fields, assuming you use a _qualified_ name (don't ask me why :-)).
> I remember convincing myself long ago that this was semi-justified.
>
>> But I guess this is similar to having a static init calling a static method which returns the value of a not-yet-initialized static. In any case, it feels that, if we condify these, we would change semantics, as suddenly "I" will be initialized to 2 too?
> Yes, that seems likely.  A naive account of the semantics
> of lazy statics would be that each lazy static is "really"
> obtained via a static method which serves as its accessor.
> This static method contains something like this:
>
>     public static @Synthetic int get$I() {
>        if (!I$ready) {
>           I = (…init expr here…);
>           I$ready = true;  // record completion so the initializer runs only once
>        }
>        return I;
>     }
>
> The JMM makes extra demands here, of course, which only
> the JVM (or var handles) can properly satisfy.
>
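
To make the semantic change concrete, here is a minimal sketch that puts the eager example next to a hand-written emulation of the naive accessor model described above. The accessor names i() and j() and the flags iReady and jReady are hypothetical, and the synchronized methods are only a coarse stand-in for whatever the JVM would really do to satisfy the JMM.

```java
public class LazyAccessorSketch {
    // Eager statics: initializers run in textual order inside <clinit>.
    static class Eager {
        static int I = Eager.J; // J still holds its default value 0 at this point
        static int J = 2;
    }

    // Hand-written emulation of lazy statics (hypothetical names throughout).
    static class Lazy {
        private static int I, J;
        private static boolean iReady, jReady;

        // Assumption: synchronized is a simplistic substitute for the JVM's real guarantees.
        static synchronized int i() {
            if (!iReady) {
                I = j();       // the initializer of I reads J through its accessor
                iReady = true;
            }
            return I;
        }

        static synchronized int j() {
            if (!jReady) {
                J = 2;
                jReady = true;
            }
            return J;
        }
    }

    public static void main(String[] args) {
        System.out.println(Eager.I);  // prints 0
        System.out.println(Eager.J);  // prints 2
        System.out.println(Lazy.i()); // prints 2
        System.out.println(Lazy.j()); // prints 2
    }
}
```

Under the emulated lazy model the first read of I forces J to be initialized on demand, so I observes 2 instead of the default 0. That is exactly the kind of behavioral difference a mechanical 'static' to 'lazy-static' replacement would introduce.
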
>> Also, beware of cycles!
>>
>> class Foo {
>>     static int I = Foo.J;
>>     static int J = I;
>>
>>     public static void main(String[] args) {
>>        System.out.println(I); //prints 0
>>        System.out.println(J); //prints 0
>>     }
>> }
>>
>> I think a condified translation would throw or run into an endless loop?
> Condy has a "no cycle" clause in its contract which we
> can just reuse.  You get something like a StackOverflowError (SOE).
>
> The naive semantics can model this by tracking
> states more carefully on the I$ready variable.
>
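
One way to picture "tracking states more carefully" is a three-state flag per lazy instead of a boolean. The sketch below is only an illustration under that assumption: the State enum, the accessor names, and the choice of IllegalStateException are all hypothetical (the real condy-based mechanism is described above as giving something like a StackOverflowError).

```java
public class LazyCycleSketch {
    // Hypothetical per-variable state, replacing the boolean I$ready flag.
    private enum State { UNINITIALIZED, IN_PROGRESS, READY }

    private static int I, J;
    private static State iState = State.UNINITIALIZED;
    private static State jState = State.UNINITIALIZED;

    static synchronized int i() {
        if (iState == State.IN_PROGRESS) {
            // Re-entering while our own initializer is still running means a cycle.
            throw new IllegalStateException("cycle while initializing I");
        }
        if (iState == State.UNINITIALIZED) {
            iState = State.IN_PROGRESS;
            I = j();              // models: static int I = Foo.J;
            iState = State.READY;
        }
        return I;
    }

    static synchronized int j() {
        if (jState == State.IN_PROGRESS) {
            throw new IllegalStateException("cycle while initializing J");
        }
        if (jState == State.UNINITIALIZED) {
            jState = State.IN_PROGRESS;
            J = i();              // models: static int J = I;
            jState = State.READY;
        }
        return J;
    }

    public static void main(String[] args) {
        try {
            i();
        } catch (IllegalStateException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

In this sketch a failed initialization leaves the state at IN_PROGRESS, so later accesses fail the same way. Whether a real design would make failures sticky like this or allow a retry is a separate decision that this example does not try to settle.
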
>> To me, the crux of the issue is to draw a crisp line between things that can be condified and things that cannot (and have to fall back to <clinit>). But, the more I think, the less I'm convinced that such a line exists, or at least that a 'meaningful' one exists.
> Every class gets one phase-point which is executed with
> mutual exclusion relative to any other access to the class.
> Programmers use this phase-point in a zillion ways.
> Surely there is no crisp characterization of all its uses.
>
> What we *can* do is tell programmers that, if they
> don't need a phase-point (or weren't even conscious
> that they were executing one), they can use lazies
> and get faster startup.
>
>> Since all method calls are potentially cycle-inducing (or forward-reference-inducing), lazy statics treatment cannot apply to an initializer that has a method call? And, you can have forward references or cycles through field access too (as above)... so, that seems to leave just simple constants, which doesn't seem to offer a lot of bang for the buck? Am I missing something?
> Any Turing-capable programming language is "potentially
> cycle-inducing" and finding which programs cycle is undecidable.
> We live with it.  I don't see how your reasoning applies here in
> a special way to lazies.
>
> This reminds me why the static variables have the restriction
> that one can't refer to the other if it's later in the file.  It's not
> because there is no possible use for this (hence the Foo.J escape)
> but because most code benefits from a gentle static error check
> that helps the programmer prove that the uninitialized values
> of variables are not being used.
>
> (For local variables, the DU/DA rules perform the same check,
> more strictly.)
>
> Lazy statics will benefit from the same gentle checks.  There
> will be some legitimate occasions to use the Foo.J escape
> to create a lexical loop that you (the human programmer)
> know won't turn into a dynamic loop.  Better yet, if you
> don't use the Foo.J escape, then you know your lazies
> won't loop.  So I think these features hang together.
>
>> Sidebar: as VM machinery, I'd love to see something like DynamicValue - in fact I was telling Sundar the other day that, if we had it, we could solve some thorny issues we have on Panama/jextract bytecode generation. But as a language construct, this feels shaky; maybe that's just me?
> Yes, I want DynamicValue sooner rather than later.
>
> The potential looping doesn't bother me.
>
> The syntax paradigm of stuffing everything into a field initializer
> is annoyingly restrictive, but will be good enough to start with.
>
> We can relax it later, I think, about the same time we do the
> non-static version of the feature.  What I mean is that some
> use cases for lazies have a use-site formulation, where (within
> the class at least) each use of a lazy potentially comes with a
> proposed value; there need not be a centralized point (the
> def-site of the lazy) where the lazy's value is defined.  This is
> true less for statics and more for non-statics.
>
> Arguably, a use-site lazy mechanism is a wholly separate
> language feature, but I think they should be lumped if possible.
>
> And I think it's possible; that a use-site lazy generalizes
> a def-site lazy in the sense that the central def-site
> value (if any) is a default to be overridden in context
> by the proposed use-site value (if any).
>
> — John
>
> P.S. I suppose there is rare legitimate code you might write where
> the lexical dependency of statics has a loop, which the dynamic logic
> of the program breaks.
>
> (Note that programs have loops of all sorts, and we trust them
> to break the loops dynamically even when we can't prove statically
> that they terminate.)
>
> Here's a silly example:
>
> static final int I = (J_FIRST ? computeIFromJ(Foo.J = computeJ()) : computeI());
> static int J = (J_FIRST ? Foo.J : computeJ());
>
> This is what "static { }" blocks and blank finals are for.
>
> The initial version of the Java language, which specified
> the reference checks on statics in static initializers,
> also supported the "Foo.J" oddity, and it *didn't* have
> blank finals and their associated definite assignment rules.
> Those extra rules augment the initialization checks
> by allowing a static final to omit its initializer but
> requiring an eventual initialization somewhere.
>
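
For comparison, the silly example above can be written with a static block and blank finals, so that the definite assignment rules check that each field is initialized exactly once. This is a sketch only: the value of J_FIRST and the bodies of computeI, computeJ and computeIFromJ are made up for illustration, and J is made final here (it is not final in the original example) so that the blank-final rules apply to it as well.

```java
public class BlankFinalSketch {
    static final boolean J_FIRST = true; // hypothetical setting

    // Blank finals: no initializer here, but the compiler requires
    // exactly one assignment on every path through the static block.
    static final int I;
    static final int J;

    static {
        if (J_FIRST) {
            J = computeJ();
            I = computeIFromJ(J);
        } else {
            I = computeI();
            J = computeJ();
        }
    }

    // Placeholder computations, assumed purely for the example.
    static int computeJ() { return 2; }
    static int computeIFromJ(int j) { return j + 1; }
    static int computeI() { return 10; }

    public static void main(String[] args) {
        System.out.println(I); // prints 3
        System.out.println(J); // prints 2
    }
}
```

The ordering decision lives in ordinary control flow inside the static block, so no Foo.J escape is needed and the compiler still proves that neither field is read before it is assigned.
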