lazy statics design notes
John Rose
john.r.rose at oracle.com
Fri Mar 1 02:14:11 UTC 2019
On Feb 28, 2019, at 5:30 PM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
>
> Question: in which category of badness does this belong?
>
> class Foo {
>     static int I = Foo.J;
>     static int J = 2;
>
>     public static void main(String[] args) {
>         System.out.println(I); // prints 0
>         System.out.println(J); // prints 2
>     }
> }
>
> The language allows forward references to static fields, assuming you use a _qualified_ name (don't ask me why :-)).
I remember convincing myself long ago that this was semi-justified.
> But I guess this is similar to a static init calling a static method which returns the value of a not-yet-initialized static. In any case, it feels like, if we condify these, we would change semantics, as suddenly "I" would be initialized to 2 too?
Yes, that seems likely. A naive account of the semantics
of lazy statics would be that each lazy static is "really"
obtained via a static method which serves as its accessor.
This static method contains something like this:
    public static @Synthetic int get$I() {
        if (!I$ready) {
            I = (…init expr here…);
            I$ready = true;
        }
        return I;
    }
The JMM makes extra demands here, of course, which only
the JVM (or var handles) can properly satisfy.
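For concreteness, here is a minimal sketch of that accessor done
with var handles and double-checked locking; the field names, the
use of the class monitor as the lock, and the 42 standing in for
the init expression are all illustrative:

    import java.lang.invoke.MethodHandles;
    import java.lang.invoke.VarHandle;

    class Foo {
        private static int I;           // backing store
        private static boolean I$ready; // published with release/acquire

        private static final VarHandle READY;
        static {
            try {
                READY = MethodHandles.lookup()
                        .findStaticVarHandle(Foo.class, "I$ready", boolean.class);
            } catch (ReflectiveOperationException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        public static int get$I() {
            // Acquire load: if the flag reads true, the write to I is visible too.
            if (!(boolean) READY.getAcquire()) {
                synchronized (Foo.class) {
                    if (!(boolean) READY.getAcquire()) {
                        I = 42; // stand-in for (…init expr here…)
                        READY.setRelease(true); // publish I before raising the flag
                    }
                }
            }
            return I;
        }
    }

The release store on the flag, paired with the acquire load on the
fast path, is what makes the plain read of I safe afterward.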
> Also, beware of cycles!
>
> class Foo {
>     static int I = Foo.J;
>     static int J = I;
>
>     public static void main(String[] args) {
>         System.out.println(I); // prints 0
>         System.out.println(J); // prints 0
>     }
> }
>
> I think a condified translation would throw or run into an endless loop?
Condy has a "no cycle" clause in its contract which we
can just reuse. You get something like a SOE (StackOverflowError).
The naive semantics can model this by tracking
states more carefully on the I$ready variable.
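One way to sketch that more careful tracking (a single-threaded
model only; the accessor names and the deliberate cycle are
illustrative):

    class Foo {
        private enum State { UNSET, IN_PROGRESS, READY }

        private static int I;
        private static State I$state = State.UNSET;

        // The extra IN_PROGRESS state turns a cyclic re-entry into
        // an error instead of a silent read of the default 0.
        static int get$I() {
            switch (I$state) {
                case READY:
                    return I;
                case IN_PROGRESS:
                    throw new StackOverflowError("cyclic initialization of I");
                default:
                    I$state = State.IN_PROGRESS;
                    I = get$J(); // the init expr; here it loops back, for illustration
                    I$state = State.READY;
                    return I;
            }
        }

        static int get$J() { return get$I(); } // deliberately cyclic
    }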
> To me, the crux of the issue is to draw a crisp line between things that can be condified and things that cannot (and have to fall back to <clinit>). But the more I think, the less I'm convinced that such a line exists, or at least that a 'meaningful' one exists.
Every class gets one phase-point which is executed with
mutual exclusion relative to any other access to the class.
Programmers use this phase-point in a zillion ways.
Surely there is no crisp characterization of all its uses.
What we *can* do is tell programmers that, if they
don't need a phase-point (or weren't even conscious
that they were executing one), they can use lazies
and get faster startup.
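For comparison, the initialization-on-demand holder idiom is how
that deferral is usually spelled today, by borrowing a nested
class's own phase-point (class and member names are illustrative):

    import java.util.Properties;

    class Config {
        // The nested class has its own phase-point, so the expensive
        // work runs on first use of props(), not when Config itself
        // is initialized.
        private static class Holder {
            static final Properties PROPS = loadExpensiveProps();
        }

        static Properties props() {
            return Holder.PROPS; // triggers Holder's <clinit> on first call
        }

        private static Properties loadExpensiveProps() {
            Properties p = new Properties();
            p.setProperty("example", "value"); // stand-in for real work
            return p;
        }
    }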
> Since all method calls are potentially cycle-inducing (or forward-reference-inducing), the lazy statics treatment cannot apply to an initializer that has a method call? And you can have forward references or cycles through field access too (as above)... so that seems to leave just simple constants, which doesn't seem to offer a lot of bang for the buck? Am I missing something?
Any Turing-capable programming language is "potentially
cycle-inducing" and finding which programs cycle is undecidable.
We live with it. I don't see how your reasoning applies here in
a special way to lazies.
This reminds me why static variables have the restriction
that one can't refer to another that comes later in the file. It's not
because there is no possible use for this (hence the Foo.J escape)
but because most code benefits from a gentle static error check
that helps the programmer prove that the uninitialized values
of variables are not being used.
(For local variables, the DU/DA rules perform the same check,
more strictly.)
Lazy statics will benefit from the same gentle checks. There
will be some legitimate occasions to use the Foo.J escape
to create a lexical loop that you (the human programmer)
know won't turn into a dynamic loop. Better yet, if you
don't use the Foo.J escape, then you know your lazies
won't loop. So I think these features hang together.
> Sidebar: as a piece of VM machinery, I'd love to see something like DynamicValue - in fact I was telling Sundar the other day that, if we had it, we could solve some thorny issues we have with Panama/jextract bytecode generation. But as a language construct this feels shaky; or maybe that's just me?
Yes, I want DynamicValue sooner rather than later.
The potential looping doesn't bother me.
The syntax paradigm of stuffing everything into a field initializer
is annoyingly restrictive, but will be good enough to start with.
We can relax it later, I think, about the same time we do the
non-static version of the feature. What I mean is that some
use cases for lazies have a use-site formulation, where (within
the class at least) each use of a lazy potentially comes with a
proposed value; there need not be a centralized point (the
def-site of the lazy) where the lazy's value is defined. This is
true less for statics and more for non-statics.
Arguably, a use-site lazy mechanism is a wholly separate
language feature, but I think they should be lumped if possible.
And I think it's possible; that a use-site lazy generalizes
a def-site lazy in the sense that the central def-site
value (if any) is a default to be overridden in context
by the proposed use-site value (if any).
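To make that relationship concrete, here is a rough library-level
model, not a design: the def-site initializer, if present, is just
the default proposal, and any use site may supply its own (all names
hypothetical):

    import java.util.concurrent.atomic.AtomicReference;
    import java.util.function.Supplier;

    class LazyCell<T> {
        private final AtomicReference<T> value = new AtomicReference<>();
        private final Supplier<T> defSiteDefault; // null means "no central init"

        LazyCell(Supplier<T> defSiteDefault) {
            this.defSiteDefault = defSiteDefault;
        }

        // Use-site form: this access proposes a value if none is bound yet.
        T get(Supplier<T> useSiteProposal) {
            T v = value.get();
            if (v == null) {
                // Racy single assignment: a losing proposal is simply discarded.
                value.compareAndSet(null, useSiteProposal.get());
                v = value.get();
            }
            return v;
        }

        // Def-site form: fall back to the central initializer.
        T get() {
            return get(defSiteDefault);
        }
    }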
— John
P.S. I suppose there is rare legitimate code you might write where
the lexical dependency of statics has a loop, which the dynamic logic
of the program breaks.
(Note that programs have loops of all sorts, and we trust them
to break the loops dynamically even when we can't prove statically
that they terminate.)
Here's a silly example:
    static final int I = (J_FIRST ? computeIFromJ(Foo.J = computeJ()) : computeI());
    static int J = (J_FIRST ? Foo.J : computeJ());
This is what "static { }" blocks and blank finals are for.
The initial version of the Java language, which specified
the reference checks on statics in static initializers,
also supported the "Foo.J" oddity, and it *didn't* have
blank finals and their associated definite assignment rules.
Those extra rules augment the initialization checks
by allowing a static final to omit its initializer but
requiring an eventual initialization somewhere.
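That is, with the later rules one can write (names illustrative):

    class Blank {
        static final int K; // blank final: initializer omitted at the declaration

        static {
            K = computeK(); // definite assignment: exactly once, on every path
        }

        private static int computeK() { return 42; }
    }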