<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">Thanks, Dan, for sharing the
investigation and for asking the right questions. A few
comments inline. </font></font><br>
<br>
<div class="moz-cite-prefix">On 12/7/2022 10:52 AM, Dan Heidinga
wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAJq4Gi5LmowYZ_ernqM7bdTj+5+-h8Ht+DNZmTjOUSsBoR4SfA@mail.gmail.com">
<div dir="ltr">Continuing on the Class init progression
discussion....
<div><br>
</div>
<div>Why don't we put every static field in its own class?</div>
</div>
</blockquote>
<br>
Pedantic correction: we're only talking about static finals with
initializers. Mutable statics have arbitrarily complicated
initialization lifecycles, and that's just how it is; static finals
that are initialized in `static { }` blocks already have their
lifecycle complected with other writes in those blocks. <br>
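<br>
To make the distinction concrete, here is a minimal sketch of the three cases. The class and field names are made up for illustration; only case (1) is what we're talking about here. <br>
<pre>
```java
// Hypothetical class, for illustration only.
final class Kinds {
    // (1) static final with an initializer: the case under discussion.
    static final long START = System.nanoTime();

    // (2) static finals assigned in a static block: their setup is
    // interleaved with whatever else the block happens to write.
    static final int WIDTH;
    static final int HEIGHT;
    static int blockRuns;
    static {
        WIDTH = Integer.parseInt("3");
        blockRuns++;            // unrelated write in the same block
        HEIGHT = WIDTH * 2;     // depends on the earlier assignment
    }

    // (3) mutable static: can be written at any time, by any code.
    static int counter;
    static void bump() { counter++; }
}
```
</pre>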
<br>
<blockquote type="cite" cite="mid:CAJq4Gi5LmowYZ_ernqM7bdTj+5+-h8Ht+DNZmTjOUSsBoR4SfA@mail.gmail.com">
<div dir="ltr">
<div>The obvious answer is that it's too much mental load for
developers. But if we put that aside for a moment, and assume
that we have infinitely smart developers, it might be useful
to understand why we don't program like this now. Or what
programming like this might actually look like.</div>
<div><br>
</div>
<div>Putting every static field in its own class trivially gives
us lazy static fields (sorry John, no new JEP required in this
world) with each static only being initialized when actually
accessed.</div>
<div><br>
</div>
<div>It gives each static field a clear initialization point
where we can more easily tell what caused a particular static
to be initialized.</div>
<div><br>
</div>
<div>It makes it easier to determine the true dependency graph
between static fields rather than today's "soupy" model.</div>
</div>
</blockquote>
<br>
Some possible reasons (just brainstorming here): <br>
<br>
- It's more code, both at the declaration site (wrap it in a class)
and the use site (qualify it with a class name). Developers
instantly see this cost, but it may take longer to see the
benefit. <br>
- Perception that this is more heavyweight, since classes are
"obviously" more heavyweight than variables. <br>
- Thinking about lifecycles is hard. If the easy thing -- declare
a bunch of statics and initialize them -- works, this is what
developers will do, and are unlikely to revisit it until something
doesn't work. <br>
- More importantly, lifecycle mostly becomes relevant when your
code is used in a bigger system, and at coding time, that's a
distant-future worry. Like other crosscutting concerns such as
concurrency and security, thinking about deployment / redeployment /
startup characteristics is hard to focus on when you're trying to
get your code to work, and it's easy to forget to go back and think
about it after you get your code to work. <br>
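<br>
For concreteness, a rough sketch of the "one class per static final" idiom next to the conventional form. All names here (Trace, Settings, Table) are hypothetical, and the Trace log exists only to make the initialization points observable. <br>
<pre>
```java
// Hypothetical helper that records when an initializer actually runs.
final class Trace {
    static final StringBuilder LOG = new StringBuilder();
    static String note(String what) {
        LOG.append(what).append(';');
        return what;
    }
}

final class Settings {
    // Conventional form: runs as part of Settings' own class initialization.
    static final String NAME = Trace.note("NAME");

    // Wrapped form: runs only when Settings.Table is initialized,
    // that is, on the first access to Settings.Table.VALUE.
    static final class Table {
        static final String VALUE = Trace.note("TABLE");
    }
}
```
</pre>
Reading Settings.NAME initializes Settings but not Settings.Table; the TABLE initializer runs only on the first read of Settings.Table.VALUE. The extra nested class and the extra qualifier at the use site are the cost from the first bullet. <br>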
<br>
So, I think the answer is: people follow the path of least
resistance, and the path of least resistance here leads to someplace
"good enough" to get things working but which sows the seed for
long-term technical debt. The PoLR today is good enough that people
can get to something that mostly works without thinking very hard.
If we can make the PoLR lead someplace better, that's what winning
will look like. <br>
<br>
<blockquote type="cite" cite="mid:CAJq4Gi5LmowYZ_ernqM7bdTj+5+-h8Ht+DNZmTjOUSsBoR4SfA@mail.gmail.com">
<div dir="ltr">
      <div>It doesn't solve the "soupy" &lt;clinit&gt; problem as
        developers can still do arbitrary things in the &lt;clinit&gt;
        but it does reduce the problem as it moves a lot of code out
        of the common &lt;clinit&gt; as each static now has its own
        &lt;clinit&gt;.  Does this make analysis more tractable? <br>
</div>
</div>
</blockquote>
<br>
I agree with your (implicit) intuition that if we could get to a
world where we only complected initialization lifecycles rarely,
rather than routinely, then it would be more practical to
characterize those as "weirdo" cases for which the answer is
"rewrite/don't use that code if you want &lt;benefit X&gt;". The
problem today is that way too much code uses the existing soupy
mechanisms -- but only some smaller fraction of it, which is hard to
identify either by human or automated analysis, implicitly depends
on the initialization-order semantics of the existing mechanisms. <br>
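<br>
As an illustration of the kind of implicit dependence I mean, here is a small sketch (hypothetical class, contrived on purpose) whose result depends on the textual order of two static final initializers. <br>
<pre>
```java
// Hypothetical class; shows a hidden dependence on textual initialization order.
final class Limits {
    // Runs first. readMax() reads MAX before MAX's initializer has run,
    // so it sees the default value 0 and DOUBLE_MAX ends up as 0.
    static final int DOUBLE_MAX = readMax() * 2;

    // Not a compile-time constant (it is a method call), so it is
    // assigned during class initialization, after DOUBLE_MAX.
    static final int MAX = Integer.parseInt("21");

    static int readMax() { return MAX; }
}
```
</pre>
Swapping the two declarations changes DOUBLE_MAX from 0 to 42. Nothing in the source flags that dependence, which is part of why it is hard to find by inspection or by tooling. <br>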
<br>
<blockquote type="cite" cite="mid:CAJq4Gi5LmowYZ_ernqM7bdTj+5+-h8Ht+DNZmTjOUSsBoR4SfA@mail.gmail.com">
<div dir="ltr">In our investigation [0], we focused on the
underlying JVM physics of classes and looked at the memory use
of this approach. Which was estimated to average out to under
1K per class.</div>
</blockquote>
<br>
Semantics and boilerplate aside, this seems amenable to a "Loomy"
move, which is: "make the expensive thing less expensive, rather
than asking users to resort to complex workarounds." <br>
<br>
<blockquote type="cite" cite="mid:CAJq4Gi5LmowYZ_ernqM7bdTj+5+-h8Ht+DNZmTjOUSsBoR4SfA@mail.gmail.com">
<div dir="ltr">
<div>What do other languages do with their equivalent of static
state? Are there different design points for expressing static
state we should be investigating to better enable shifting
computation to different points in time?</div>
</div>
</blockquote>
<br>
One of the things that accidentally makes our lives harder here is
that most other languages do not specify semantics as carefully as
Java does, so the answer is sometimes "whatever the implementation
does." For better or worse, Java is much more precise at specifying
what triggers class initialization. <br>
<br>
Looking at the most Java-like languages:<br>
<br>
- C# allows members to be declared static, supports field
initializers like Java, and supports "static constructors" (similar
to `static { }` blocks in Java, but with a constructor-like syntax)
which are run at class initialization time. If a static constructor
is present, it does the same soupy thing, where field initializers
are run in textual order prior to running the static constructor; if
no static constructor is present, the spec is cagey about when
static field initializers are run, but they appear to all be run in
the textual order:<br>
<br>
<blockquote type="cite">14.5.6.2 Static field initialization<br>
The static field variable initializers of a class correspond to a
sequence of assignments that are executed in the textual order in
which they appear in the class declaration (§14.5.6.1). Within a
partial class, the meaning of “textual order” is specified by
§14.5.6.1. If a static constructor (§14.12) exists in the class,
execution of the static field initializers occurs immediately
prior to executing that static constructor. Otherwise, the static
field initializers are executed at an implementation-dependent
time prior to the first use of a static field of that class.</blockquote>
<br>
- Scala and Kotlin ditched "static" as a modifier, instead offering
"companion objects" (singleton classes). While the two models are
equally expressive, companion objects have us syntactically
segregate the static parts of a class into a single entity, and
encourage us to think about the static parts as a whole rather than
individual members. <br>
<br>
Kotlin: <br>
    class X { <br>
    &nbsp;&nbsp;&nbsp; companion object { <br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // per-class fields and methods here<br>
    &nbsp;&nbsp;&nbsp; }<br>
    }<br>
<br>
Members of the companion object can be qualified with the class
name, or used unqualified, just as in Java. <br>
<br>
Scala lets you declare something similar as a top level entity:<br>
<br>
class X { ... }<br>
object X { ... }<br>
<br>
with more complex rules that treat a class and an object with the
same name as being two facets of the same entity. (You can have an
object separate from a class; it's just a class whose members are
effectively static and which is initialized the first time one of
its members is accessed.)<br>
<br>
The approach of companion objects rather than static members
provides a useful nudge to thinking of the static parts of a class
as being a single, independent entity.<br>
<br>
<br>
</body>
</html>