A class per static field? Why or why not?
Dan Heidinga
heidinga at redhat.com
Mon Dec 12 16:56:20 UTC 2022
On Wed, Dec 7, 2022 at 11:40 AM Brian Goetz <brian.goetz at oracle.com> wrote:
> Thanks, Dan, for sharing the investigation and for asking the right
> questions. A few comments inline.
>
> On 12/7/2022 10:52 AM, Dan Heidinga wrote:
>
> Continuing on the Class init progression discussion....
>
> Why don't we put every static field in its own class?
>
>
> Pedantic correction: we're only talking about static finals with
> initializers. Mutable statics have arbitrarily complicated initialization
> lifecycles, and that's just how it is; static finals that are initialized
> in `static { }` blocks already have their lifecycle complected with other
> writes in those blocks.
>
> The obvious answer is that it's too much mental load for developers. But
> if we put that aside for a moment, and assume that we have infinitely smart
> developers, it might be useful to understand why we don't program like this
> now. Or what programming like this might actually look like.
>
> Putting every static field in its own class trivially gives us lazy static
> fields (sorry John, no new JEP required in this world) with each static
> only being initialized when actually accessed.
>
> It gives each static field a clear initialization point where we can more
> easily tell what caused a particular static to be initialized.
>
> It makes it easier to determine the true dependency graph between static
> fields rather than today's "soupy" model.
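
For concreteness, here is a minimal sketch of what "one class per static final" might look like in today's Java. All names (InitLog, Config, Greeting, Table, init) are hypothetical and only for illustration; the log exists just to make the initialization points observable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records the order in which initializers actually run.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
    private InitLog() {}
}

final class Config {
    private Config() {}

    // Each nested class holds exactly one static final, so each one
    // gets its own <clinit> and its own initialization point.
    static final class Greeting {
        static final String VALUE = init("Greeting", "hello");
    }

    static final class Table {
        static final int[] VALUE = init("Table", new int[] {1, 2, 3});
    }

    // Logs which holder is being initialized, then returns the value unchanged.
    static <T> T init(String name, T value) {
        InitLog.EVENTS.add(name);
        return value;
    }
}

class LazyStaticsDemo {
    public static void main(String[] args) {
        System.out.println(InitLog.EVENTS);             // prints []
        System.out.println(Config.Table.VALUE.length);  // prints 3
        System.out.println(InitLog.EVENTS);             // prints [Table]
    }
}
```

Note the use-site cost: every access is spelled Config.Table.VALUE rather than a bare field name, and Greeting is never initialized unless something reads it.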
>
>
> Some possible reasons (just brainstorming here):
>
> - It's more code, both at the declaration site (wrap it in a class) and
> the use site (qualify it with a class name). Developers instantly see this
> cost, but it may take longer to see the benefit.
> - Perception that this is more heavyweight, since classes are "obviously"
> more heavyweight than variables.
> - Thinking about lifecycles is hard. If the easy thing -- declare a
> bunch of statics and initialize them -- works, this is what developers will
> do, and are unlikely to revisit it until something doesn't work.
> - More importantly, lifecycle mostly becomes relevant when your code is
> used in a bigger system, and at coding time, that's a distant-future
> worry. Like other crosscutting concerns such as concurrency and security,
> thinking about deployment / redeployment / startup characteristics is hard
> to focus on when you're trying to get your code to work, and it's easy to
> forget to go back and think about it after you get your code to work.
>
> So, I think the answer is: people follow the path of least resistance, and
> the path of least resistance here leads to someplace "good enough" to get
> things working but which sows the seed for long-term technical debt. The
> PoLR today is good enough that people can get to something that mostly
> works without thinking very hard. If we can make the PoLR lead someplace
> better, that's what winning will look like.
>
>
+1. One additional challenge here is that the deployment model affects the
end destination of the path. Paving the PoLR to make lazy init more common /
easier can make earlier init (build time or shifted early) more difficult,
and vice versa.

The PoLR should ideally lead developers to say "as early as possible (build
time) or as lazy as possible, I don't care which" so the VM has as much
freedom as possible. Really what they often want to say is "don't affect
my startup time with this operation", but they don't have a good way to
express that both early and late are valid solutions.

> It doesn't solve the "soupy" <clinit> problem as developers can still do
> arbitrary things in the <clinit> but it does reduce the problem as it moves
> a lot of code out of the common <clinit> as each static now has its own
> <clinit>. Does this make analysis more tractable?
>
>
> I agree with your (implicit) intuition that if we could get to a world
> where we only complected initialization lifecycles rarely, rather than
> routinely, then it would be more practical to characterize those as
> "weirdo" cases for which the answer is "rewrite/don't use that code if you
> want <benefit X>". The problem today is that way too much code uses the
> existing soupy mechanisms -- but only some smaller fraction of it, which is
> hard to identify either by human or automated analysis, implicitly depends
> on the initialization-order semantics of the existing mechanisms.
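
A small illustration of that kind of implicit dependence, as a sketch with hypothetical names (Soupy, FIRST, SECOND): the value of one static final silently depends on the textual order of the initializers in the shared <clinit>.

```java
// Hypothetical example: FIRST observes SECOND before SECOND has been assigned.
final class Soupy {
    private Soupy() {}

    // Initializers run in textual order, so this runs before SECOND is set.
    static final int FIRST = peekSecond() + 1;
    static final int SECOND = compute();

    // Reading through a method sidesteps the forward-reference check
    // and sees the default value 0 while <clinit> is still running.
    static int peekSecond() { return SECOND; }

    // A method call keeps SECOND from being a compile-time constant.
    static int compute() { return 41; }
}

class SoupyDemo {
    public static void main(String[] args) {
        System.out.println(Soupy.FIRST);   // prints 1, not 42
        System.out.println(Soupy.SECOND);  // prints 41
    }
}
```

Swapping the two declarations changes FIRST to 42, and nothing in the source marks that dependence, which is roughly why this is hard to spot by human or automated analysis.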
>
> In our investigation [0], we focused on the underlying JVM physics of
> classes and looked at the memory use of this approach, which was estimated
> to average out to under 1K per class.
>
>
> Semantics and boilerplate aside, this seems amenable to a "Loomy" move,
> which is: "make the expensive thing less expensive, rather than asking
> users to resort to complex workarounds."
>
> What do other languages do with their equivalent of static state? Are
> there different design points for expressing static state we should be
> investigating to better enable shifting computation to different points in
> time?
>
>
> One of the things that accidentally makes our lives harder here is that
> most other languages do not specify semantics as carefully as Java does, so
> the answer is sometimes "whatever the implementation does." For better or
> worse, Java is much more precise at specifying what triggers class
> initialization.
>
> Looking at the most Java-like languages:
>
> - C# allows members to be declared static, supports field initializers
> like Java, and supports "static constructors" (similar to `static { }`
> blocks in Java, but with a constructor-like syntax) which are run at class
> initialization time. If a static constructor is present, it does the same
> soupy thing, where field initializers are run in textual order prior to
> running the static constructor; if no static constructor is present, the
> spec is cagey about when static field initializers are run, but they appear
> to all be run in the textual order:
>
> 14.5.6.2 Static field initialization
> The static field variable initializers of a class correspond to a sequence
> of assignments that are executed in the textual order in which they appear
> in the class declaration (§14.5.6.1). Within a partial class, the meaning
> of “textual order” is specified by §14.5.6.1. If a static constructor
> (§14.12) exists in the class, execution of the static field initializers
> occurs immediately prior to executing that static constructor. Otherwise,
> the static field initializers are executed at an implementation-dependent
> time prior to the first use of a static field of that class.
>
>
> - Scala and Kotlin ditched "static" as a modifier, instead offering
> "companion objects" (singleton classes). While the two models are equally
> expressive, companion objects have us syntactically segregate the static
> parts of a class into a single entity, and encourage us to think about the
> static parts as a whole rather than individual members.
>
> Kotlin:
> class X {
>     companion object {
>         // per-class fields and methods here
>     }
> }
>
> Members of the companion object can be qualified with the class name, or
> used unqualified, just as in Java.
>
> Scala lets you declare something similar as a top level entity:
>
> class X { ... }
> object X { ... }
>
> with more complex rules that treat a class and an object with the same
> name as being two facets of the same entity. (You can have an object
> separate from a class; it's just a class whose members are effectively
> static and which is initialized the first time one of its members is
> accessed.)
>
> The approach of companion objects rather than static members provides a
> useful nudge to thinking of the static parts of a class as being a single,
> independent entity.
>
>
Independent entity, yes. Single, maybe. We group the statics of a class
into a single <clinit> today but we may want multiple groupings if we can
give them different initialization points (lifecycles). Maybe that's just
putting them in a different class but whatever we pick here will affect the
PoLR discussed above and "single" may not be the right model given the
classes we already have.
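
As a rough sketch of what "multiple groupings" could look like if it really is just putting them in different classes, assuming hypothetical names (Service, Limits, Messages, INIT_ORDER): each nested class is one group of statics that share a lifecycle, and the groups initialize independently of each other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one outer class, two groups of statics with separate lifecycles.
final class Service {
    private Service() {}

    // Only here to make the initialization points observable.
    static final List<String> INIT_ORDER = new ArrayList<>();

    // Group 1: values that belong together and are initialized together.
    static final class Limits {
        static final int MAX_ITEMS;
        static final int MAX_BYTES;
        static {
            INIT_ORDER.add("Limits");
            MAX_ITEMS = 16;
            MAX_BYTES = MAX_ITEMS * 1024;
        }
    }

    // Group 2: initialized only if something actually reads it.
    static final class Messages {
        static final String NOT_FOUND;
        static {
            INIT_ORDER.add("Messages");
            NOT_FOUND = "not found";
        }
    }
}

class GroupedStaticsDemo {
    public static void main(String[] args) {
        System.out.println(Service.Limits.MAX_BYTES);  // prints 16384
        System.out.println(Service.INIT_ORDER);        // prints [Limits]
    }
}
```

Within a group the usual textual-order rules still apply; what changes is that reading Limits no longer drags Messages along with it.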
--Dan
More information about the leyden-dev
mailing list