Field initialization before 'super'

Wed Dec 13 00:27:24 UTC 2023

In Valhalla we've been building on the language changes in JEP 447 (Statements Before Super) to move towards a safer and more reliable programming pattern for initializing final fields. Some of these ideas could make their way into the next iteration of Statements Before Super, to be further augmented with Value Classes (JEP 401).

Two key observations:

- Inside <init> methods, the JVM allows writes to instance fields of "uninitialized" objects, before the 'super()' call. (In fact, javac has long used this capability to initialize fields that store captured state of inner classes.)

- When a 'final' field is written before the 'super()' call, it is impossible to observe the field prior to its initialization. Thus, the field can be treated as truly immutable: every 'getfield' on the same instance will return the same value. (In contrast, under existing usage, a read of a final field may observe mutation while the object is still under construction.)
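
Both observations can be seen with today's javac. The following is a minimal sketch; the class and member names (EarlyWriteDemo, Base, Late, early, seenBySuper) are made up for illustration. A superclass constructor calls an overridable method. An ordinary final field assigned after 'super()' is seen as null at that point, while a captured local of an anonymous class, which javac stores in a synthetic field before 'super()', already has its value.

```java
class EarlyWriteDemo {
    static class Base {
        final String seenBySuper;

        Base() {
            // Runs before any subclass constructor body has executed.
            seenBySuper = describe();
        }

        String describe() {
            return "base";
        }
    }

    // Ordinary final field: assigned after super(), so Base() sees the default value.
    static class Late extends Base {
        private final String name;

        Late(String name) {
            super();
            this.name = name;
        }

        @Override
        String describe() {
            return "name=" + name;
        }
    }

    // Anonymous class capturing 'name': javac writes the synthetic capture
    // field before calling Base(), so the value is visible during super().
    static Base early(String name) {
        return new Base() {
            @Override
            String describe() {
                return "name=" + name;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(new Late("x").seenBySuper);  // field read before its write
        System.out.println(new Late("x").describe());   // same field after construction
        System.out.println(early("x").seenBySuper);     // capture written before super()
    }
}
```

The 'Late' case is the kind of observable mutation of a final field that early initialization is meant to rule out.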

To enable and take advantage of early field initialization, we've envisioned the following changes:

1) As an exception to the general rule about 'this' usage, a "pre-construction context" allows writes to blank instance fields of the class. (The terminology may need updating, since you're clearly "constructing" the object if you're writing to its fields.) The fields are "write-only" at this stage—you can write into them but can't read them back.

The regular DA/DU rules apply for final fields: they must be initialized exactly once by an initializer or by every 'super()'-calling constructor, whether in the prologue or the epilogue.

At a 'this()' call, all final fields must be DU (because the delegated constructor will perform its own writes). No such restriction is needed for non-final fields; but it's an open question whether we should prohibit all writes before 'this()' anyway.

Writes to non-final fields with initializers are disallowed, to avoid confusion about sequencing (the field initializer will always run later, overwriting whatever you put in the constructor prologue).

2) If a final field is written before 'super()' via every constructor in the class, it can be considered a "strict final" field. It will never be observed to mutate.

In the class file, ACC_STRICT is repurposed to indicate a strict final field. javac is responsible for identifying strict final fields. Existing early-initialized capture fields can probably be automatically counted as strict finals.

ACC_STRICT implies ACC_FINAL and !ACC_STATIC. Verification ensures that a 'putfield' for an ACC_STRICT field of the current class never occurs after the 'super()' call. (Specifically, the receiver type for the putfield must be 'uninitializedThis', not a class type.)

3) Immutability of strict finals is a strong guarantee. JVM internals may treat strict final fields as truly immutable, without supporting any deopt paths when unexpected mutation occurs.

The 'Field.setAccessible' method, which provides a standard API mechanism for mutating final fields, considers strict finals to be "non-modifiable", and will not enable reflective writes. (It already does the same for record fields.)
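
For comparison, here is a small sketch of the existing record behavior that strict finals would share. It assumes a JDK with records (16 or later); RecordReflectionDemo, Point, and tryReflectiveWrite are hypothetical names. On current JDKs the 'setAccessible(true)' call itself returns normally, but it does not grant write access, so the subsequent write is rejected.

```java
import java.lang.reflect.Field;

class RecordReflectionDemo {
    record Point(int x, int y) {}

    /** Returns true if the reflective write to Point.x succeeded. */
    static boolean tryReflectiveWrite(Point p, int newX) {
        try {
            Field f = Point.class.getDeclaredField("x");
            f.setAccessible(true);  // does not enable writes to record fields
            f.setInt(p, newX);      // rejected with IllegalAccessException
            return true;
        } catch (IllegalAccessException e) {
            return false;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        System.out.println("write succeeded: " + tryReflectiveWrite(p, 42));
        System.out.println(p);
    }
}
```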

Standard deserialization ensures that strict finals are set, and thus that their values are deserialized, before the object under construction is leaked to any user code. This probably means back references to an object from its own strict final fields are unsupported, and deserialize to 'null'. (Records already behave in this way.)
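
The record precedent can be sketched as follows, assuming a JDK with records (16 or later); RecordCycleDemo, Holder, Node, cyclicNode, and roundTrip are illustrative names only. A record refers to a mutable holder that points back at the record. After a round trip through standard serialization, the holder is restored but its back reference is null, because the record's component values are read before the record object exists.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class RecordCycleDemo {
    static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        Node owner;  // mutable back reference to the record
    }

    record Node(Holder holder) implements Serializable {}

    /** Builds a two-object cycle: node -> holder -> node. */
    static Node cyclicNode() {
        Holder h = new Holder();
        Node n = new Node(h);
        h.owner = n;
        return n;
    }

    /** Serializes and deserializes the node in memory. */
    static Node roundTrip(Node node) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(node);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Node) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node copy = roundTrip(cyclicNode());
        System.out.println("holder restored: " + (copy.holder() != null));
        System.out.println("back reference: " + copy.holder().owner);
    }
}
```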

Unsafe and JNI are capable of performing arbitrary, type-unsafe modifications to field storage. Clients who modify strict finals do so at their own risk, and JVM optimizations won't try to account for such usage.

-----

That covers "phase 1" for this feature. Eventually, we'll want to address questions like
- What about fields with initializers?
- Can I have my implicit 'super()' call go at the end of my constructor?
- Can javac check for me that my fields are strict?

These sorts of capabilities probably make sense to introduce with value classes, and perhaps retrofit on records. Further design work is needed to figure out how to release them for general consumption. All of that can be considered "phase 2", to come later.

But for Statements Before Super, we're just proposing to start with (1), (2), and (3).

I realize (2) and especially (3) are stretching the original concept of this JEP (which was purely language/compiler-oriented). But I think, from the end user's perspective, it will all feel like the same feature. If preferred, though, I could see doing those pieces in their own JEP in parallel with Statements Before Super.