FAQ: why no user-defined default instances?

John Rose john.r.rose at oracle.com
Wed Apr 19 22:51:35 UTC 2023


This is a recurrent question, so I wrote an answer.

https://cr.openjdk.org/~jrose/values/why-zero-defaults.html

To avoid link problems, here is the markdown source as well:

--------

Question:  Why not just allow me, the Valhalla user, to define my own 
default instance for my class?

After all, for a `Person` value class, the `String name` field should be 
an empty string by default, not `null`.  And for a `RationalNumber` 
value class, the `int denominator` of the default instance should be `1` 
not `0`, and if it is a `BigInteger` it should be `BigInteger.ONE`.  Why 
is Java refusing to give me control over those decisions?

Since the default instance is derived from a special constructor in the 
class declaration, why can't that allow the user to specify a different 
(not all-zero) default instance, as an ad hoc constructor body?  How 
hard can it be for a class to collect some default values for fields, 
and then “stamp out” their bit pattern into every new uninitialized 
variable of the class?  Isn’t it an unwelcome burden for users to cope 
with a mandated default value for their class?

Answer:  After long discussion and multiple re-evaluations, the Valhalla 
design team will support default values already naturally present in the 
Java language, but not others.  In short, fields will always start out 
with their natural zero value (which is `null` for references).  
Valhalla values can expose their own natural default values (bundles of 
zero field-values) but no other defaults.

This is a compromise of complexity versus expressiveness.  (Such 
compromises are typical in Java.)  There are a number of reasons for 
this one.

1. The complexity at the VM level does not seem to be justified since 
there are simple workarounds at the user level.  (See below for notes on 
workarounds and VM complexity.)

2. “Stamping out” zeroes is fundamentally simpler to do than 
“stamping out” some other pattern.  This is especially true of managed references:  
If you “stamp out” GC-managed references, you have to have a 
store-barrier conversation with the GC at each reference, if the GC is 
designed that way.  This factor all by itself will slow array creation 
down, no matter what other design decisions are taken.

3. Also, the all-zero value will be visible in some states, even though 
we are trying to make it disappear.  At least while initial field 
initializers are executed, and probably at other times as well. That 
will lead to confusion.
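
The visibility of zero values during initialization can already be observed in today's Java, without any Valhalla feature.  Here is a minimal sketch using an ordinary class; the name `ZeroPeek` and its fields are hypothetical and chosen only for illustration.

```java
final class ZeroPeek {
    // Initializers run in textual order, so this one runs first and
    // observes `denominator` before its own initializer has executed.
    private final int seenEarly = peek();

    // Deliberately non-final, so the compiler does not treat it as a constant.
    private int denominator = 1;

    private int peek() {
        return denominator;
    }

    int seenEarly() { return seenEarly; }
    int denominator() { return denominator; }

    public static void main(String[] args) {
        ZeroPeek z = new ZeroPeek();
        System.out.println(z.seenEarly());   // the zero default, not 1
        System.out.println(z.denominator()); // 1, once initialization is done
    }
}
```

A user-defined default would not remove this window; it would only add a second notion of "default" next to the zero one.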

Note that none of the options being discussed here makes any use of 
field initializers, because those apply to all constructors.  A new form 
of field initializer would be required to create a low-ceremony syntax 
for non-zero default instances.

The VM-defined default field values are admittedly not suitable, in many 
cases.  A class abstraction author has a right and duty to define and 
enforce the valid states of fields, and that may not include nulls or 
zeroes in the class fields.  In refusing to supply user-defined default 
field values, Valhalla requires users to employ workarounds for 
unsuitably initialized fields.

1. The simplest workaround is not to expose the default value.  Allow 
`null` to be the default value of your class, just as with all 
pre-Valhalla classes.  The standard workarounds for the `null` value 
apply in all cases.  Valhalla can succeed even if it doesn’t fully 
solve all problems with `null`.

2. A field being null (or some other zero) can be kept inside the class 
abstraction by defining an access method.  There is no particular reason 
why class fields must be made accessible to the public.  Instead, an 
access method can detect the unsuitable state (often `null`) and replace 
it with a better state, defined by the class author.  This involves a 
little more ceremony (a test and branch) than offered by a special 
syntax (one-time assignment) but it works.  It is of course a workaround 
used even before Valhalla.

3. As a variation of the test-and-branch workaround, a numeric zero 
value, if unsuitable, can be converted using exclusive OR to a different 
value, yielding code which is probably as fast as a raw field read.  The 
class would define a private static final constant `FDV` of the 
preferred field default value, and encode and decode field values 
appropriately, as `this.f0=f^FDV` and then `return f0^FDV`, where `f0` 
is the zero-default field in the class’s private implementation.
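
Workarounds 2 and 3 might be sketched as follows.  Ordinary final classes stand in for value classes here, since the Valhalla syntax is not assumed; the private `allZero` factory and the `new Person(null)` call only simulate the all-zero instance that the VM would supply, and the class shapes are hypothetical.

```java
final class Person {
    private final String name;  // null in the all-zero instance

    Person(String name) { this.name = name; }

    // Workaround 2: test and branch inside the access method.
    String name() { return name == null ? "" : name; }
}

final class Rational {
    // Preferred field default value for the denominator (the FDV of the text).
    private static final int FDV = 1;

    private final int numerator;
    // Stored XOR-encoded: a raw 0 in this field decodes to FDV.
    private final int denominator0;

    Rational(int numerator, int denominator) {
        this.numerator = numerator;
        this.denominator0 = denominator ^ FDV;   // encode on the way in
    }

    // Simulates the all-zero default instance: both raw fields are 0.
    private Rational() {
        this.numerator = 0;
        this.denominator0 = 0;
    }

    static Rational allZero() { return new Rational(); }

    int numerator() { return numerator; }

    // Workaround 3: decode on the way out, with no branch.
    int denominator() { return denominator0 ^ FDV; }
}
```

With these definitions, `new Person(null).name()` yields the empty string and `Rational.allZero().denominator()` yields `1`, while explicitly constructed values round-trip unchanged.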

We could allow an ad-hoc constructor body written by the user to 
“poke” non-default values into fields, and run it exactly once, 
capturing the instance state to use for all future default 
initializations.  But this would have a number of problems.

1. The initial value of `this` in fact has to be an all-zero default, 
which means the `aconst_init` bytecode, used at the beginning of all 
constructors including the initial one, must always yield the all-zero 
default, regardless of what the language says.  Therefore, there is 
always a risk of the all-zero default showing up later, due to an 
insufficiently protected use of `aconst_init`.

2. If the ad-hoc constructor body itself builds arrays or other objects 
requiring the default value, there is a vicious circularity, requiring 
new JVM specification language.

3. The ad-hoc constructor body must be assigned, by the JVM 
specification, an order to execute relative to the whole of `<clinit>`, 
and this again is not trivial to specify or implement.  Normally, object 
constructors run after a class is fully initialized (unless they are 
initiated during the `<clinit>` activity itself).  But in this case an 
object constructor, just the one special one, would presumably have to 
run either before all `<clinit>` activity (as an early initialization 
step) or else lazily in a separately specified and implemented phase of 
class setup.

In any case, array creation performance is likely to take some kind of 
performance hit.  The issue is that every array type will have to 
store the bit pattern to “stamp out” when creating an array that has 
an element class with a non-zero default.  One might think that this is 
a pay-as-you-go feature, incurring cost only to classes which define 
non-zero defaults, but that is not completely true.  There are places in 
the JVM and JDK where arrays (and other objects) are created from 
dynamically selected types.  Such places are likely to need a new 
slow-path branch to handle the possibility that the selected type 
requires special handling for a non-zero default.  Tests and 
branches are cheap but it cannot be assumed that they are free.
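
To make the slow-path concern concrete, here is a user-level model of what a dynamically typed array factory would have to do.  This is only a sketch under stated assumptions: the `NON_ZERO_DEFAULTS` registry and the `DynamicArrays` class are hypothetical, nothing like them exists in the JDK, and a real JVM would do this work internally rather than through a map.  It handles reference element types only.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Map;

final class DynamicArrays {
    // Hypothetical registry of element types that request a non-null default.
    private static final Map<Class<?>, Object> NON_ZERO_DEFAULTS =
            Map.of(String.class, "");

    static Object[] newArray(Class<?> elementType, int length) {
        // Today's path: the array comes back all-null, with no per-element work.
        Object[] array = (Object[]) Array.newInstance(elementType, length);

        // New test and branch, paid by every caller, even when it is not taken.
        Object dflt = NON_ZERO_DEFAULTS.get(elementType);
        if (dflt != null) {
            // Slow path: one reference store per element.
            Arrays.fill(array, dflt);
        }
        return array;
    }
}
```

Callers that never use a non-zero default still execute the lookup and the branch, which is the sense in which the feature is not purely pay-as-you-go.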

As noted above, managed references are a particularly difficult cost to 
control, when considering the initialization of non-zero defaults, 
particularly for arrays.  It is somewhat dispiriting to contemplate a 
flurry of GC activity to create the default state of an array, 
immediately followed by a copy of non-default values, initiating a 
second flurry on the same locations.  One might avoid this particular 
problem by allowing only primitive values to be “stamped out”, but 
that reduces the utility of a user-defined default, and forces users 
into workarounds anyway.

