Fwd: nullable-inlined-values on the heap
Brian Goetz
brian.goetz at oracle.com
Thu Jun 30 23:54:12 UTC 2022
From the -comments list.
My comments: this posting feels mostly like a solution without stating
what problem it is trying to solve, so it's pretty hard to comment on.
But ...
> Would it be possible to decomplect nullability from a variable's
> encoding-mode (reference or inline)?
Not in reality. A null is fundamentally a *reference* (or the absence
of a reference.) In theory, we could construct the union type int|Null,
but this type doesn't have a practical representation in memory, and
drags in all sorts of mismatches because union types would then flow
throughout the system. So the only practical way to represent "int or
null" is "reference to int." Which is to say, Integer (minus identity.)
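To make the point above concrete, here is a minimal Java sketch (the
class and method names are made up for illustration). A primitive int
has no spare state for null, so in today's Java "int or null" is spelled
Integer: a reference that may be absent.

```java
final class NullableIntDemo {
    // "int or null" as current Java expresses it: a reference to a boxed int.
    static Integer parseOrNull(String s) {
        try {
            return Integer.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null; // absence is encoded as the absence of a reference
        }
    }

    public static void main(String[] args) {
        Integer some = parseOrNull("42");
        Integer none = parseOrNull("not a number");
        System.out.println(some); // 42
        System.out.println(none); // null
        // int broken = none;     // compiles, but unboxing null throws NullPointerException
    }
}
```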
> If this is possible, maybe Valhalla's Java could have a user-model
> like this:
You should probably start with what problem you are trying to solve.
-------- Forwarded Message --------
Subject: nullable-inlined-values on the heap
Date: Thu, 30 Jun 2022 23:02:17 +0100
From: João Mendonça <jf.mend at gmail.com>
To: valhalla-spec-comments at openjdk.org
Hello,
Would it be possible to decomplect nullability from a variable's
encoding-mode (reference or inline)?
I have been looking at the C# spec on "nullable-value-types" and I
wonder if the Java runtime could do something similar under the hood to
allow nullable-inlined-values, even on the heap.
I think that, compared to C#'s "value-types", Java can take advantage of
the fact that its value-class instances are immutable, which makes
pass-by-value and pass-by-reference indistinguishable. Combined with
nullable-inlined-values, this could mean that the variable
encoding-mode can be completely encapsulated/hidden from the user-model
as a runtime implementation detail.
If this is possible, maybe Valhalla's Java could have a user-model like
this:
*** A decomplected user-model ***
For class-authors:
- *value-knob* to reject identity - Applicable on class declarations,
indicates that the class instances don't require identity (a value-class).
- *zero-knob* to indicate that the value-class has a zero-value - if a
value-class does not have a zero-value, its instances won't be inlined
in any shared-variables (§17.4.1.) since this is the only way for the
language to ensure the non-existence of the zero-value. If the
value-class is declared with a zero-value, then care must be taken when
reading/writing constructors since *no constructor invariant can exclude
the zero-value*.
- *tearable-knob* to allow tearing - Applicable on zero value-class
declarations with bitSize > 32 bits, may be used by the class-author to
hand the class-user the responsibility of how to avoid tearing, freeing
the runtime to always inline instances in shared-mutables (non-final
shared-variables). Conversely, if this knob is not used, instances will
be kept atomic, which allows the class-author to guarantee constructor
invariants *provided they're not broken by the zero-value*, which may be
useful for the class implementation and class-users to rely upon.
For class-users:
- *not-nullable-knob (!)* to exclude null from a variable's value-set -
Applicable on any variable declarations. On nullable variables, the
default value is null and, in either encoding-mode (reference or
inline), the runtime is free to choose the encoding for the extra bit of
information required to represent the null state.
- *atomic-knob* to avoid tearing - Applicable on shared-mutable
declarations, may be used to reverse the effect of the tearable-knob,
thereby restoring atomicity.
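As a rough illustration of the "extra bit" mentioned for nullable
variables, the sketch below models one possible inline encoding at user
level: the payload plus a presence flag, similar in spirit to the
HasValue/Value pair of C#'s Nullable<T>. The type NullableInt and its
members are hypothetical; the proposal leaves the actual encoding to the
runtime.

```java
final class NullableIntLayoutDemo {
    public static void main(String[] args) {
        NullableInt a = NullableInt.of(0);   // a real zero
        NullableInt b = NullableInt.NULL;    // "null", distinguishable from zero
        System.out.println(a.isNull() + " " + a.orElse(-1)); // false 0
        System.out.println(b.isNull() + " " + b.orElse(-1)); // true -1
    }
}

// Hypothetical flattened layout for a nullable int: payload plus one presence flag.
// With this choice the all-zero bit pattern (present == false, value == 0) means null,
// which matches "on nullable variables, the default value is null".
record NullableInt(boolean present, int value) {
    static final NullableInt NULL = new NullableInt(false, 0);

    static NullableInt of(int v) { return new NullableInt(true, v); }

    boolean isNull() { return !present; }

    int orElse(int fallback) { return present ? value : fallback; }
}
```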
The encoding-mode of a variable is decided at runtime according to this
ternary expression:
var encodingMode =
    !valueClass(variable.type)           ? REFERENCE   // value-knob
  : tooBig(variable.type.bitSize)        ? REFERENCE
  : !shared(variable)                    ? INLINE      // (§17.4.1.)
  : !zeroValueClass(variable.type)       ? REFERENCE   // zero-knob
  : final(variable)                      ? INLINE
  : atomicWrite(variable.type.bitSize)   ? INLINE
  : atomic(variable)                     ? REFERENCE   // atomic-knob
  : tearableValueClass(variable.type)    ? INLINE      // tearable-knob
  : REFERENCE;
The variable.type.bitSize depends on nullability as nullable types may
require more space.
The predicates tooBig and atomicWrite depend on the hardware. As an
example, they could be:
boolean tooBig(int bitSize) {return bitSize > 256;}
boolean atomicWrite(int bitSize) {return bitSize <= 64;}
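The decision chain above can be sketched as runnable Java. This is only
an illustration under stated assumptions: the VarInfo record, its field
names and the EncodingChooser class are invented here to stand in for
whatever metadata the runtime would really consult, and the two
thresholds are the example values given above.

```java
final class EncodingChooser {
    static boolean tooBig(int bitSize) { return bitSize > 256; }
    static boolean atomicWrite(int bitSize) { return bitSize <= 64; }

    // Same order of tests as the ternary expression in the proposal.
    static EncodingMode choose(VarInfo v) {
        return !v.valueClass()          ? EncodingMode.REFERENCE  // value-knob
             : tooBig(v.bitSize())      ? EncodingMode.REFERENCE
             : !v.shared()              ? EncodingMode.INLINE     // JLS 17.4.1
             : !v.zeroValueClass()      ? EncodingMode.REFERENCE  // zero-knob
             : v.isFinal()              ? EncodingMode.INLINE
             : atomicWrite(v.bitSize()) ? EncodingMode.INLINE
             : v.atomic()               ? EncodingMode.REFERENCE  // atomic-knob
             : v.tearableClass()        ? EncodingMode.INLINE     // tearable-knob
             : EncodingMode.REFERENCE;
    }

    public static void main(String[] args) {
        // A shared, non-final field of a 128-bit zero value-class declared tearable.
        VarInfo field = new VarInfo(true, true, true, 128, true, false, false);
        System.out.println(choose(field)); // INLINE
        // The same field with the atomic-knob applied by the class-user.
        VarInfo atomicField = new VarInfo(true, true, true, 128, true, false, true);
        System.out.println(choose(atomicField)); // REFERENCE
    }
}

enum EncodingMode { REFERENCE, INLINE }

// Hypothetical description of a variable and its type; all names are assumptions.
record VarInfo(
        boolean valueClass,      // value-knob set on the type
        boolean zeroValueClass,  // zero-knob set on the type
        boolean tearableClass,   // tearable-knob set on the type
        int bitSize,             // includes any extra space needed for null
        boolean shared,          // shared-variable in the sense of JLS 17.4.1
        boolean isFinal,         // the variable is final
        boolean atomic) {}       // atomic-knob set on the variable
```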
Table-view of the user-model knobs:
identity            ‖ (identity)  | value                                                     |
zeroness            ‖ (no-zero)   | (no-zero)         | zero                                  |
atomicity           ‖ (atomic)    | (atomic)          | (atomic)           | tearable         |
nullability         ‖ (?)  | !    | (?)     | !       | (?)    | !         | (?)  | !         |
===============================================================================================
encoding-mode       ‖ reference   | inline/reference                                          |
needs reference     ‖ everywhere  | shared-variables  | no/shared-mutables | no               |
definite-assignment ‖ no   | yes  | no      | yes     | no     | yes       | yes  | yes       |
default             ‖ null | n.a. | null    | n.a.    | null   | n.a.      | n.a. | n.a.      |
init-default        ‖ null        | null              | null   | zero/null | null | zero/null |
Notes:
- tokens in parentheses are the default when no knob is used
- definite-assignment (§16.) means that the compiler enforces (to the
best of its ability) variable initialization before usage
- default is the default-value of a variable when not definitely-assigned
- init-default is the default-value of a variable before any
initialization code runs
- on non-nullable zero value-classes, the init-default (zero or null)
depends on the encoding-mode chosen by the runtime
- on atomic zero value-classes, reference-encoding is needed on
shared-mutables if instance bitSize cannot be written atomically
*** Migration of value-based classes ***
Requiring definite-assignment on all non-nullable shared-mutables is
useful to get rid of missed-initialization-bugs, so I think it's a good
idea to require it wherever source-compatibility allows.
In this model, all value-based classes can be migrated to (atomic) zero
value-classes. Due to definite-assignment, even if LocalDate is migrated
to a zero value-class, it will be hard to get an accidental "Jan 1,
1970". Rational can also be a zero value-class but users will have to
keep in mind that it's possible to get a zero-denominator Rational, even
if the constructor throws when we try to build one.
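The Rational caveat can be sketched in today's Java. The class below is
a hypothetical stand-in, not a Valhalla API: the private no-argument
constructor and zeroValue() only simulate the runtime handing out the
all-zero-bits instance without running the checked constructor.

```java
final class Rational {
    final long num;
    final long den;

    // The checked constructor rejects a zero denominator.
    Rational(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("zero denominator");
        this.num = num;
        this.den = den;
    }

    // Simulation only: models the zero-value, which bypasses the check above.
    private Rational() {
        this.num = 0;
        this.den = 0;
    }

    static Rational zeroValue() { return new Rational(); }

    public static void main(String[] args) {
        try {
            new Rational(1, 0);
        } catch (IllegalArgumentException e) {
            System.out.println("constructor rejected: " + e.getMessage());
        }
        Rational z = zeroValue();
        System.out.println(z.num + "/" + z.den); // 0/0, despite the invariant
    }
}
```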
To maintain source-compatibility, no migrated value-based class can be
tearable, not even Double or Long, since wherever in existing code we
have a field declaration such as:
ValueBasedClass v;
v is always reference encoded and, therefore, atomic. For Double and
Long, this is a bit of an anomaly, because it means that for these two
primitives, and for them alone, the two field declarations in each of
the following pairs will not be semantically equivalent:
long v; // tearable
Long! v; // atomic
double d; // tearable
Double! d; // atomic
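For readers unfamiliar with tearing: JLS §17.7 allows a write to a
non-volatile long or double to be treated as two separate 32-bit
writes. The sketch below does not provoke a real race; it only computes
the kind of mixed value a reader could observe in that case. The class
and method names are illustrative, and the `Long!` syntax above is part
of the proposal, so it is not used here.

```java
final class TearingSketch {
    // Value a reader could see if it observed the high 32 bits of one write
    // and the low 32 bits of another.
    static long torn(long highFrom, long lowFrom) {
        return (highFrom & 0xFFFFFFFF00000000L) | (lowFrom & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        long a = 0L;
        long b = -1L; // all 64 bits set
        System.out.println(Long.toHexString(torn(a, b))); // ffffffff
        System.out.println(Long.toHexString(torn(b, a))); // ffffffff00000000
    }
}
```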
João Mendonça