nullable-inlined-values on the heap
João Mendonça
jf.mend at gmail.com
Fri Jul 1 05:02:04 UTC 2022
>
> My comments: this posting feels mostly like a solution without stating
> what problem it is trying to solve, so it's pretty hard to comment on.
>
The problem I am trying to solve is how to remove the .val and .ref
operators/knobs/concepts from the user-model without any loss of
performance or of control over nullability, zeroness or atomicity. In
other words, the objective is to take Kevin's ref-by-default idea one step
further.

> In theory, we could construct the union type int|Null, but this type
> doesn't have a practical representation in memory, ...
>
Would it be possible to have a value-class give rise to these 3
hidden/runtime-only companion-types on the heap:

  RefType  - reference to a value-instance or no-reference (null)
  ValType  - inlined [value-instance-fields]
  ValType? - inlined [nullability-boolean + value-instance-fields]

Then, the runtime could transparently choose between RefType|ValType for
non-nullable variables or between RefType|ValType? for nullable variables,
depending on hardware, bitSize, zeroness and atomicity constraints, as
explained by the ternary expression in my previous email. Of course, since
ValType? has a higher bitSize than ValType, nullable values will be less
likely to be inlined. But still, the point is: could nullable values
sometimes be inlined on the heap, as opposed to never being inlined?
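
As a rough illustration of the bitSize argument, here is a minimal Java
sketch of the three companion layouts. All names (Companion,
CompanionLayouts, REFERENCE_BITS) are made up for this example and nothing
like them exists in the JDK. It assumes a 64-bit reference and a
nullability flag that costs exactly one bit, whereas a real layout would
probably pad it. The atomicWrite predicate is the example one from the
original proposal quoted further down.

```java
// Hypothetical model of the three companion layouts; not a JDK API.
enum Companion { REF_TYPE, VAL_TYPE, NULLABLE_VAL_TYPE }

final class CompanionLayouts {
    // Assumption: a reference occupies 64 bits.
    static final int REFERENCE_BITS = 64;

    // Example hardware predicate taken from the original proposal.
    static boolean atomicWrite(int bitSize) {
        return bitSize <= 64;
    }

    // Bits needed to store one variable with the given layout.
    // Assumption: the nullability flag costs exactly one bit (no padding).
    static int bitSize(Companion layout, int fieldBits) {
        switch (layout) {
            case REF_TYPE:          return REFERENCE_BITS;
            case VAL_TYPE:          return fieldBits;
            case NULLABLE_VAL_TYPE: return fieldBits + 1;
            default: throw new AssertionError(layout);
        }
    }
}
```

Under these assumptions a 64-bit payload still passes atomicWrite as
ValType (64 bits) but not as ValType? (65 bits), so only the nullable
variant would fall back to RefType. A 32-bit payload passes in both forms.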

> In theory, we could construct the union type int|Null, but this (...) drags
> in all sorts of mismatches because union types would then flow throughout
> the system.
>
Is my 3-companion-types solution a real union type? Sure, I am suggesting
two sort-of-unions:

  RefType|ValType  - for non-nullable value-class variables
  RefType|ValType? - for nullable value-class variables

However, to the user, both types in each union represent the exact same
value-set.

On Fri, 1 Jul 2022 at 00:55, Brian Goetz <brian.goetz at oracle.com> wrote:
> From the -comments list.
>
> My comments: this posting feels mostly like a solution without stating
> what problem it is trying to solve, so it's pretty hard to comment on. But
> ...
>
> Would it be possible to decomplect nullability from a variable's
> encoding-mode (reference or inline)?
>
>
> Not in reality. A null is fundamentally a *reference* (or the absence of
> a reference.) In theory, we could construct the union type int|Null, but
> this type doesn't have a practical representation in memory, and drags in
> all sorts of mismatches because union types would then flow throughout the
> system. So the only practical way to represent "int or null" is "reference
> to int." Which is to say, Integer (minus identity.)
>
> If this is possible, maybe Valhalla's Java could have a user-model like
> this:
>
>
> You should probably start with what problem you are trying to solve.
>
>
>
>
>
> -------- Forwarded Message --------
> Subject: nullable-inlined-values on the heap
> Date: Thu, 30 Jun 2022 23:02:17 +0100
> From: João Mendonça <jf.mend at gmail.com>
> To: valhalla-spec-comments at openjdk.org
>
> Hello,
>
>
> Would it be possible to decomplect nullability from a variable's
> encoding-mode (reference or inline)?
>
> I have been looking at the C# spec on "nullable-value-types" and I wonder
> if the Java runtime could do something similar under the hood to allow
> nullable-inlined-values, even on the heap.
> I think that, compared to C#'s "value-types", Java can take advantage of
> the fact that its value-class instances are immutable, which means that
> pass-by-value and pass-by-reference are indistinguishable. Combined with
> nullable-inlined-values, this could mean that Java can have the variable
> encoding-mode completely encapsulated/hidden from the user-model as a
> runtime implementation detail.
>
> If this is possible, maybe Valhalla's Java could have a user-model like
> this:
>
>
> *** A decomplected user-model ***
>
> For class-authors:
>
> - *value-knob* to reject identity - Applicable on class declarations,
>   indicates that the class instances don't require identity (a value-class).
> - *zero-knob* to indicate that the value-class has a zero-value - If a
>   value-class does not have a zero-value, its instances won't be inlined in
>   any shared-variables (JLS §17.4.1) since this is the only way for the
>   language to ensure the non-existence of the zero-value. If the value-class
>   is declared with a zero-value, then care must be taken when reading/writing
>   constructors since *no constructor invariant can exclude the zero-value*.
> - *tearable-knob* to allow tearing - Applicable on zero value-class
>   declarations with bitSize > 32 bits, may be used by the class-author to
>   hand the class-user the responsibility for avoiding tearing, freeing the
>   runtime to always inline instances in shared-mutables (non-final
>   shared-variables). Conversely, if this knob is not used, instances will be
>   kept atomic, which allows the class-author to guarantee constructor
>   invariants *provided they're not broken by the zero-value*, which may be
>   useful for the class implementation and class-users to rely upon.
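
The zero-knob caveat can be approximated in today's Java. The sketch below
is only an analogy: Rational, ZERO_BITS and the private no-argument
constructor are hypothetical stand-ins for the all-zero instance that a
runtime would expose for a zero value-class, and no Valhalla syntax is used.

```java
// Hypothetical stand-in for a zero value-class, written as an ordinary class.
final class Rational {
    final long num;
    final long den;

    // The checking constructor enforces the invariant den != 0.
    Rational(long num, long den) {
        if (den == 0) {
            throw new IllegalArgumentException("zero denominator");
        }
        this.num = num;
        this.den = den;
    }

    // Simulates the zero-value: every field holds its all-zero bit pattern
    // and the checking constructor never runs.
    private Rational() {
        this.num = 0;
        this.den = 0;
    }

    static final Rational ZERO_BITS = new Rational();
}
```

Calling new Rational(1, 0) throws, yet Rational.ZERO_BITS.den is 0, which
is the situation the zero-knob bullet warns about: the invariant holds for
every constructed instance but not for the zero-value.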
>
> For class-users:
>
> - *not-nullable-knob (!)* to exclude null from a variable's value-set -
>   Applicable on any variable declaration. On nullable variables, the default
>   value is null and, in either encoding-mode (reference or inline), the
>   runtime is free to choose the encoding for the extra bit of information
>   required to represent the null state.
> - *atomic-knob* to avoid tearing - Applicable on shared-mutable
>   declarations, may be used to reverse the effect of the tearable-knob,
>   thereby restoring atomicity.
>
>
> The encoding-mode of a variable is decided at runtime according to this
> ternary expression:
>
> var encodingMode =
>       !valueClass(variable.type)         ? REFERENCE  // value-knob
>     : tooBig(variable.type.bitSize)      ? REFERENCE
>     : !shared(variable)                  ? INLINE     // (§17.4.1)
>     : !zeroValueClass(variable.type)     ? REFERENCE  // zero-knob
>     : final(variable)                    ? INLINE
>     : atomicWrite(variable.type.bitSize) ? INLINE
>     : atomic(variable)                   ? REFERENCE  // atomic-knob
>     : tearableValueClass(variable.type)  ? INLINE     // tearable-knob
>     : REFERENCE;
>
> The variable.type.bitSize depends on nullability as nullable types may
> require more space.
> The predicates tooBig and atomicWrite depend on the hardware. As an
> example, they could be:
>
> boolean tooBig(int bitSize) {return bitSize > 256;}
> boolean atomicWrite(int bitSize) {return bitSize <= 64;}
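
The decision chain above can be turned into a small runnable Java sketch.
This is only one possible reading of the pseudocode: EncodingMode, VarInfo
and EncodingChooser are hypothetical names, the boolean fields simply
mirror the predicates used in the ternary expression, and the thresholds
are the example values given above rather than real hardware limits.

```java
enum EncodingMode { REFERENCE, INLINE }

// Hypothetical description of a variable and of its declared type.
final class VarInfo {
    final boolean valueClass;          // value-knob on the class
    final boolean zeroValueClass;      // zero-knob on the class
    final boolean tearableValueClass;  // tearable-knob on the class
    final boolean shared;              // shared variable (JLS 17.4.1)
    final boolean isFinal;             // variable is final
    final boolean atomic;              // atomic-knob on the variable
    final int bitSize;                 // includes any nullability flag

    VarInfo(boolean valueClass, boolean zeroValueClass, boolean tearableValueClass,
            boolean shared, boolean isFinal, boolean atomic, int bitSize) {
        this.valueClass = valueClass;
        this.zeroValueClass = zeroValueClass;
        this.tearableValueClass = tearableValueClass;
        this.shared = shared;
        this.isFinal = isFinal;
        this.atomic = atomic;
        this.bitSize = bitSize;
    }
}

final class EncodingChooser {
    // Example thresholds from the proposal; real values depend on hardware.
    static boolean tooBig(int bitSize) { return bitSize > 256; }
    static boolean atomicWrite(int bitSize) { return bitSize <= 64; }

    // Same order of tests as the ternary expression in the proposal.
    static EncodingMode choose(VarInfo v) {
        return !v.valueClass          ? EncodingMode.REFERENCE
             : tooBig(v.bitSize)      ? EncodingMode.REFERENCE
             : !v.shared              ? EncodingMode.INLINE
             : !v.zeroValueClass      ? EncodingMode.REFERENCE
             : v.isFinal              ? EncodingMode.INLINE
             : atomicWrite(v.bitSize) ? EncodingMode.INLINE
             : v.atomic               ? EncodingMode.REFERENCE
             : v.tearableValueClass   ? EncodingMode.INLINE
             : EncodingMode.REFERENCE;
    }
}
```

For example, a 128-bit shared-mutable of a tearable zero value-class comes
out as INLINE, and the same variable declared with the atomic-knob comes
out as REFERENCE, which matches the "tearable" columns of the table below.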
>
>
> Table-view of the user-model knobs:
>
> identity            ‖ (identity)            | value                                                                 |
> zeroness            ‖ (no-zero)             | (no-zero)             | zero                                          |
> atomicity           ‖ (atomic)              | (atomic)              | (atomic)              | tearable              |
> nullability         ‖ (?)       | !         | (?)       | !         | (?)       | !         | (?)       | !         |
> =====================================================================================================================
> encoding-mode       ‖ reference             | inline/reference                                                      |
> needs reference     ‖ everywhere            | shared-variables      | no/shared-mutables    | no                    |
> definite-assignment ‖ no        | yes       | no        | yes       | no        | yes       | yes       | yes       |
> default             ‖ null      | n.a.      | null      | n.a.      | null      | n.a.      | n.a.      | n.a.      |
> init-default        ‖ null                  | null                  | null      | zero/null | null      | zero/null |
>
> Notes:
> - tokens in parentheses are the default when no knob is used
> - definite-assignment (JLS §16) means that the compiler enforces (to the
>   best of its ability) variable initialization before usage
> - default is the default-value of a variable when not definitely-assigned
> - init-default is the default-value of a variable before any
>   initialization code runs
> - on non-nullable zero value-classes, the init-default (zero or null)
>   depends on the encoding-mode chosen by the runtime
> - on atomic zero value-classes, reference-encoding is needed on
>   shared-mutables if the instance bitSize cannot be written atomically
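
For comparison, this is how defaults behave in Java today (JLS §4.12.5 and
§16): non-final fields are not subject to definite assignment and silently
start at null or zero, which is the behaviour the init-default row
describes, while an unassigned local variable is rejected by the compiler.
Holder and its members are illustrative names only.

```java
// Illustrative class; both fields are deliberately left unassigned.
final class Holder {
    String ref;   // reference field: starts as null
    long number;  // primitive field: starts as 0

    static String describe() {
        Holder h = new Holder();
        // A local declared as "long local;" and read here without an
        // assignment would not compile: it is not definitely assigned.
        return h.ref + "/" + h.number;
    }
}
```

Holder.describe() returns "null/0", since neither field was ever assigned.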
>
>
> *** Migration of value-based classes ***
>
> Requiring definite-assignment on all non-nullable shared-mutables is
> useful to get rid of missed-initialization-bugs, so I think it's a good
> idea to require it wherever source-compatibility allows.
> In this model, all value-based classes can be migrated to (atomic) zero
> value-classes. Due to definite-assignment, even if LocalDate is migrated to
> a zero value-class, it will be hard to get an accidental "Jan 1, 1970".
> Rational can also be a zero value-class but users will have to keep in mind
> that it's possible to get a zero-denominator Rational, even if the
> constructor throws when we try to build one.
> To maintain source-compatibility, no migrated value-based class can be
> tearable, not even Double or Long, since wherever in existing code we have
> a field declaration such as:
>
> ValueBasedClass v;
>
> v is always reference encoded and, therefore, atomic. For Double and Long,
> this is a bit of an anomaly, because it means that for these two types,
> and for them alone, the two field declarations in each of the following
> pairs will not be semantically equivalent:
>
> long v; // tearable
> Long! v; // atomic
>
> double d; // tearable
> Double! d; // atomic
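
The tearable/atomic distinction for long and double already exists in Java
today: JLS §17.7 allows a write to a non-volatile long or double field to
be treated as two separate 32-bit writes, while volatile long and double
fields are always read and written atomically. A minimal sketch, in which
volatile plays a role loosely comparable to the proposed atomic-knob (the
class and member names are illustrative):

```java
// Illustrative class contrasting the two kinds of 64-bit field.
final class Sample {
    long plain;            // JLS 17.7: a write may be split into two 32-bit halves
    volatile long atomic;  // JLS 17.7: reads and writes are always atomic

    void store(long value) {
        plain = value;   // racing readers may observe a torn value on some JVMs
        atomic = value;  // racing readers see either the old or the new value
    }
}
```

Single-threaded code cannot tell the two fields apart. The difference only
shows up under a data race, and many 64-bit JVMs never tear in practice,
so the sketch documents the guarantee rather than demonstrating a failure.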
>
>
> João Mendonça
>