JVM alternatives for supporting nullable value types

Thu Sep 13 01:26:40 UTC 2018

Thanks Daniel!

There is another variation of the last semantics: instead of allowing the type notation on method parameters, you have a boolean in the ValueTypes attribute that indicates whether the value type is nullable or not. So you are only allowed to decide class-wide whether a value type is nullable; in terms of syntax, it's equivalent to allowing !, ? and ~ only when importing the type (I know that you do not have to import a type, it's just to explain how it works).

Nullability notations on types (using class-wide side notations)
----------------------------------------------------------------

In this approach, we use regular L types to represent value types, and these types are nullable by default. To indicate that a particular field, array, or parameter/return is null-free, some form of side notation is used in the ValueTypes attribute.

JVM implications

- use the attribute ValueTypes to encode nullable-ness notations
- The default value of a field/array depends on whether the "null-free" notation is used
- Fields, arrays, and method parameters and returns that are null-free can be flattened/scalarized
- Stack variables may generally be null, unless a static analysis proves otherwise
- A putfield, putstatic, aastore, or method invocation may fail with an NPE (or maybe ASE)
- Method overriding allows nullability mismatches; calls must be able to dynamically adapt (e.g., through multiple v-table entries and VM-generated bridges)
- Types marked null-free are allowed to be loaded early (e.g., to decide on field layout)
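To make the class-wide flag a bit more concrete, here is a minimal sketch in plain Java of how the default value of a field or array element could follow from a per-class boolean. Everything in it is hypothetical: ValueTypesSketch, the VALUE_TYPES map and initialValue are illustration names only, and a map stands in for whatever encoding the ValueTypes attribute would really use.

```java
import java.util.Map;

// Hypothetical model of a class-wide ValueTypes attribute: each listed value
// class carries one boolean saying whether it is null-free everywhere in the
// class file. This is an illustration, not a real class-file API.
public class ValueTypesSketch {
    // Stand-in for a value class; DEFAULT plays the role of the all-zero value.
    static final class Point {
        static final Point DEFAULT = new Point(0, 0);
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // className -> nullFree flag, as the attribute might record it (assumption).
    static final Map<String, Boolean> VALUE_TYPES = Map.of("Point", true);

    // Initial content of a field or array element of the given type:
    // the default instance when the type is flagged null-free, null otherwise.
    static Object initialValue(String className) {
        boolean nullFree = VALUE_TYPES.getOrDefault(className, false);
        return nullFree ? defaultInstance(className) : null;
    }

    static Object defaultInstance(String className) {
        if (className.equals("Point")) {
            return Point.DEFAULT;
        }
        throw new IllegalArgumentException("unknown value class: " + className);
    }

    public static void main(String[] args) {
        // Flagged null-free: starts out as the default instance.
        System.out.println(initialValue("Point") == Point.DEFAULT);
        // Not listed in the attribute: an ordinary nullable reference.
        System.out.println(initialValue("String") == null);
    }
}
```

The point of the sketch is only that the decision is made once per class name in the class file, not per field or per parameter.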

Compilation strategy

Where '*' represents a side notation that a type is null-free:

Val? maps to LVal;
Val~ maps to LVal;
Val! maps to LVal;*

Nullability conversions are no-ops; null-free conversions are either compiled to explicit null checks or are implicit in an invoke*/getfield/putfield.
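As a rough illustration of the two kinds of conversion, here is a sketch in today's Java, where Val is an ordinary class standing in for a value class and the method names are made up for the example. The nullability conversion emits nothing, because both sides share the LVal; descriptor; the null-free conversion is modelled as an explicit check with Objects.requireNonNull, which is one possible way a compiler could spell it.

```java
import java.util.Objects;

// Sketch only: Val is a plain class standing in for a value class, and the
// two helper methods are hypothetical names for what javac might emit.
public class NullFreeConversionSketch {
    static final class Val {
        final int payload;
        Val(int payload) { this.payload = payload; }
    }

    // Nullability conversion (between Val? and Val~): same descriptor on both
    // sides, so no code is needed and null passes through untouched.
    static Val nullabilityConversion(Val v) {
        return v;
    }

    // Null-free conversion (Val? or Val~ to Val!): an explicit null check,
    // failing with an NPE when the value is null.
    static Val nullFreeConversion(Val v) {
        return Objects.requireNonNull(v, "null-free conversion applied to null");
    }

    public static void main(String[] args) {
        Val v = new Val(42);
        System.out.println(nullFreeConversion(v).payload);
        System.out.println(nullabilityConversion(null) == null);
        try {
            nullFreeConversion(null);
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```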

Language implications

- Null-free value types typically get flattened storage and scalarized invocations
- Array store runtime checks may include a null check
- Methods may not be overloaded on different nullabilities of the same type
- Pollution of null-free variables, arrays, or parameters/returns is impossible
- A conversion from Val~[] to Val![] could be supported, but the result would not perform the expected runtime checks
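The array points can be sketched as follows, assuming (this is my assumption, not something decided) that null-freeness is fixed when the array is created and checked on every store. ValArray, store and asNullFree are hypothetical names; a real VM would do this inside aastore rather than in a wrapper class.

```java
import java.util.Arrays;

// Sketch only: ValArray models an array whose null-freeness is decided at
// creation time. All names are illustrative, not a proposed API.
public class ArrayStoreSketch {
    static final class Val { }

    static final Val DEFAULT = new Val();

    static final class ValArray {
        final boolean nullFree;   // fixed when the array is created
        final Val[] elements;

        ValArray(int length, boolean nullFree) {
            this.nullFree = nullFree;
            this.elements = new Val[length];
            if (nullFree) {
                // A null-free array starts out filled with the default value.
                Arrays.fill(elements, DEFAULT);
            }
        }

        // Models aastore: the null check only happens for null-free arrays.
        void store(int index, Val value) {
            if (nullFree && value == null) {
                throw new NullPointerException("null stored into null-free array");
            }
            elements[index] = value;
        }
    }

    // Models a Val~[] to Val![] conversion: a purely static relabeling, the
    // array object itself still does not check for null on store.
    static ValArray asNullFree(ValArray array) {
        return array;
    }

    public static void main(String[] args) {
        ValArray converted = asNullFree(new ValArray(1, false));
        converted.store(0, null);                      // no check is performed
        System.out.println(converted.elements[0] == null);
        try {
            new ValArray(1, true).store(0, null);      // genuine null-free array
        } catch (NullPointerException e) {
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

So the converted array has the static type of a null-free array but keeps the runtime behaviour of the nullable array it was created as, which is what "would not perform the expected runtime checks" means here.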

Migration implications

- Refactoring a class to be a value class is a binary compatible change (except where this involves incompatible changes like removing a public constructor); before recompilation, treatment of nulls does not change
- Changing the nullability of a type is a binary compatible change; library clients who expect a nullable API may see surprising NPEs or ASEs
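The second point can be illustrated with a small sketch: a client written against a nullable API keeps linking after the library flips the flag, but starts failing at run time. LibraryV1, LibraryV2 and client are hypothetical names, and the requireNonNull call in LibraryV2 only models the check the VM would perform on invocation once Val is marked null-free.

```java
import java.util.Objects;

// Sketch only: two versions of a hypothetical library and a client that was
// compiled against the first one and never recompiled.
public class MigrationSketch {
    static final class Val { }

    interface Library {
        String describe(Val v);
    }

    // Before the change: Val is nullable, so null is an accepted argument.
    static final class LibraryV1 implements Library {
        public String describe(Val v) {
            return v == null ? "absent" : "present";
        }
    }

    // After the change: Val is flagged null-free. The explicit check stands in
    // for the null check the VM would do when the method is invoked.
    static final class LibraryV2 implements Library {
        public String describe(Val v) {
            Objects.requireNonNull(v, "Val is null-free");
            return "present";
        }
    }

    // Old client code: still links against both versions, still passes null.
    static String client(Library library) {
        return library.describe(null);
    }

    public static void main(String[] args) {
        System.out.println(client(new LibraryV1()));
        try {
            client(new LibraryV2());
        } catch (NullPointerException e) {
            System.out.println("surprising NPE: " + e.getMessage());
        }
    }
}
```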

Rémi

----- Original Message -----
> From: "daniel smith" <daniel.smith at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, September 13, 2018 01:46:31
> Subject: JVM alternatives for supporting nullable value types

> For LW10, one of our goals is to support interactions between value types and
> erased generics by having some form of a nullable value type.
> 
> The needs of the language factor heavily into the JVM design. We're not ready to
> commit to language-level details, but it's likely that the language will
> support nullable and non-nullable variations of the types declared by value
> classes; and these variations will probably be supported in most places that
> types can appear.
> 
> More generally, the language may support up to three different flavors of
> nullability on some or all types:
> - null-free: a type that does not include null (could be spelled Foo!)
> - null-permitting: a type that allows but ignores nulls (could be spelled Foo~)
> - null-checked: a type that allows and checks for nulls (could be spelled Foo?)
> 
> (Please note that this is placeholder syntax. There are lots of ways to map this
> to real syntax. Unadorned names will map to one of these; it's possible that
> migrating a class to be a value class will change the interpretation of its
> unadorned name.)
> 
> Null-permitting and null-checked types are both "nullable"; the difference is in
> how strongly the compiler enforces null checks. ("Null-permitting" is the
> existing behavior for types like 'String'; "null-checked" is the style that
> requires proof that nulls are absent before dereferencing.)
> 
> The other important concept from the language is conversions:
> - A widening conversion (or something similar) supports treating a value of a
> null-free type as null-permitting or null-checked
> - A "null-free conversion" is required to go in the opposite direction, and
> includes a runtime null check
> - A "nullability conversion", like an unchecked conversion, might allow other
> forms of conversions between types involving different nullabilities, including
> in their type arguments or array component type.
> 
> Turning to the JVM with those language-level concepts in mind, I've put together
> the following summary of four main designs we've considered. The goal here is
> not to reach a conclusion about which path is best, but to make sure we're
> accurately considering all of the implications in each case.
> 
> 
> Nullable value types, null-free storage
> ---------------------------------------
> 
> In this approach, we use regular L types to represent value types, and these
> types are nullable. Fields and arrays, via some sort of modifier, may choose to
> be nullable or null-free.
> 
> 
> JVM implications
> 
> - Need a mechanism (new opcode?) to indicate that an array allocation is
> null-free
> - The default value of a field/array depends on whether the "null-free" modifier
> is used
> - Fields and arrays that are marked null-free can, of course, be flattened
> - Stack variables and method parameters/returns may always be null
> - A putfield, putstatic, or aastore may fail with an NPE (or maybe ASE)
> - JIT can optimistically assume no nulls and scalarize, but must check and
> de-opt when a null is encountered
> - The "null-free" modifier is only allowed with value class types, and must be
> validated early (e.g., to decide on field layout)
> 
> 
> Compilation strategy
> 
> Val? maps to LVal;
> Val~ maps to LVal;
> Val! maps to LVal;
> 
> The nullability of the type in a field declaration or array creation expression
> determines whether the "null-free" modifier is used or not.
> 
> Nullability conversions are no-ops; null-free conversions are either compiled to
> explicit null checks or are implicit in an invoke*/getfield/putfield.
> 
> 
> Language implications
> 
> - Null-free value types typically get flattened storage and scalarized
> invocations
> - Array store runtime checks may include a null check
> - Methods may not be overloaded on different nullabilities of the same type
> - Null-free parameters/returns may be polluted with nulls due to inconsistent
> compilation or non-Java interop—detected with an NPE on storage or dereference
> - A conversion from Val~[] to Val![] could be supported, but the result would
> not perform the expected runtime checks
> 
> 
> Migration implications
> 
> - Refactoring a class to be a value class is a binary compatible change (except
> where this involves incompatible changes like removing a public constructor);
> before recompilation (which may reinterpret some unadorned names), treatment of
> nulls does not change
> - Changing the nullability of a type is a binary compatible change; library
> clients who expect nullable storage may see surprising NPEs or ASEs
> 
> 
> 
> Always null-free value types
> ----------------------------
> 
> In this approach, we use regular L types to represent value types, and these
> types are null-free. Non-value L types continue to be nullable. A use-site
> attribute tracks which class names represent value classes; validation lazily
> ensures consistency with the declaration.
> 
> 
> JVM implications
> 
> - Fields, arrays, and method parameters and returns with value class types can
> be flattened/scalarized
> - The 'null' verification type is not a subtype of any value class types
> - Casts to value class types must fail on 'null' (CCE or NPE)
> - At method preparation, field/method resolution, and class loading, a check
> similar to class loader constraints ensures that classes agree on value classes
> in the descriptor
> - Various other vectors for getting data into the JVM should prevent nulls, or
> have contracts that allow crashing, etc., if data is corrupted
> - Classes in the value classes attribute are allowed to be loaded early (e.g.,
> to decide on field layout)
> - If the value classes attribute does not mention a value class, it's possible
> for variables/fields of that type to be null, but an error will occur when an
> attempt is made to load the class or resolve against a class that disagrees
> 
> 
> Compilation strategy
> 
> Val? maps to Ljava/lang/Object;
> Val~ maps to Ljava/lang/Object;
> Val! maps to LVal;
> 
> Every referenced value class is listed in the value classes attribute.
> 
> Nullability conversions are no-ops; null-free conversions are compiled to
> checkcasts (even for member access). Casts that target Val?/Val~ compile to a
> checkcast guarded by a null check, where null always succeeds.
> 
> 
> Language implications
> 
> - Null-free value types typically get flattened storage and scalarized
> invocations
> - Array store runtime checks may include a null check
> - Val~[] and Val?[] do not perform array store checks at all—any Object may end
> up polluting these arrays (creating arrays of these types might be treated as
> an error, like T[])
> - Val~ and Val? are overloading-hostile: their use in signatures conflicts with
> Object and all other null-permitting/null-checked value types
> - Null-permitting/null-checked value type parameters and returns may be polluted
> with other types due to inconsistent compilation or non-Java interop—detected
> with a CCE on null-free conversion
> - A conversion from Val~[] to Val![] cannot be allowed
> 
> 
> Migration implications
> 
> - Refactoring a class to be a value class is a binary incompatible change due to
> inconsistent value class attributes
> - Changing from a null-permitting/null-checked to null-free type (or vice versa)
> is a binary incompatible change unless there's some form of support for type
> migrations
> 
> 
> 
> Null-free types with new descriptors
> ------------------------------------
> 
> In this approach, we use regular L types to represent nullable value types, and
> introduce other types (spelled, say, with a "K") to represent null-free value
> types. K types are subtypes of L types, and casts can be used to convert from L
> to K.
> 
> 
> JVM implications
> 
> - Descriptor syntax needs to support 'K'
> - To support K casts, we need ClassRefs that indicate K-ness, a new opcode, or
> some other mechanism
> - Fields, arrays, and method parameters and returns with K types can be
> flattened/scalarized
> - The 'null' verification type is not a subtype of K types
> - Casts to K types must fail on 'null'
> - Various other vectors for getting data into the JVM should prevent nulls, or
> have contracts that allow crashing, etc., if data is corrupted
> - Classes named by K types are allowed to be loaded early (e.g., to decide on
> field layout)
> 
> 
> Compilation strategy
> 
> Val? maps to LVal;
> Val~ maps to LVal;
> Val! maps to KVal;
> 
> Nullability conversions are no-ops; null-free conversions are either compiled to
> explicit casts or are implicit in an invoke*/getfield/putfield.
> 
> 
> Language implications
> 
> - Null-free value types typically get flattened storage and scalarized
> invocations
> - Array store runtime checks may include a null check
> - Methods may be overloaded with a null-free type vs. a
> null-permitting/null-checked type (but null-permitting vs. null-checked is not
> allowed)
> - Pollution of null-free variables or arrays is impossible
> - A conversion from Val~[] to Val![] cannot be allowed
> 
> 
> Migration implications
> 
> - Refactoring a class to be a value class is a binary compatible change (except
> where this involves incompatible changes like removing a public constructor);
> before recompilation (which may reinterpret some unadorned names), treatment of
> nulls does not change
> - Changing from a null-permitting/null-checked to null-free type (or vice
> versa) is a binary incompatible change unless there's some form of support for
> type migrations
> 
> 
> 
> Nullability notations on types
> ------------------------------
> 
> In this approach, we use regular L types to represent value types, and these
> types are nullable by default. To indicate that a particular field, array, or
> parameter/return is null-free, some form of side notation is used.
> (Deliberately using the word "notation" rather than "annotation" or "modifier"
> here to avoid committing to an encoding.)
> 
> This is similar to "nullable value types, null-free storage", except that the
> null-free notation can be used on method parameters/returns.
> 
> This is similar to "always null-free value types", except that instead of
> tracking value classes in each class file, we track null-free value types per
> use site.
> 
> This is similar to "null-free types with new descriptors", except that the
> notations are not part of descriptors and don't require any explicit
> conversions—they are not part of the verification type system.
> 
> 
> JVM implications
> 
> - Need a mechanism to encode notations, both for descriptors and for array
> creations
> - The default value of a field/array depends on whether the "null-free" notation
> is used
> - Fields, arrays, and method parameters and returns that are marked null-free
> can be flattened/scalarized
> - Stack variables may generally be null, unless a static analysis proves
> otherwise
> - A putfield, putstatic, aastore, or method invocation may fail with an NPE (or
> maybe ASE)
> - Method overriding allows nullability mismatches; calls must be able to
> dynamically adapt (e.g., through multiple v-table entries and VM-generated
> bridges)
> - Types marked null-free are allowed to be loaded early (e.g., to decide on
> field layout)
> 
> 
> Compilation strategy
> 
> Where '*' represents a side notation that a type is null-free:
> 
> Val? maps to LVal;
> Val~ maps to LVal;
> Val! maps to LVal;*
> 
> Nullability conversions are no-ops; null-free conversions are either compiled to
> explicit null checks or are implicit in an invoke*/getfield/putfield.
> 
> 
> Language implications
> 
> - Null-free value types typically get flattened storage and scalarized
> invocations
> - Array store runtime checks may include a null check
> - Methods may not be overloaded on different nullabilities of the same type
> - Pollution of null-free variables, arrays, or parameters/returns is impossible
> - A conversion from Val~[] to Val![] could be supported, but the result would
> not perform the expected runtime checks
> 
> 
> Migration implications
> 
> 
> - Refactoring a class to be a value class is a binary compatible change (except
> where this involves incompatible changes like removing a public constructor);
> before recompilation, treatment of nulls does not change
> - Changing the nullability of a type is a binary compatible change; library
> clients who expect a nullable API may see surprising NPEs or ASEs