Nullness markers to enable flattening
Remi Forax
forax at univ-mlv.fr
Wed Feb 8 14:27:04 UTC 2023
> The goal is a general-purpose feature that
> lets programmers express intent about nulls, and that is preserved at runtime
> sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic
> (or compact) class" --> "maximally flattenable storage". There are no "value
> types", and there is no direct control over flattenability.
I would say that "not null" + "zero default value class" is necessary for flattening.
Then "non-atomic class" can help, but there is no guarantee.
> - Nullness is generally enforced at run time, via cooperation between javac and
> JVMs. Methods with null-free parameters can be expected to throw if a null is
> passed in. Null-free storage should reject writes of nulls. (Details to be
> worked out, but as a starting point, imagine 'Q'-typed storage for all types.
> Writes reject nulls. Reads before any writes produce a default value, or if
> none exists, throw.)
The last sentence worries me a little: do we have types that do not have a default value?
Are you referring to the use of 'void' (or any other sentinel) to describe a field that does not exist for a specific parametrization?
---
So we have 3 locations where we need to take care of null-free types:
- null-free fields
- null-free method parameters
- null-free array allocation
In all cases, we have the choice between enforcement by the VM and enforcement by javac. The less the VM does, the more compatible we are, but this can come at the expense of some optimizations.
For null-free fields, given that we need to trap all reads, javac cannot help us here: existing code already accesses fields that will become null-free.
So the VM has to be involved here. If we use TypeRestriction here, we get the bonus of using the same construct for both parametrization and null-free enforcement.
But we may want to use a simpler mechanism first.
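To make the storage semantics quoted above concrete ("Writes reject nulls. Reads before any writes produce a default value, or if none exists, throw."), here is a minimal sketch in plain Java. The class NullFreeSlot and its methods are hypothetical names made up for illustration; this only models the behavior in library code and is not how a VM would implement it.

```java
import java.util.Objects;

// Hypothetical model of one null-free storage location (not a JDK or JVM API).
final class NullFreeSlot<T> {
    private final T defaultValue; // null here means "this type has no default value"
    private T value;              // null here means "not written yet"

    NullFreeSlot(T defaultValue) {
        this.defaultValue = defaultValue;
    }

    // Writes reject nulls.
    void write(T newValue) {
        value = Objects.requireNonNull(newValue, "null written to null-free storage");
    }

    // Reads before any write produce the default value, or throw if none exists.
    T read() {
        if (value != null) {
            return value;
        }
        if (defaultValue != null) {
            return defaultValue;
        }
        throw new IllegalStateException("read before write, and no default value");
    }

    public static void main(String[] args) {
        NullFreeSlot<Integer> withDefault = new NullFreeSlot<>(0);
        System.out.println(withDefault.read()); // nothing written yet: the default
        withDefault.write(42);
        System.out.println(withDefault.read()); // the written value
    }
}
```

The second case (no default value) is the one my question above is about: in this model it corresponds to constructing the slot with a null default, so a read before any write throws.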
For null-free method parameters, there are two sub-items: null-free parameters and null-free return types.
For the former, javac can insert the corresponding Objects.requireNonNull() and either have a specific new attribute or re-use Signature to handle separate compilation.
The problem with that is that the VM cannot trust such an attribute and, as a result, may have to box all null-free zero-default values. Otherwise, it means that we also need TypeRestriction at the method level.
For the latter, null-free return types, we can choose to have javac insert the requireNonNull at call sites (like with generics), which requires recompiling the call sites to see the effect, or to have it introduce a requireNonNull() before each "areturn" instruction.
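As a rough illustration of the javac-only option for methods, the sketch below writes by hand the checks that javac could insert for a method whose parameter and return type are both null-free. The class and method names are hypothetical, and the 'String!' syntax in the comment is the marker syntax from the proposal, not valid Java today.

```java
import java.util.Objects;

// Hand-written stand-in for what javac might generate; all names are hypothetical.
final class NullFreeMethodSketch {

    // Intended source-level declaration (proposal syntax): String! greet(String! name)
    static String greet(String name) {
        // Check javac could insert on method entry for the null-free parameter.
        Objects.requireNonNull(name, "name");
        String result = lookupGreeting(name);
        // Check javac could insert just before the areturn for the null-free return type.
        return Objects.requireNonNull(result, "return value");
    }

    // Stand-in for a method body that may produce null.
    static String lookupGreeting(String name) {
        return name.isEmpty() ? null : "hello " + name;
    }

    public static void main(String[] args) {
        System.out.println(greet("valhalla"));
    }
}
```

With the check placed before the areturn, existing call sites see the new behavior without being recompiled; with the call-site variant, the same requireNonNull would appear in each caller instead.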
For null-free array allocation, using the VM does not seem to have any benefit compared to using a specific static method like Array.newInstance() (I believe it's better to introduce a variation of Array.newInstance, given that we want to return an array of Object and not an Object). The secondary type of a value type can be obtained using a constant dynamic.
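For illustration only, such a variation of Array.newInstance could look like the sketch below. The name newNullFreeInstance and its signature are assumptions, and passing the default instance explicitly is a simplification; the sketch only covers the "initialized at birth" part and does not make later stores of null fail, which would still need real null-free array support.

```java
import java.lang.reflect.Array;
import java.util.Arrays;
import java.util.Objects;

// Hypothetical helper; neither the class nor the method exists in the JDK.
final class NullFreeArrays {

    // Returns an array (not a plain Object) whose elements all start out non-null.
    @SuppressWarnings("unchecked")
    static <T> T[] newNullFreeInstance(Class<T> componentType, int length, T defaultValue) {
        Objects.requireNonNull(defaultValue, "defaultValue");
        if (componentType.isPrimitive()) {
            // A primitive array is not an Object[], so the cast below would fail.
            throw new IllegalArgumentException("reference component type expected");
        }
        T[] array = (T[]) Array.newInstance(componentType, length);
        Arrays.fill(array, defaultValue); // no element is ever observed as null
        return array;
    }

    public static void main(String[] args) {
        Integer[] zeros = newNullFreeInstance(Integer.class, 3, 0);
        System.out.println(Arrays.toString(zeros));
    }
}
```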
Rémi
----- Original Message -----
> From: "daniel smith" <daniel.smith at oracle.com>
> To: "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Tuesday, February 7, 2023 2:26:42 AM
> Subject: Nullness markers to enable flattening
> A quick review:
>
> The Value Objects feature (see https://openjdk.org/jeps/8277163) captures the
> Valhalla project's central idea: that objects don't have to have identity, and
> if programmers opt out of identity, JVMs can provide optimizations comparable
> to primitive performance.
>
> However, one important implementation technique is not supported by that JEP:
> maximally flattened heap storage. ("Maximally flattened" as in "just the bits
> necessary to encode an instance".) This is because flattened fields and arrays
> store an object's field values directly, and so 1) need to be initialized "at
> birth" to a non-null class instance, 2) may not store null, and 3) may be
> updated non-atomically. These are semantics that need to be surfaced in the
> language model.
>
> We've tackled (3) by allowing value classes to be declared non-atomic
> (syntax/limitations subject to bikeshedding), and then claiming by fiat that
> fields/arrays of such classes are tearing risks. Races are rare enough that
> this doesn't really call for a use-site opt-in, and we don't necessarily need
> any deeper explanation for how new objects derived from random combinations of
> old objects can be created by a read operation. That's just how it works.
> <shrug>
>
> We also allow value classes to declare that they support an all-zeros default
> instance (again, subject to bikeshedding). You could imagine similarly claiming
> that fields/arrays of these classes are null-hostile, as a side effect of how
> their storage works. But this is an idiosyncrasy that is going to affect a lot
> more programmers, and "that's just how it works" is pretty unsatisfactory.
> Sometimes programs count on being able to use 'null' in their computation. We
> need something in the language model to let programs opt in/out of nulls at the
> use site, and thus opt out/in of maximally flattenable heap storage.
>
> We've long discussed "reference type" vs. "value type" as the language concept
> that captures this distinction. But where we once had a long list of
> differences between references and values, most of those have gone away.
> Notably, it's *not* useful for performance intuitions to imagine that
> references are pointers and values are inline. Value objects get inlined when
> the JVM wants to do so. Reference-ness is not relevant.
>
> Really, for most programmers, nullness is all that distinguishes a "reference
> type" from a "value type".
>
> Meanwhile, expressing nullness is not a problem unique to Valhalla. Whether a
> variable is meant to store nulls is probably the most important property of
> most programs that isn't expressible in the language. Workarounds include
> informal javadoc specifications, type annotations (as explored by JSpecify),
> lots of 'Objects.requireNonNull' calls, and blanket "if you pass in a null, you
> might get an NPE" policies.
>
> In Amber, pattern matching has its own problems with nullness: there are a lot
> of ad hoc rules to distinguish between "is this a non-null instance of class
> Foo?" vs. "is this null *or* an instance of class Foo?", because there's no
> good way to express those two queries as explicitly different.
>
> ---
>
> To address these problems, we've been exploring nullness markers as an
> alternative to '.val' and '.ref'. The goal is a general-purpose feature that
> lets programmers express intent about nulls, and that is preserved at runtime
> sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic
> (or compact) class" --> "maximally flattenable storage". There are no "value
> types", and there is no direct control over flattenability.
>
> (A lot of these ideas build on what JSpecify has done, so appreciation to them
> for the good work and useful documentation.)
>
> Some key ideas:
>
> - Nullness is an *optional* property of variables/expressions/etc., distinct
> from types. If the program doesn't say what kind of nullness a variable has,
> and it can't be inferred, the nullness is "unspecified". (Interpreted as "might
> be null, but the programmer hasn't told us if that's their intent".)
> Variables/expressions with unspecified nullness continue to behave the way they
> always have.
>
> - Because nullness is distinct from types, it shouldn't impact type checking
> rules, subtyping, overriding, conversions, etc. Nullness has its own analysis,
> subject to its own errors/warnings. The precise error/warning conditions
> haven't been fleshed out, but our bias is towards minimal intrusion—we don't
> want to make it hard to adopt these features in targeted ways.
>
> - That said, *type expressions* (the syntax in programs that expresses a type)
> are closely intertwined with *nullness markers*. 'Foo!' refers to a non-null
> Foo, and 'Foo?' refers to a Foo or null. And nullness is an optional property
> of type arguments, type variable bounds, and array components. Nullness markers
> are the way programmers express their intent to the compiler's nullness
> analysis.
>
> - Nullness may also be implicit. Catch parameters and pattern variables are
> always non-null. Lots of expressions have '!' nullness, and the null literal
> has '?' nullness. Local variables get their nullness from their initializers.
> Control flow analysis can infer properties of a variable based on its uses.
>
> - There are features that change the default interpretation of the nullness of
> class names. This is still pretty open-ended. Perhaps certain classes can be
> declared (explicitly or implicitly) null-free by default (e.g., 'Point' is
> implicitly 'Point!'). Perhaps a compilation-unit- or module- level directive
> says that all unadorned types should be interpreted as '!'. Programs can be
> usefully written without these convenience features, but for programmers who
> want to widely adopt nullness, it will be important to get away from
> "unspecified" as the default everywhere.
>
> - Nullness is generally enforced at run time, via cooperation between javac and
> JVMs. Methods with null-free parameters can be expected to throw if a null is
> passed in. Null-free storage should reject writes of nulls. (Details to be
> worked out, but as a starting point, imagine 'Q'-typed storage for all types.
> Writes reject nulls. Reads before any writes produce a default value, or if
> none exists, throw.)
>
> - Type variable types have nullness, too. Besides 'T!' and 'T?', there's also a
> "parametric" 'T*' that represents "whatever nullness is provided by the type
> argument". (Again, some room for choosing the default interpretation of bare
> 'T'; unspecified nullness is useful for type variables as well.) Nullness of
> type arguments is inferred along with the types; when both a type argument and
> its bound have nullness, bounds checks are based on '!' <: '*' <: '?'. Generics
> are erased for now, but in the future '!' type arguments will be reified, and
> specialized classes will provide the expected runtime behaviors.
>
> There are, of course, a lot of details behind these points. But hopefully this
> provides a good high-level introduction.
>
> A worry in taking on extra features like this is that we'll get distracted from
> our primary goal, which is to support maximally flattened storage of value
> objects. But I think it feels manageable, and it's certainly a lot more useful
> than the sort of targeted usage of '.val' we were thinking about before.
>
> Our main tasks for delivering a feature include:
> - Work out the declaration syntax/class file encoding for opting in to
> non-atomic-ness and default instances
> - Implement nullness markers and some analysis/diagnostics in javac
> - Provide a language spec for the parts of the analysis standardized in the
> language
> - Settle on a class file format and division of responsibility for runtime
> behaviors
> - Implement some targeted new JVM behaviors; use nullness as a signal for
> flattening
> - Design/implement how nullness is exposed by reflection
>
> For the future, we'll want to:
> - Anticipate how a "change the defaults" feature will work
> - Consider the interaction of nullness with Amber features
> - Think about how runtime nullness interacts with specialization and type
> restrictions