From forax at univ-mlv.fr Mon Feb 6 09:26:31 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 6 Feb 2023 10:26:31 +0100 (CET) Subject: Valhalla super class/interfaces and bridges Message-ID: <870810805.12807573.1675675591174.JavaMail.zimbra@u-pem.fr> Hello all, I'm trying to implement the parametric VM spec and i think there is an issue with the compatibility of classes that contains a bridges. But first, the issue, because of separate compilation, it's possible to have one class implementing the same interface parametrized with different parameters. By example, those classes are compiled together class A implements I, J {} interface I {} interface J {} Then 'I' or 'J' are modified and re-compiled later interface I extends Comparable {} interface J extends Comparable {} and 'A' can be re-compiled to see either 'I' or 'J' modified. If 'I' is modified and 'A' is re-compiled, 'A' contains a bridge compareTo(Object) -> compareTo(String) and if 'J' is modified and 'A' is re-compiled, 'A' contains a bridge compareTo(Object) -> compareTo(Integer). In fact, a bridge is a witness of a specialization of the interface at the time of compilation. There are two issues: - a class can have the same interface with several different specializations, obviously, if the code was compiled with everything together, it will not compile, but due to separate compilation, this is an issue. - a class can contain a bridge even if the interface it depends on is not specialized anymore. The questions is what we should do in those cases, given that i think that we should not try to have a different semantics than the existing one. By example, throwing an error at runtime if there are two specializations of the same interface seems a bad idea. One interesting idea is that if the compiler generates a bridge, it should also record the specialization that generates that bridge in a side table, so the decision to generate a bridge at compile time is available for the parametric VM at runtime. regards, R?mi From daniel.smith at oracle.com Tue Feb 7 01:26:42 2023 From: daniel.smith at oracle.com (Dan Smith) Date: Tue, 7 Feb 2023 01:26:42 +0000 Subject: Nullness markers to enable flattening Message-ID: A quick review: The Value Objects feature (see https://openjdk.org/jeps/8277163) captures the Valhalla project's central idea: that objects don't have to have identity, and if programmers opt out of identity, JVMs can provide optimizations comparable to primitive performance. However, one important implementation technique is not supported by that JEP: maximally flattened heap storage. ("Maximally flattened" as in "just the bits necessary to encode an instance".) This is because flattened fields and arrays store an object's field values directly, and so 1) need to be initialized "at birth" to a non-null class instance, 2) may not store null, and 3) may by updated non-atomically. These are semantics that need to be surfaced in the language model. We've tackled (3) by allowing value classes to be declared non-atomic (syntax/limitations subject to bikeshedding), and then claiming by fiat that fields/arrays of such classes are tearing risks. Races are rare enough that this doesn't really call for a use-site opt-in, and we don't necessarily need any deeper explanation for how new objects derived from random combinations of old objects can be created by a read operation. That's just how it works. We also allow value classes to declare that they support an all-zeros default instance (again, subject to bikeshedding). You could imagine similarly claiming that fields/arrays of these classes are null-hostile, as a side effect of how their storage works. But this is an idiosyncrasy that is going to affect a lot more programmers, and "that's just how it works" is pretty unsatisfactory. Sometimes programs count on being able to use 'null' in their computation. We need something in the language model to let programs opt in/out of nulls at the use site, and thus opt out/in of maximally flattenable heap storage. We've long discussed "reference type" vs. "value type" as the language concept that captures this distinction. But where we once had a long list of differences between references and values, most of those have gone away. Notably, it's *not* useful for performance intuitions to imagine that references are pointers and values are inline. Value objects get inlined when the JVM want to do so. Reference-ness is not relevant. Really, for most programmers, nullness is all that distinguishes a "reference type" from a "value type". Meanwhile, expressing nullness is not a problem unique to Valhalla. Whether a variable is meant to store nulls is probably the most important property of most programs that isn't expressible in the language. Workarounds include informal javadoc specifications, type annotations (as explored by JSpecify), lots of 'Objects.requireNonNull' calls, and blanket "if you pass in a null, you might get an NPE" policies. In Amber, pattern matching has its own problems with nullness: there are a lot of ad hoc rules to distinguish between "is this a non-null instance of class Foo?" vs. "is this null *or* an instance of class Foo?", because there's no good way to express those two queries as explicitly different. --- To address these problems, we've been exploring nullness markers as an alternative to '.val' and '.ref'. The goal is a general-purpose feature that lets programmers express intent about nulls, and that is preserved at runtime sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic (or compact) class" --> "maximally flattenable storage". There are no "value types", and there is no direct control over flattenability. (A lot of these ideas build on what JSpecify has done, so appreciation to them for the good work and useful documentation.) Some key ideas: - Nullness is an *optional* property of variables/expressions/etc., distinct from types. If the program doesn't say what kind of nullness a variable has, and it can't be inferred, the nullness is "unspecified". (Interpreted as "might be null, but the programmer hasn't told us if that's their intent".) Variables/expressions with unspecified nullness continue to behave the way they always have. - Because nullness is distinct from types, it shouldn't impact type checking rules, subtyping, overriding, conversions, etc. Nullness has its own analysis, subject to its own errors/warnings. The precise error/warning conditions haven't been fleshed out, but our bias is towards minimal intrusion?we don't want to make it hard to adopt these features in targeted ways. - That said, *type expressions* (the syntax in programs that expresses a type) are closely intertwined with *nullness markers*. 'Foo!' refers to a non-null Foo, and 'Foo?' refers to a Foo or null. And nullness is an optional property of type arguments, type variable bounds, and array components. Nullness markers are the way programmers express their intent to the compiler's nullness analysis. - Nullness may also be implicit. Catch parameters and pattern variables are always non-null. Lots of expressions have '!' nullness, and the null literal has '?' nullness. Local variables get their nullness from their initializers. Control flow analysis can infer properties of a variable based on its uses. - There are features that change the default interpretation of the nullness of class names. This is still pretty open-ended. Perhaps certain classes can be declared (explicitly or implicitly) null-free by default (e.g., 'Point' is implicitly 'Point!'). Perhaps a compilation-unit- or module- level directive says that all unadorned types should be interpreted as '!'. Programs can be usefully written without these convenience features, but for programmers who want to widely adopt nullness, it will be important to get away from "unspecified" as the default everywhere. - Nullness is generally enforced at run time, via cooperation between javac and JVMs. Methods with null-free parameters can be expected to throw if a null is passed in. Null-free storage should reject writes of nulls. (Details to be worked out, but as a starting point, imagine 'Q'-typed storage for all types. Writes reject nulls. Reads before any writes produce a default value, or if none exists, throw.) - Type variable types have nullness, too. Besides 'T!' and 'T?', there's also a "parametric" 'T*' that represents "whatever nullness is provided by the type argument". (Again, some room for choosing the default interpretation of bare 'T'; unspecified nullness is useful for type variables as well.) Nullness of type arguments is inferred along with the types; when both a type argument and its bound have nullness, bounds checks are based on '!' <: '*' <: '?'. Generics are erased for now, but in the future '!' type arguments will be reified, and specialized classes will provide the expected runtime behaviors. There are, of course, a lot of details behind these points. But hopefully this provides a good high-level introduction. A worry in taking on extra features like this is that we'll get distracted from our primary goal, which is to support maximally flattened storage of value objects. But I think it feels manageable, and it's certainly a lot more useful than the sort of targeted usage of '.val' we were thinking about before. Our main tasks for delivering a feature include: - Work out the declaration syntax/class file encoding for opting in to non-atomic-ness and default instances - Implement nullness markers and some analysis/diagnostics in javac - Provide a language spec for the parts of the analysis standardized in the language - Settle on a class file format and division of responsibility for runtime behaviors - Implement some targeted new JVM behaviors; use nullness as a signal for flattening - Design/implement how nullness is exposed by reflection For the future, we'll want to: - Anticipate how a "change the defaults" feature will work - Consider the interaction of nullness with Amber features - Think about how runtime nullness interacts with specialization and type restrictions From kevinb at google.com Tue Feb 7 18:50:08 2023 From: kevinb at google.com (Kevin Bourrillion) Date: Tue, 7 Feb 2023 10:50:08 -0800 Subject: Nullness markers to enable flattening In-Reply-To: References: Message-ID: As the person with a foot in both Valhalla and JSpecify camps, and who is obsessed with keeping Java as conceptually simple as it can be, as long as this all works out I'll consider it to be a major win. On Mon, Feb 6, 2023 at 5:26 PM Dan Smith wrote: - Because nullness is distinct from types, it shouldn't impact type > checking rules, subtyping, overriding, conversions, etc. Nullness has its > own analysis, subject to its own errors/warnings. The precise error/warning > conditions haven't been fleshed out, but our bias is towards minimal > intrusion?we don't want to make it hard to adopt these features in targeted > ways. > (The points below are general, but when the type in question is flattenable it likely makes sense to enforce more strongly.) I think this is a very salient point: there is a trade-off between enforcing the new nullness information as well as we might like vs. the adoptability of the feature. To arrive eventually at a null-safe world, the most importantly early step is for as many APIs as possible to provide the information. If they worry about breaking clients already, they won't be able to do it; if that means javac being lenient on enforcement, that's okay! Third-party nullness analysis tools have their role to play for a very long time anyway, which is a fine thing. Really, the "worst" thing about these tools today is simply "ugh, the type annotations are too bulky", and there's value even in solving *just that*. In fact, the spec being stringent about which cases definitely produce a warning might even be counterproductive. Third-party analyzers will analyze more deeply and take advantage of wider sources of information, and the typical effect is to recognize that a certain expression *can't* be null when it otherwise seemed like it could. That means the deeper analysis would end up wanting to *subtract* warnings from javac, which of course it can't do (currently has no way to do?). (Again, this applies less to the value types that are the actual motivating use case here.) - That said, *type expressions* (the syntax in programs that expresses a > type) are closely intertwined with *nullness markers*. 'Foo!' refers to a > non-null Foo, and 'Foo?' refers to a Foo or null. And nullness is an > optional property of type arguments, type variable bounds, and array > components. Nullness markers are the way programmers express their intent > to the compiler's nullness analysis. > Altogether, I think you're saying that this nullness attribute is attached to *every* static type recognized by the compiler: types of expressions, of expression contexts, and the (usually explicit) type usages you mention. Basically I would expect that most programmers most of the time will be well-served (enough) to think of this attribute as *if* it's an intrinsic part of the type. It's sort of the exact illusion this is all trying to create. (One mild exception is that it is possibly counterproductive to enforce override strict parameter agreement in the nullness attribute; as far as we can tell, allowing a nullable parameter to override a non-null one is nearly harmless and we think makes for a better migration story.) - There are features that change the default interpretation of the nullness > of class names. This is still pretty open-ended. Perhaps certain classes > can be declared (explicitly or implicitly) null-free by default (e.g., > 'Point' is implicitly 'Point!'). Perhaps a compilation-unit- or module- > level directive says that all unadorned types should be interpreted as '!'. > Programs can be usefully written without these convenience features, but > for programmers who want to widely adopt nullness, it will be important to > get away from "unspecified" as the default everywhere. > If a release has the rest of it but not this part, users will find it extremely annoying, but it might be survivable if we can see the other side of the chasm from there. It's definitely essential long-term. Of course, TypeScript, Scala, C#, and Dart all do it this way, through either a configuration file or compiler flag etc. Java's analogue is module-info, but yes, I do think adoptability would benefit *very* much from a class- or compilation-unit level modifier as well. And (sigh) I guess package-info, even though when that gets used outside a module it's kind of a disaster. - Nullness is generally enforced at run time, via cooperation between javac > and JVMs. Methods with null-free parameters can be expected to throw if a > null is passed in. Null-free storage should reject writes of nulls. > (Details to be worked out, but as a starting point, imagine 'Q'-typed > storage for all types. Writes reject nulls. Reads before any writes produce > a default value, or if none exists, throw.) > This raises some questions for the scenarios where we are compelled to write `@SuppressWarnings("nullness")`; that information obviously isn't available at runtime. Perhaps there would need to be a more targeted suppression mechanism? - Type variable types have nullness, too. Besides 'T!' and 'T?', there's > also a "parametric" 'T*' that represents "whatever nullness is provided by > the type argument". (Again, some room for choosing the default > interpretation of bare 'T'; unspecified nullness is useful for type > variables as well.) Nullness of type arguments is inferred along with the > types; when both a type argument and its bound have nullness, bounds checks > are based on '!' <: '*' <: '?'. Generics are erased for now, but in the > future '!' type arguments will be reified, and specialized classes will > provide the expected runtime behaviors. > I believe you will need the distinction between *unspecified* T and *parametric* T. This is the second reason why JSpecify needs `@NullMarked `; it isn't just "`!` by default"; it's also "type-variable usages become parametric instead of unspecified". Again, what I believe matters is that users have some way to express the distinction, not how much or little javac and the runtime do out of the gates to enforce it. In summary, I'm extremely glad we're thinking along these lines, and I hope to be helpful. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Feb 8 14:27:04 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 8 Feb 2023 15:27:04 +0100 (CET) Subject: Nullness markers to enable flattening In-Reply-To: References: Message-ID: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr> > The goal is a general-purpose feature that > lets programmers express intent about nulls, and that is preserved at runtime > sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic > (or compact) class" --> "maximally flattenable storage". There are no "value > types", and there is no direct control over flattenability. I would say that "not null" + "zero default value class" is necessary for flattening. Then "non-atomic class" can help, but there is no guarantee. > - Nullness is generally enforced at run time, via cooperation between javac and > JVMs. Methods with null-free parameters can be expected to throw if a null is > passed in. Null-free storage should reject writes of nulls. (Details to be > worked out, but as a starting point, imagine 'Q'-typed storage for all types. > Writes reject nulls. Reads before any writes produce a default value, or if > none exists, throw.) The last sentence worry me a little, we have types that does not have a default value ?? Are you referencing to the use of 'void' (or any other sentinel) to describe a field that does not exist for a specific parametrization ? --- So we have 3 locations were we need to take care of null-free types: - null-free fields - null-free method parameters - null-free array allocation In all cases, we have the choice between being handled by the VM or by javac. The less the VM does, the more compatible we are but it can be at the expense of some optimizations. For null-free fields, given that we need to traps all reads, javac can not help us here, there are already existing codes that access fields that will becomes null free. So the VM has to be involve here. If we use TypeRestriction here, we have the bonus of using the same construct for both parametrization and null-free enforcement. But we may want to use a simpler mechanism first. For null-free method parameters, there a two sub-items, null-free parameters and null-free return type. For the former, javac can insert the corresponding Object.requireNonNull() and either have a specific new attribute or re-use Signature to handle separate compilation. The problem with that is that the VM can not trust such attribute and as a result may have to box all null-free zero-default value. Otherwise, it means that we also need TypeRestriction at method level. For the later, null-free return type, if can choose javac to insert the requireNonNull at call sites (like with generics), this require to recompile the call sites to see the effect or introduce a requireNonNull() before each "areturn" instruction. For null-free array allocation, using the VM does not seem to have any benefit comparing to use a specific static method like Array.newInstance() (I believe it's better to introduce a variation of Array.newInstance given that we want to return an array of Object and not an Object). The secondary type of a value type can be get using a constant dynamic. R?mi ----- Original Message ----- > From: "daniel smith" > To: "valhalla-spec-experts" > Sent: Tuesday, February 7, 2023 2:26:42 AM > Subject: Nullness markers to enable flattening > A quick review: > > The Value Objects feature (see https://openjdk.org/jeps/8277163) captures the > Valhalla project's central idea: that objects don't have to have identity, and > if programmers opt out of identity, JVMs can provide optimizations comparable > to primitive performance. > > However, one important implementation technique is not supported by that JEP: > maximally flattened heap storage. ("Maximally flattened" as in "just the bits > necessary to encode an instance".) This is because flattened fields and arrays > store an object's field values directly, and so 1) need to be initialized "at > birth" to a non-null class instance, 2) may not store null, and 3) may by > updated non-atomically. These are semantics that need to be surfaced in the > language model. > > We've tackled (3) by allowing value classes to be declared non-atomic > (syntax/limitations subject to bikeshedding), and then claiming by fiat that > fields/arrays of such classes are tearing risks. Races are rare enough that > this doesn't really call for a use-site opt-in, and we don't necessarily need > any deeper explanation for how new objects derived from random combinations of > old objects can be created by a read operation. That's just how it works. > > > We also allow value classes to declare that they support an all-zeros default > instance (again, subject to bikeshedding). You could imagine similarly claiming > that fields/arrays of these classes are null-hostile, as a side effect of how > their storage works. But this is an idiosyncrasy that is going to affect a lot > more programmers, and "that's just how it works" is pretty unsatisfactory. > Sometimes programs count on being able to use 'null' in their computation. We > need something in the language model to let programs opt in/out of nulls at the > use site, and thus opt out/in of maximally flattenable heap storage. > > We've long discussed "reference type" vs. "value type" as the language concept > that captures this distinction. But where we once had a long list of > differences between references and values, most of those have gone away. > Notably, it's *not* useful for performance intuitions to imagine that > references are pointers and values are inline. Value objects get inlined when > the JVM want to do so. Reference-ness is not relevant. > > Really, for most programmers, nullness is all that distinguishes a "reference > type" from a "value type". > > Meanwhile, expressing nullness is not a problem unique to Valhalla. Whether a > variable is meant to store nulls is probably the most important property of > most programs that isn't expressible in the language. Workarounds include > informal javadoc specifications, type annotations (as explored by JSpecify), > lots of 'Objects.requireNonNull' calls, and blanket "if you pass in a null, you > might get an NPE" policies. > > In Amber, pattern matching has its own problems with nullness: there are a lot > of ad hoc rules to distinguish between "is this a non-null instance of class > Foo?" vs. "is this null *or* an instance of class Foo?", because there's no > good way to express those two queries as explicitly different. > > --- > > To address these problems, we've been exploring nullness markers as an > alternative to '.val' and '.ref'. The goal is a general-purpose feature that > lets programmers express intent about nulls, and that is preserved at runtime > sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic > (or compact) class" --> "maximally flattenable storage". There are no "value > types", and there is no direct control over flattenability. > > (A lot of these ideas build on what JSpecify has done, so appreciation to them > for the good work and useful documentation.) > > Some key ideas: > > - Nullness is an *optional* property of variables/expressions/etc., distinct > from types. If the program doesn't say what kind of nullness a variable has, > and it can't be inferred, the nullness is "unspecified". (Interpreted as "might > be null, but the programmer hasn't told us if that's their intent".) > Variables/expressions with unspecified nullness continue to behave the way they > always have. > > - Because nullness is distinct from types, it shouldn't impact type checking > rules, subtyping, overriding, conversions, etc. Nullness has its own analysis, > subject to its own errors/warnings. The precise error/warning conditions > haven't been fleshed out, but our bias is towards minimal intrusion?we don't > want to make it hard to adopt these features in targeted ways. > > - That said, *type expressions* (the syntax in programs that expresses a type) > are closely intertwined with *nullness markers*. 'Foo!' refers to a non-null > Foo, and 'Foo?' refers to a Foo or null. And nullness is an optional property > of type arguments, type variable bounds, and array components. Nullness markers > are the way programmers express their intent to the compiler's nullness > analysis. > > - Nullness may also be implicit. Catch parameters and pattern variables are > always non-null. Lots of expressions have '!' nullness, and the null literal > has '?' nullness. Local variables get their nullness from their initializers. > Control flow analysis can infer properties of a variable based on its uses. > > - There are features that change the default interpretation of the nullness of > class names. This is still pretty open-ended. Perhaps certain classes can be > declared (explicitly or implicitly) null-free by default (e.g., 'Point' is > implicitly 'Point!'). Perhaps a compilation-unit- or module- level directive > says that all unadorned types should be interpreted as '!'. Programs can be > usefully written without these convenience features, but for programmers who > want to widely adopt nullness, it will be important to get away from > "unspecified" as the default everywhere. > > - Nullness is generally enforced at run time, via cooperation between javac and > JVMs. Methods with null-free parameters can be expected to throw if a null is > passed in. Null-free storage should reject writes of nulls. (Details to be > worked out, but as a starting point, imagine 'Q'-typed storage for all types. > Writes reject nulls. Reads before any writes produce a default value, or if > none exists, throw.) > > - Type variable types have nullness, too. Besides 'T!' and 'T?', there's also a > "parametric" 'T*' that represents "whatever nullness is provided by the type > argument". (Again, some room for choosing the default interpretation of bare > 'T'; unspecified nullness is useful for type variables as well.) Nullness of > type arguments is inferred along with the types; when both a type argument and > its bound have nullness, bounds checks are based on '!' <: '*' <: '?'. Generics > are erased for now, but in the future '!' type arguments will be reified, and > specialized classes will provide the expected runtime behaviors. > > There are, of course, a lot of details behind these points. But hopefully this > provides a good high-level introduction. > > A worry in taking on extra features like this is that we'll get distracted from > our primary goal, which is to support maximally flattened storage of value > objects. But I think it feels manageable, and it's certainly a lot more useful > than the sort of targeted usage of '.val' we were thinking about before. > > Our main tasks for delivering a feature include: > - Work out the declaration syntax/class file encoding for opting in to > non-atomic-ness and default instances > - Implement nullness markers and some analysis/diagnostics in javac > - Provide a language spec for the parts of the analysis standardized in the > language > - Settle on a class file format and division of responsibility for runtime > behaviors > - Implement some targeted new JVM behaviors; use nullness as a signal for > flattening > - Design/implement how nullness is exposed by reflection > > For the future, we'll want to: > - Anticipate how a "change the defaults" feature will work > - Consider the interaction of nullness with Amber features > - Think about how runtime nullness interacts with specialization and type > restrictions From forax at univ-mlv.fr Wed Feb 8 14:40:46 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 8 Feb 2023 15:40:46 +0100 (CET) Subject: Nullness markers to enable flattening In-Reply-To: References:

Message-ID: <1858023693.15193500.1675867246844.JavaMail.zimbra@u-pem.fr> > From: "Kevin Bourrillion" > To: "daniel smith" > Cc: "valhalla-spec-experts" > Sent: Tuesday, February 7, 2023 7:50:08 PM > Subject: Re: Nullness markers to enable flattening > As the person with a foot in both Valhalla and JSpecify camps, and who is > obsessed with keeping Java as conceptually simple as it can be, as long as this > all works out I'll consider it to be a major win. > On Mon, Feb 6, 2023 at 5:26 PM Dan Smith < [ mailto:daniel.smith at oracle.com | > daniel.smith at oracle.com ] > wrote: >> - Because nullness is distinct from types, it shouldn't impact type checking >> rules, subtyping, overriding, conversions, etc. Nullness has its own analysis, >> subject to its own errors/warnings. The precise error/warning conditions >> haven't been fleshed out, but our bias is towards minimal intrusion?we don't >> want to make it hard to adopt these features in targeted ways. > (The points below are general, but when the type in question is flattenable it > likely makes sense to enforce more strongly.) > I think this is a very salient point: there is a trade-off between enforcing the > new nullness information as well as we might like vs. the adoptability of the > feature. To arrive eventually at a null-safe world, the most importantly early > step is for as many APIs as possible to provide the information. If they worry > about breaking clients already, they won't be able to do it; if that means > javac being lenient on enforcement, that's okay! Third-party nullness analysis > tools have their role to play for a very long time anyway, which is a fine > thing. Really, the "worst" thing about these tools today is simply "ugh, the > type annotations are too bulky", and there's value even in solving *just that*. > In fact, the spec being stringent about which cases definitely produce a warning > might even be counterproductive. Third-party analyzers will analyze more deeply > and take advantage of wider sources of information, and the typical effect is > to recognize that a certain expression *can't* be null when it otherwise seemed > like it could. That means the deeper analysis would end up wanting to > *subtract* warnings from javac, which of course it can't do (currently has no > way to do?). (Again, this applies less to the value types that are the actual > motivating use case here.) >> - That said, *type expressions* (the syntax in programs that expresses a type) >> are closely intertwined with *nullness markers*. 'Foo!' refers to a non-null >> Foo, and 'Foo?' refers to a Foo or null. And nullness is an optional property >> of type arguments, type variable bounds, and array components. Nullness markers >> are the way programmers express their intent to the compiler's nullness >> analysis. > Altogether, I think you're saying that this nullness attribute is attached to > *every* static type recognized by the compiler: types of expressions, of > expression contexts, and the (usually explicit) [ > https://github.com/jspecify/jspecify/wiki/type-usages | type usages ] you > mention. Basically I would expect that most programmers most of the time will > be well-served (enough) to think of this attribute as *if* it's an intrinsic > part of the type. It's sort of the exact illusion this is all trying to create. > (One mild exception is that it is possibly counterproductive to enforce override > strict parameter agreement in the nullness attribute; as far as we can tell, > allowing a nullable parameter to override a non-null one is nearly harmless and > we think makes for a better migration story.) I think we can follow the "generics erasure" playbook here, i.e. if a method has a nullness attribute either an overriden method has a nullness attribute with the same values or it does not have one. This allow to annotate a library with nullness information without having to change the client code that may have a class that override a method from the library. About having a nullable parameter to overrride a non-null one, this goes against the idea that Java parameters are invariant not contravariant (you can not have a method that takes an Object as parameter to override a method that takes a String as parameter even if it is sound in term of type). One advantage i see to keep the method parameter to be invariant is that as a user of a method i know that not only a method but also all its overrides will never allow null, even if the nullness checking is done at declaration site (erasure of the nullness information aside). >> - There are features that change the default interpretation of the nullness of >> class names. This is still pretty open-ended. Perhaps certain classes can be >> declared (explicitly or implicitly) null-free by default (e.g., 'Point' is >> implicitly 'Point!'). Perhaps a compilation-unit- or module- level directive >> says that all unadorned types should be interpreted as '!'. Programs can be >> usefully written without these convenience features, but for programmers who >> want to widely adopt nullness, it will be important to get away from >> "unspecified" as the default everywhere. > If a release has the rest of it but not this part, users will find it extremely > annoying, but it might be survivable if we can see the other side of the chasm > from there. It's definitely essential long-term. > Of course, TypeScript, Scala, C#, and Dart all do it this way, through either a > configuration file or compiler flag etc. Java's analogue is module-info, but > yes, I do think adoptability would benefit *very* much from a class- or > compilation-unit level modifier as well. And (sigh) I guess package-info, even > though when that gets used outside a module it's kind of a disaster. >> - Nullness is generally enforced at run time, via cooperation between javac and >> JVMs. Methods with null-free parameters can be expected to throw if a null is >> passed in. Null-free storage should reject writes of nulls. (Details to be >> worked out, but as a starting point, imagine 'Q'-typed storage for all types. >> Writes reject nulls. Reads before any writes produce a default value, or if >> none exists, throw.) > This raises some questions for the scenarios where we are compelled to write > `@SuppressWarnings("nullness")`; that information obviously isn't available at > runtime. Perhaps there would need to be a more targeted suppression mechanism? In that case, the attribute containing the nullness information can be erased/not generated. >> - Type variable types have nullness, too. Besides 'T!' and 'T?', there's also a >> "parametric" 'T*' that represents "whatever nullness is provided by the type >> argument". (Again, some room for choosing the default interpretation of bare >> 'T'; unspecified nullness is useful for type variables as well.) Nullness of >> type arguments is inferred along with the types; when both a type argument and >> its bound have nullness, bounds checks are based on '!' <: '*' <: '?'. Generics >> are erased for now, but in the future '!' type arguments will be reified, and >> specialized classes will provide the expected runtime behaviors. > I believe you will need the distinction between *unspecified* T and *parametric* > T. This is the second reason why JSpecify needs ` [ > https://jspecify.dev/docs/api/org/jspecify/annotations/NullMarked.html | > @NullMarked ] `; it isn't just "`!` by default"; it's also "type-variable > usages become parametric instead of unspecified". Again, what I believe matters > is that users have some way to express the distinction, not how much or little > javac and the runtime do out of the gates to enforce it. > In summary, I'm extremely glad we're thinking along these lines, and I hope to > be helpful. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Feb 8 15:15:51 2023 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 8 Feb 2023 10:15:51 -0500 Subject: Nullness markers to enable flattening In-Reply-To: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr> References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr> Message-ID: On 2/8/2023 9:27 AM, Remi Forax wrote: >> The goal is a general-purpose feature that >> lets programmers express intent about nulls, and that is preserved at runtime >> sufficiently for JVMs to observe that "not null" + "value class" + "non-atomic >> (or compact) class" --> "maximally flattenable storage". There are no "value >> types", and there is no direct control over flattenability. > I would say that "not null" + "zero default value class" is necessary for flattening. > Then "non-atomic class" can help, but there is no guarantee. Layout decisions are the purview of the JVM; Valhalla's approach to flattening is to allow the user to influence flattening by selecting semantic properties that remove impediments to flattening.? An approximate rubric for "when will things flatten" (in the heap) is: ??? type is identity-free ??? && variable is null-free ??? && (type layout is small || type is non-atomic and not huge) There are other heroics the VM can do to get more flattening (e.g., using slack bits to represent nulls in certain types), but this is the basic idea.? And these are all either _semantic_ things the user controls (identity-freedom, nullity, atomicity (which equates to the lack of cross-field invariants)) or things that it is entirely reasonable for the VM to make layout decisions based on (e.g., don't flatten huge things). The main challenge of Valhalla has been teasing apart the various concerns to get them down to fine-grained properties we're willing to ask users to incorporate into their programming. >> - Nullness is generally enforced at run time, via cooperation between javac and >> JVMs. Methods with null-free parameters can be expected to throw if a null is >> passed in. Null-free storage should reject writes of nulls. (Details to be >> worked out, but as a starting point, imagine 'Q'-typed storage for all types. >> Writes reject nulls. Reads before any writes produce a default value, or if >> none exists, throw.) > The last sentence worry me a little, we have types that does not have a default value ?? "Sensible defaults" has been a long-standing challenge.? The VM has a very, very strong bias towards the default state of memory being all zeroes.? It is convenient that this yields sensible defaults for our primitives (numeric zero) and references (null pointers). Nullable types have a good zero default: null.? "Well behaved" value types like Complex have a good zero default: zero.? But types like LocalDate are problematic: the zero maps to a sensible value, but its a terrible default.? So types like LocalDate, while they could admit some flattening (and get good flattening on the stack), need protection in the heap from uninitialized values, and so LocalDate wants to be nullable. That is: if your zero default is bad, you need null as a default. This is one of the central reasons for having Bucket 2.? (Spoiler: I don't like bucket 2.) From daniel.smith at oracle.com Wed Feb 8 16:50:14 2023 From: daniel.smith at oracle.com (Dan Smith) Date: Wed, 8 Feb 2023 16:50:14 +0000 Subject: EG meeting, 2023-02-08 Message-ID: <850D401F-59F3-439B-804E-09E865FCA25D@oracle.com> EG Zoom meeting February 8 at 5pm UTC (9am PDT, 12pm EDT). (Sorry for sending this mail late!) Topics: - Remi's experiments with specialization - Dan's mail about nullness markers From john.r.rose at oracle.com Wed Feb 8 17:19:28 2023 From: john.r.rose at oracle.com (John Rose) Date: Wed, 08 Feb 2023 09:19:28 -0800 Subject: Valhalla super class/interfaces and bridges In-Reply-To: <870810805.12807573.1675675591174.JavaMail.zimbra@u-pem.fr> References: <870810805.12807573.1675675591174.JavaMail.zimbra@u-pem.fr> Message-ID: <001293FA-0746-4865-B1A0-D604B0D95C46@oracle.com> Wow, this is wonderful. I dream of working on this, some day, after we get basic values in the bag. Thank you for the pathfinding. FTR here is my draft code for segmenting the constant pool based on constant variation (dependencies on anchors): https://github.com/rose00/valhalla/blob/1273c0a8904e7a7d465226506361ba8994371bed/src/hotspot/share/classfile/classFileParser.cpp#L1662 From kevinb at google.com Wed Feb 8 20:43:48 2023 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 8 Feb 2023 12:43:48 -0800 Subject: Nullness markers to enable flattening In-Reply-To: References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr> Message-ID: On Wed, Feb 8, 2023 at 7:16 AM Brian Goetz wrote: "Sensible defaults" has been a long-standing challenge. The VM has a > very, very strong bias towards the default state of memory being all > zeroes. It is convenient that this yields sensible defaults for our > primitives (numeric zero) and references (null pointers). Nullable types > have a good zero default: null. "Well behaved" value types like Complex > have a good zero default: zero. To get my stance (that you already know) onto the list: for a nullable type, null is immeasurably *better* than just a "sensible default"; it is *no default at all*. Information is simply missing, never guessed at. Null is a wonderful thing (it's only a null-oblivious type system that makes us think it isn't). Whatever the all-zeroes value is worth for value types is much less clear-cut. (hey, if zero is such a great default, why don't we initialize locals to it too?) A type that models (some approximation of) an algebraic ring is the Very Special Best Use Case. It is blessed to have two clearly-most-common reduction operations, and a single value (zero) that both serves as the identity for one of those operations, and is utterly destructive to the other (i.e., multiplying onto a value without initializing it to 1 first will at least make your bug clearly noticeable!). I feel these discussions over-rotate toward those special cases. Those numeric use cases might be the most *motivating* ones, but that doesn't mean they're necessarily the most common or the most "important" (for some definition). > That is: if your zero default is bad, you need null as a default. This > is one of the central reasons for having Bucket 2. (Spoiler: I don't > like bucket 2.) > And, of course, I (heart) bucket 2. :-) We (Google) expect to send out migration changelists throughout our internal codebase to make every bucket 1 class into a bucket 2 class that we can (mainly not the mutable ones, which our codebase is biased against anyway). So I guess we have a gap here! Refresher for observers: "Bucket 1" is what every class and class-based type is today. Bucket 2 is the same exact thing but without identity. Without identity, you have a number of restrictions -- no non-final fields, no locking, etc. Of course, a whole ton of classes never wanted those things in the first place. Bucket 3 used to refer to an "inlinable" value type in the vein of `int`, but after all the latest changes that have been discussed, it now looks to me like B2 vs. B3 is a decision for the *runtime* to make, not me. (And that's outstanding!) But as regards B1 vs B2 I think it's as pure a case of a backward default as any. If you want identity, of course you can have it. If you want your field to be reassignable or your nested class to have a hidden reference to an instance of the enclosing class, you can have these things. But every reasonable compendium of best-practice advice for Java will tell you to make nested classes static and fields final until you need otherwise. Never let your desire for brevity outweigh the more important considerations. Why would this case be different, I have to wonder? Using B2 by default is both principled and practical. It's how we *defer to the runtime* to make performance decisions for us. And we really really like doing that. And none of this has anything to do with being a special kind of class like `Complex`. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Feb 8 21:10:14 2023 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 8 Feb 2023 16:10:14 -0500 Subject: Nullness markers to enable flattening In-Reply-To: References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr> Message-ID: > To get my stance (that you already know) onto the list: for a nullable > type, null is immeasurably *better* than just a "sensible default"; it > is *no default at all*. Information is simply missing, never guessed > at. Null is a wonderful thing (it's only a null-oblivious type system > that makes us think it isn't). Yes, though we have to be careful to not push this argument too far -- if we do, it leads back to "B1 is all I need", and we can all go home. The reason B1 is not sufficient is that, absent representational heroics (such as using type-specific slack bits to encode null), it basically forces an indirect representation, losing the flatness and density that Valhalla aims to give us.? And the same is true for B2, but less so.? B2 gets half the benefit; by giving up identity, it gets flattening on the stack (calling convention optimization.) That's good, but if that's all we got, it wouldn't be so great. B3 is what it is because the VM has a very strong bias towards initializing to zero.? If the zero-bits are part of your domain and an acceptable default, then great, you have the option to fully flatten.? But if the zero bits are not, then we have no efficient and safe VM representation for non-nullable B2; either we choose indirection (giving up performance) or we use Q types and risk the zero leaking (giving up safety).? B1 doesn't have this problem because what leaks when erasure gets fooled is null, which fails fast when you try to use it, rather than leaking a value that looks like it might be a valid value, but which the constructor would never have generated. So this is why Valhalla distinguishes B3 -- because its the kind of value that we can give the full-flat treatment to.? And while there may be more use cases that are suitable for B2 than B3, there are likely more *instances* of the B3 values -- because we will allocate huge matrices of Complex or HalfFloat. So we can't say "B2 is all we'll need", because B3 is where the performance that motivated Valhalla comes from. -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Feb 8 21:35:38 2023 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 8 Feb 2023 22:35:38 +0100 (CET) Subject: Nullness markers to enable flattening In-Reply-To: References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr>

Message-ID: <1422866679.15450393.1675892138444.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Kevin Bourrillion" > Cc: "Remi Forax" , "daniel smith" , > "valhalla-spec-experts" > Sent: Wednesday, February 8, 2023 10:10:14 PM > Subject: Re: Nullness markers to enable flattening >> To get my stance (that you already know) onto the list: for a nullable type, >> null is immeasurably *better* than just a "sensible default"; it is *no default >> at all*. Information is simply missing, never guessed at. Null is a wonderful >> thing (it's only a null-oblivious type system that makes us think it isn't). > Yes, though we have to be careful to not push this argument too far -- if we do, > it leads back to "B1 is all I need", and we can all go home. > The reason B1 is not sufficient is that, absent representational heroics (such > as using type-specific slack bits to encode null), it basically forces an > indirect representation, losing the flatness and density that Valhalla aims to > give us. And the same is true for B2, but less so. B2 gets half the benefit; by > giving up identity, it gets flattening on the stack (calling convention > optimization.) That's good, but if that's all we got, it wouldn't be so great. > B3 is what it is because the VM has a very strong bias towards initializing to > zero. If the zero-bits are part of your domain and an acceptable default, then > great, you have the option to fully flatten. But if the zero bits are not, then > we have no efficient and safe VM representation for non-nullable B2; either we > choose indirection (giving up performance) or we use Q types and risk the zero > leaking (giving up safety). B1 doesn't have this problem because what leaks > when erasure gets fooled is null, which fails fast when you try to use it, > rather than leaking a value that looks like it might be a valid value, but > which the constructor would never have generated. > So this is why Valhalla distinguishes B3 -- because its the kind of value that > we can give the full-flat treatment to. And while there may be more use cases > that are suitable for B2 than B3, there are likely more *instances* of the B3 > values -- because we will allocate huge matrices of Complex or HalfFloat. > So we can't say "B2 is all we'll need", because B3 is where the performance that > motivated Valhalla comes from. I would say, Valhalla has two objectives, providing a more compact memory representation aka B3 is one, having a better escape analysis aka B2 is another. Now, B2 and B3 are from the VM POV, it does not have to leak too much in the user model, we can have value class and zero-default value class at the user level. I kind a like the progression "value class", "zero-default value class", "atomic zero-default value class" that intuitively goes toward better performance at the expense of a non-usual semantics. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Feb 8 21:37:31 2023 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 8 Feb 2023 16:37:31 -0500 Subject: Nullness markers to enable flattening In-Reply-To: <1422866679.15450393.1675892138444.JavaMail.zimbra@u-pem.fr> References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr>

<1422866679.15450393.1675892138444.JavaMail.zimbra@u-pem.fr> Message-ID: <31e04926-14c4-a6b1-4021-6fc4ccb75118@oracle.com> > I would say, Valhalla has two objectives, providing a more compact > memory representation aka B3 is one, having a better escape analysis > aka B2 is another. And you forgot "unifying primitives with objects". -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Feb 8 22:00:52 2023 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Wed, 8 Feb 2023 23:00:52 +0100 (CET) Subject: Nullness markers to enable flattening In-Reply-To: <31e04926-14c4-a6b1-4021-6fc4ccb75118@oracle.com> References: <1255107003.15181425.1675866424159.JavaMail.zimbra@u-pem.fr>

<1422866679.15450393.1675892138444.JavaMail.zimbra@u-pem.fr> <31e04926-14c4-a6b1-4021-6fc4ccb75118@oracle.com> Message-ID: <26343913.15464217.1675893652544.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" > Cc: "Kevin Bourrillion" , "daniel smith" > , "valhalla-spec-experts" > > Sent: Wednesday, February 8, 2023 10:37:31 PM > Subject: Re: Nullness markers to enable flattening >> I would say, Valhalla has two objectives, providing a more compact memory >> representation aka B3 is one, having a better escape analysis aka B2 is >> another. > And you forgot "unifying primitives with objects". This is the goal i'm still not able to wrap my head around so i pretend it does not exist :) I wonder if will never be able to truly achieve it, but come close. We have unified primitive wrappers with objects (so == works on Integer) and we can do a little more by adding syntactic sugar around primitives, like auto-boxing int to Integer! when a method is called on it or inside angle brackets of generics. R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Sat Feb 18 10:09:37 2023 From: forax at univ-mlv.fr (Remi Forax) Date: Sat, 18 Feb 2023 11:09:37 +0100 (CET) Subject: Owner specialization at callsite ? Message-ID: <1036535382.25326481.1676714977379.JavaMail.zimbra@u-pem.fr> I'm trying to implement owner specialization at callsite but i struggle to see the benefits, worst i see a lot of drawbacks. Currently, at callsite, i've implemented specialized generics either when instantiating a parametric generics new ArrayList() or when calling a parametric method List.of() But the Parametric VM spec goes a step further and also ask that the owner of a calling a method can be specialized, for example, with ArrayList list = new ArrayList(); list.add("foo"); in the bytecode list.add("foo") should be typed as ArrayList.add(String) and not ArrayList.add(Object). I've several concerns - i do not see how to implement that without the VM knowing the exact semantics of the Java generics, making Kotlin or Scala second class citizens, - it's not a backward compatible change, a lot of codes will start to throw ClassCastException at runtime, - this will introduce regression in term of performance. To implement owner specialization at callsite, it means that we are able to check at runtime the specialization of a generic class with instanceof/checkcast, something like Object object = ... List list = (List) object; It means to be able to compare the specialization of the instance referenced by "object" with List, so it's a classcheck between two specialized classes. But those classes maybe produces by different compiler with different way of storing the specialized parameters. So either all languages have a kind of common semantics (behave like Java) or there is no way to answer that question. Worst, a lot of code will start to fail because a lot of code relies on the erasure to exist, a good example is the code of List.copyOf() in the JDK static List listCopy(Collection coll) { if (coll instanceof List12 || (coll instanceof ListN c && !c.allowNulls)) { return (List)coll; } else if (coll.isEmpty()) { // implicit nullcheck of coll return List.of(); } else { return (List)List.of(coll.toArray()); } } The first cast is safe because List12 and ListN are covariant (the implementation is non modifiable) something you can not express currently in Java. If this method is specialized and the cast does a check at runtime, List.