The storage hint model

Thu Jul 21 21:15:23 UTC 2022

----- Original Message -----
> From: "Maurizio Cimadamore" <maurizio.cimadamore at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "valhalla-spec-experts" <valhalla-spec-experts at openjdk.java.net>
> Sent: Thursday, July 21, 2022 6:06:36 PM
> Subject: Re: The storage hint model

> On 21/07/2022 14:29, forax at univ-mlv.fr wrote:
>> for me, using .flat or .box is a separate decision than using a storage hint
>> model vs a type based model.
> 
> I'm not sure I'm sold.
> 
> Consider this:
> 
> ```
> class Box<X> {
>   X x;
> }
> ```
> 
> A question users will ask is: under which condition can I expect
> `Box<T>` to use flat representation for its `x` field ?
> 
> Now, in both cases the answer is - it depends, on both the type
> parameter T and what the declaration of `Box::x` looks like.
> 
> But if we adopt a .box approach, I think it will be hard for users not
> to read this as "T flowing into the Box implementation", because
> effectively that's what happens 99% of the time, except if you use the
> .box (or .ref) escape hatch. In other words, in a world where .flat is
> the default, and .box is the opt-in, how is that world different from
> what you describe as ".val propagation"?

The main difference is that ".val" requires two types for one value class that are similar but not quite the same.

The differences between C and C.val are really hard to understand because the two types are very similar but obey different rules:
- C and C.val have no subtyping relationship, but if you see a type as a set of values, C accepts null while C.val doesn't.
- there is auto-boxing between C and C.val
- overloading rules should prefer C.val to C (but maybe not)
- inference with C.val does not work exactly like inference with C
- an array of C.val is a subtype of an array of C
- List<C> and List<C.val> are incompatible types
- instanceof C.val is not valid
- c.getClass() with c a C.val is typed Class<? extends C> (not C.val).
This is not an exhaustive list, and I'm sure some of these are wrong because even we, the experts, have trouble defining the correct set of rules.

Basically, it's a mess because we are creating a new kind of type, C.val, that sometimes works like a primitive (hence boxing, overloading, type inference) but sometimes works like an object (has Object as a super class, has an ad hoc way to work with wildcards, etc.). And then, as a user, there is the looming question of where to use C vs C.val, the answer to which will be a long essay full of special cases (like the Angelika Langer FAQ for generics).
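
As an analogy only, the primitive-like half of those rules can already be observed in today's Java with int and Integer. The sketch below is not C/C.val and does not claim the final rules will match; the class and method names (PrimitiveLikeRules, pick, unboxingNullThrows) are made up for illustration.

```java
import java.util.List;

final class PrimitiveLikeRules {
  // Two overloads: resolution first tries to match without boxing or unboxing.
  static String pick(int x) { return "int"; }
  static String pick(Integer x) { return "Integer"; }

  // Unboxing a null Integer compiles fine and fails only at runtime.
  static boolean unboxingNullThrows() {
    Integer none = null;
    try {
      int value = none;  // implicit none.intValue()
      return false;      // not reached
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    int prim = 42;
    Integer boxed = prim;                 // auto-boxing
    System.out.println(pick(prim));       // prints "int"
    System.out.println(pick(boxed));      // prints "Integer"
    System.out.println(unboxingNullThrows());  // prints "true"

    List<Integer> list = List.of(boxed);  // List<int> is rejected by the compiler
    System.out.println(list.size());      // prints "1"

    Object asObject = prim;               // boxed on the way to Object
    System.out.println(asObject.getClass());  // prints "class java.lang.Integer"
  }
}
```

Two spellings of almost the same thing already need separate rules for overloading, generics and null, which is the kind of duplication the list above worries about for C and C.val.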

Now, the .box world is simpler because .box is a storage hint, not a type, but it has one nasty property:
T means different things depending on the context. As the type of a parameter or of a field, T may not accept null, but as the type of a local variable it always accepts null, so it's quite easy to get a surprising NPE.

Here is an example:
class Foo<T> {
  T t;   // can be flattened

  void foo(T t) {  // can be flattened
    T other = whatever() ? t : null;   // here T allows null
    this.t = other;                    // oops, potential NPE!
  }
}

This can be mitigated by a language like Kotlin that does null analysis, or by IDEs (both Eclipse and IntelliJ have null analysis), but it's still ugly.
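
The same shape of failure can be reproduced in today's Java, as a rough analogy: an int field stands in for a flattened field that cannot hold null, and an Integer local stands in for the nullable local. This is only a sketch under that assumption; the names FlatFieldAnalogy and throwsNpe, and the boolean parameter that replaces whatever(), are mine.

```java
final class FlatFieldAnalogy {
  private int t;  // stands in for a flattened field: it cannot hold null

  void foo(int t, boolean keep) {
    Integer other = keep ? t : null;  // the local happily accepts null
    this.t = other;                   // unboxing null throws a NullPointerException
  }

  // Reports whether foo fails with an NPE for the given flag.
  static boolean throwsNpe(boolean keep) {
    try {
      new FlatFieldAnalogy().foo(1, keep);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(throwsNpe(true));   // prints "false"
    System.out.println(throwsNpe(false));  // prints "true"
  }
}
```

Nothing at the assignment site warns that the store can fail; the compiler accepts both lines, and the exception only shows up on the path where the local is null.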

> 
> I think I know intuitively what you are reaching for - one thing is to
> treat .ref/.box as a type modifier (similar to a wildcard), another
> thing is to apply only to fields, and maybe array creation. But your
> example on ArrayList already veers into method parameters as well. At
> what point does it stop becoming a property of the container and starts
> being a property of the type? Not saying I have a bullet proof answer,
> but this all seems rather fluid to me.

You can define storage hints at only 4 different locations:
- on a field type
- on an array type at creation
- on a field array type
- on a parameter type

The first two are really about storage. The third is nice because it saves the VM from asking at runtime whether an array allows null or not, so it's an optimization. The last one is needed to avoid boxing when crossing an inlining "domain"; again, it can be seen as an optimization.
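
To make the four positions concrete, here is an ordinary Java class with a comment at each place where a hint could appear. It deliberately uses no hint syntax, since none is settled; the class HintLocations and its members are hypothetical and only mark the positions.

```java
final class HintLocations<T> {
  private T single;        // (1) field type
  private final T[] many;  // (3) field array type

  @SuppressWarnings("unchecked")
  HintLocations(int size) {
    // (2) array type at creation; today's generics force an Object[] here
    this.many = (T[]) new Object[size];
  }

  void store(int index, T value) {  // (4) parameter type
    single = value;
    many[index] = value;
  }

  T single() { return single; }

  T at(int index) { return many[index]; }

  public static void main(String[] args) {
    HintLocations<String> holder = new HintLocations<>(2);
    holder.store(1, "a");
    System.out.println(holder.single());  // prints "a"
    System.out.println(holder.at(1));     // prints "a"
    System.out.println(holder.at(0));     // prints "null"
  }
}
```

Local variables, return types and type arguments are absent from the list, which is what keeps the hint a property of the storage rather than of the type.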

> 
> Maurizio

Rémi