Valhalla basic concepts / terminology

Brian Goetz brian.goetz at oracle.com
Fri May 22 21:53:30 UTC 2020


>     To the extent we can avoid redefining these things, I think it is
>     easier to just leave these terms in place.
>
>
> Uh oh, this makes me worry that I'm already supposed to know what the 
> distinction is. :-)
> In my everyday experience, people sometimes use "object" ambiguously 
> to refer to the instance or the class, but "instance" to emphasize 
> the, err, instance.
> Is that it?

Instance is defined relative to a class; a given object might be an 
instance of the class String, or the class 
SimpleBeanProviderFactoryInjector.  But they are both objects.  We don't 
really need two terms, but we've got em.
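
As a small illustration in today's Java (the class name InstanceDemo is made up for this sketch): both values below are objects, and "instance" only says which class each one is an instance of.

```java
// Hypothetical demo class; only java.lang types are used.
class InstanceDemo {
    // Describes an object relative to its class: "instance of <simple class name>".
    static String describe(Object o) {
        return "instance of " + o.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Object a = "hello";                      // an object that is an instance of String
        Object b = new StringBuilder("hello");   // an object that is an instance of StringBuilder
        System.out.println(describe(a));
        System.out.println(describe(b));
        System.out.println(a instanceof String);
    }
}
```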

>
>>       * Identity objects get the benefits of identity, at the cost
>>         that you may only store /references/ to them. They will
>>         ~always go onto the heap (modulo invisible vm tricks).
>>
>     Yes.  Again, pedagogically, I am not sure whether the heap
>     association is helpful or hurtful; on the one hand, C programmers
>     will understand the notion of "pointer to heap node", but on the
>     other, this is focusing on implementation details, not concepts.
>
>
> /But is/ it an implementation detail? To me, this entire feature is 
> about memory layout on its face.

At the language level, there are objects and references to objects, 
which are of different "kinds".  Hitherto, the only way we interacted 
with objects is via references, but now there are new kinds of objects 
that can be interacted with either directly or indirectly.  These are 
type-system concepts. We could have this system today, if we just 
changed the terminology around primitives. But what we don't like about 
this system is the cost of boxing.
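
For a rough feel of "directly or indirectly", today's primitives and their boxes can stand in for the two kinds. This is only an analogy in current Java, not Valhalla syntax, and the class and method names are made up for the sketch.

```java
// Hypothetical demo class; int plays the "direct" kind, Integer the "indirect" kind.
class DirectVsIndirect {
    // Elements hold int values directly; the default element is 0.
    static int[] direct(int n) {
        return new int[n];
    }

    // Elements hold references to Integer objects; the default element is null.
    static Integer[] indirect(int n) {
        return new Integer[n];
    }

    public static void main(String[] args) {
        int[] d = direct(3);
        Integer[] r = indirect(3);
        System.out.println(d[0]);           // a value; it can never be null
        System.out.println(r[0] == null);   // a reference; null is representable
        r[0] = d[0];                        // boxing: the int is converted to an Integer reference
        System.out.println(r[0].equals(d[0]));
    }
}
```

The boxing step in main is the cost being complained about: going through the reference means an extra object and an extra indirection unless the VM optimizes it away.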

At the VM level, it's all references!  But, the VM is smart, so it knows 
how to do magic VM stuff when it encounters references to identity-less 
objects.  This should _routinely_ turn into flattening, though the VM 
reserves the right to refuse to flatten, if, say, your inline class has 
64,000 fields.

I think the key word is _routinely_.  With the information we give the 
VM, we should expect to get certain memory layout behavior most of the 
time.

So, is that implementation detail or not?


>
>>           o (Users choose by e.g. writing either `Foo.val` or
>>             `Foo.ref`, though one would be the default)
>>
>     Yes.  It is worth noting here that we would like for the actual
>     incidence of `.ref` and `.val` in real code to be almost
>     negligible.  Maurizio likens them to "raw types", in the sense
>     that we need them to complete the type system, and there are cases
>     where they are unavoidable, but the other 99.9% of the time, you
>     just say "Point".
>
>
> This aspiration strikes me as highly... aspirational. :-) Isn't the 
> user pushed into this the second they call `Map.get()`?

Well:

     var x = map.get(k);
     if (x != null) { ... }

But, if they want a manifest type, they have two choices:

     V.ref x = map.get(k);  // oops, caught in Kevin's trap
     V x = map.get(k);      // uses inline narrowing, which would NPE if
                            // the item is not in the map
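
Today's boxes show the same two outcomes. In this sketch Integer/int stand in for V.ref/V (an analogy in current Java, not Valhalla semantics; the class and method names are made up): keeping the reference lets you test for null, while converting to the value type on a missing key throws.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; Integer -> int unboxing stands in for inline narrowing.
class MapGetNarrowing {
    // Returns true if converting map.get(key) to an int throws NullPointerException.
    static boolean unboxingThrows(Map<String, Integer> map, String key) {
        try {
            int x = map.get(key);   // unboxes; throws NPE if get returned null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("present", 1);

        Integer ref = map.get("absent");    // reference type: null is representable
        System.out.println(ref == null);

        System.out.println(unboxingThrows(map, "present"));
        System.out.println(unboxingThrows(map, "absent"));
    }
}
```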

All that said, Map::get is not really a natural idiom in a world with 
inlines, and so we would likely provide other ways to interact with a 
Map that are less null-centric; an optional-bearing version, a pattern 
match, a lambda-accepting version, etc.  So I think this is in the 
category of migration help; we're working with a legacy method whose 
protocol is inherently null-centric.
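
To make "less null-centric" concrete, here is a minimal sketch in current Java of two of those shapes, an optional-bearing lookup and a lambda-accepting one. The names MapLookups, find and ifMapped are hypothetical helpers, not proposed API, and both treat a key mapped to null the same as an absent key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;

// Hypothetical helpers layered on the existing Map::get.
class MapLookups {
    // Optional-bearing lookup: absence is Optional.empty() rather than null.
    static <K, V> Optional<V> find(Map<K, V> map, K key) {
        return Optional.ofNullable(map.get(key));
    }

    // Lambda-accepting lookup: runs the action only if a non-null value is mapped.
    // Returns whether the action ran.
    static <K, V> boolean ifMapped(Map<K, V> map, K key, Consumer<? super V> action) {
        V v = map.get(key);
        if (v == null) {
            return false;
        }
        action.accept(v);
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);

        System.out.println(find(map, "a").isPresent());
        System.out.println(find(map, "b").isPresent());
        ifMapped(map, "a", v -> System.out.println("found " + v));
    }
}
```

In neither shape does the caller have to write down a nullable type for the result, which is the point: the "V or null" protocol stays inside the helper.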

Where we expect .ref and .val to show up are:

  - Migrated classes -- such as Optional
  - APIs that assume nullability -- such as Map::get
  - When you need to express "V or null"
  - When you explicitly do not want flattening for reasons of memory 
footprint (say, a sparse array of fat inline objects)

(At one point we thought erasure was on this list, but we're having 
second thoughts on that.)

>
>
>>       * A concrete /class/ is either an "identity class" or an
>>         "inline class". But a compile-time /type/ is distinguished
>>         not by "inline vs identity" but by "inline vs /reference/".
>>
>     Yeah, this is the other hard one.  In fact, it took us years to
>     realize that the key distinction is not reference vs
>     primitive/inline, but _identity_ vs inline.
>
>
> Did I have it backwards?

No, you had it right.



