Reference-default style

Fri Feb 7 23:17:53 UTC 2020

On Feb 7, 2020, at 2:05 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>> 
>> So, summary:
>> 
>>  - Yes, we should figure out how to support abstract class supertypes of inline classes, if only at the VM level;
>>  - There should be one way to declare an inline class, with a modifier saying which projection gets the good name;
>>  - Both the ref and val projections should have the same accessibility, in part so that the compiler can freely use inline widening/narrowing as convenient;
>>  - We would prefer to avoid duplication of the methods on both projections, where possible;
>>  - The migration case requires that, for ref-default inline classes, we translate so that the methods appear on the ref projection.

Abstract classes, check.  User control over good name, check.
Co-accessibility of both projections, check.  No schema duplication,
check.  Methods on ref projection for migration, check.  Awesome!

I’m relieved that we are embracing abstract classes, because (a) the
JVM processes them a little more easily than interfaces, and (b) they
have fewer nit-picky limitations than interfaces (toString/equals/hashCode,
package access members).  Thanks, Dan and whoever else agitated for
abstract classes; the JVM thanks you.

I have a tiny reservation about the co-accessibility of both projections,
although it’s a good principle overall.  There might be cases (migration
and maybe new code) where the nullable type has wider access than
the inline type, where the type’s contract somehow embraces nullability
to the extent that the .val projection is invisible.  But we can cross that
bridge when and if we come to it; I can’t think of compelling examples.

> Let me flesh this out some more, since the previous mail was a bit of a winding lead-up.  
> 
> #### Abstract class supertypes
> 
> It is desirable, both for the migration story and for the language in general, for inline classes to be able to extend abstract classes.  There are restrictions: no fields, no constructors, no instance initializers, no synchronized methods.  These can be checked by the compiler at compile time, and need to be re-checked by the VM at load time.

(Nitpick:  The JVM *fully* checks synchronization of such things dynamically;
it cannot fully check at load time.  Given that, it is not a good idea to partially
check for evidence of synchronization; that just creates the semblance of an
invariant where one does not exist.  The JVM tries hard to make static checks
that actually prove things, rather than just “catch user errors”.  So, please,
no JVM load-time checks for synchronized methods, except *maybe* within
the inline classes themselves.)
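
As a rough illustration of the structural test on the language side (not a JVM load-time check), here is a minimal sketch using core reflection.  The names `InlineFriendlyCheck` and `looksInlineFriendly` are made up for this note.  Reflection cannot see instance initializers or whether a constructor body is empty, so this only approximates what a compiler would check, and it assumes static synchronized methods are harmless.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper for illustration; not part of any JDK or Valhalla API.
final class InlineFriendlyCheck {
    private InlineFriendlyCheck() {}

    // Approximates "no instance fields, only a no-arg constructor,
    // no synchronized instance methods", applied up the superclass chain.
    static boolean looksInlineFriendly(Class<?> c) {
        if (c == Object.class) return true;           // end of the upward walk
        if (c.isInterface() || c.isArray() || c.isPrimitive()
                || !Modifier.isAbstract(c.getModifiers())) return false;
        for (Field f : c.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) return false;
        }
        Constructor<?>[] ctors = c.getDeclaredConstructors();
        if (ctors.length != 1 || ctors[0].getParameterCount() != 0) return false;
        for (Method m : c.getDeclaredMethods()) {
            int mods = m.getModifiers();
            if (Modifier.isSynchronized(mods) && !Modifier.isStatic(mods)) return false;
        }
        return looksInlineFriendly(c.getSuperclass());
    }
}

// Example candidates (all hypothetical).
abstract class FriendlyShape {
    abstract double area();
    @Override public String toString() { return "area=" + area(); }
}
abstract class HasField { int count; }
abstract class HasLock { synchronized void touch() { } }
```

Only `FriendlyShape` passes; `HasField` fails on the instance field and `HasLock` on the synchronized instance method.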

> The VM folks have indicated that their preferred way to say "inline-friendly abstract class" is to have only a no-arg constructor which is ACC_ABSTRACT.  For abstract classes that meet the inline-friendly requirement, the static compiler can replace the default constructor we generate now with an abstract one.  The VM would have to be able to deal with subclasses doing `invokespecial <init>` super-calls on these.  

More info, from a JVM perspective:

In that case, and that case alone, the JVM would validly look up the superclass
chain for a non-abstract <init> method, and link to that instead.  This is a very
special case of inheritance where a constructor is inherited and used as-is, rather
than wrapped by a subclass constructor.  It’s a valid operation precisely because the
abstract constructor is provably a no-op.  The Object constructor is the initial
point of this inheritance process, and the end of the upward search.  I’m leaning
towards keeping that as non-abstract, both for compatibility, and as a physical
landing place for the upward search past abstract constructors.  For inlines, we
say that the inline class constructor is required to inherit the Object constructor,
with no non-abstract constructors in intervening supers, and furthermore that
the JVM is allowed to omit the call to the Object constructor.  This amounts to
a special pleading that “everybody knows Object.<init> does nothing”.  Actually
in HotSpot it does something:  For a class with a finalizer, it registers the new
object for finalization.  But that’s precisely irrelevant to inlines.
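
To make the upward search concrete, here is a toy model of it in plain Java.  `ClassModel` and `resolveSuperInit` are invented for this sketch and do not correspond to any real JVM data structure; the point is only that linking skips abstract constructors until it lands on a non-abstract one, with Object as the final landing place.

```java
// Hypothetical model of a class for illustration; not a JVM data structure.
final class ClassModel {
    final String name;
    final ClassModel superclass;   // null only for the Object stand-in
    final boolean abstractInit;    // true if the no-arg <init> is ACC_ABSTRACT

    ClassModel(String name, ClassModel superclass, boolean abstractInit) {
        this.name = name;
        this.superclass = superclass;
        this.abstractInit = abstractInit;
    }

    // Given the class named by an invokespecial <init> super-call, walk
    // upward past abstract constructors to the first non-abstract one.
    static ClassModel resolveSuperInit(ClassModel named) {
        ClassModel k = named;
        while (k.abstractInit) {
            if (k.superclass == null) {
                throw new IllegalStateException("no concrete <init> above " + named.name);
            }
            k = k.superclass;
        }
        return k;
    }
}
```

With Object modelled as non-abstract, a chain of abstract-constructor supers always resolves to Object, and a legacy super with a real constructor stops the search early.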

> 
> My current bikeshed preference for how to indicate these is to do just the test structurally, with good error messages, and back it up with annotation support similar to `@FunctionalInterface` that turns on stricter type checking and documentation support.  (The case we would worry about, which stronger declaration-site indication would help with, would be: a public accidentally-inline-friendly abstract class in one maintenance domain, extended by an inline class in another maintenance domain, and then subsequently the abstract class is modified to, say, add a field.  This could happen, but likely would not happen that often; we can warn users of the risks by additionally issuing a warning on the subclass when the superclass is not marked with the annotation.)

That seems OK, even under restrictions about the effects of annotations.
Annotations which cause the compiler to exit with an error don’t change
the runtime semantics.

And then the translation strategy can say: “I’ve got a new trick up my sleeve!
If the constructor is truly empty, with just a delegating call to my super <init>,
then I can express this condition as an abstract constructor, rather than some
classfile boilerplate.”  As a JVM person, I’m always itchy when somebody pours
boilerplate into classfiles.  Maybe I need to write a “boilerplate considered
harmful” manifesto about classfiles and translation strategies.

> #### Val and ref projections
> 
> …

(Yay!)

> #### Translation -- classfiles 
> 
> A val-default inline class `C` is translated to two classfiles, `C` (val projection) and `C$ref` (ref projection).  A ref-default inline class `D` is translated to two classfiles, `D` (ref projection) and `D$val` (val projection), as follows: 
> 
>  - The two classfiles are members of the same nest.  
>  - The ref projection is a sealed abstract class that permits only the val projection.  
>  - Instance fields are lifted onto the val projection.  
>  - Supertypes, methods (including static methods, and including the static "constructor" factory), and static fields are lifted onto the ref projection.  Method bodies may internally require downcasting to `C.val` to access fields.  

This is a little like MVT, in that inline classes end up containing very little
other than fields.  This is the right move, IMO, for migrated classes.

Hollowing out *all* inline classes strikes me as over-rotation for the sake
of migration.  I see how it allows both cases to have the same translation
strategy, *except for the name*.  That’s a pleasing property on paper.
Maybe I can get used to it, but I’m uncomfortable with loading
everything into the ref class even in the val-default case.  I’d prefer
(if consistency were not an issue) to make the ref class be completely
empty (except for an abstract constructor), just like a marker interface,
for the common case of a val-default.
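
For concreteness, the quoted translation (everything except instance fields on the ref projection) looks roughly like this when written by hand in today's Java.  `PointRef` and `PointVal` are stand-ins for `C$ref` and `C`; they are ordinary identity classes, and sealing and nestmate attributes are left out so the sketch compiles on older JDKs.

```java
// Stand-in for the ref projection: abstract, no instance fields,
// carries the methods and the static "constructor" factory.
abstract class PointRef {
    static PointRef of(int x, int y) {
        return new PointVal(x, y);
    }

    // Method bodies downcast to the val projection to reach the fields.
    int x() { return ((PointVal) this).x; }
    int y() { return ((PointVal) this).y; }

    int sum() {
        PointVal v = (PointVal) this;
        return v.x + v.y;
    }
}

// Stand-in for the val projection: final, holds only the instance fields.
final class PointVal extends PointRef {
    final int x;
    final int y;

    PointVal(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```

The flipped layout I would prefer for val-default classes would move `x()`, `y()` and `sum()` down into `PointVal` and leave `PointRef` empty.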

In the case of reflection, I think we can afford to show a consistent view
for both kinds of inlines, by making all fields and methods appear on
both projections.  In other words, core reflection doesn’t require you
to hunt around through both projections to find some API point; all
API points are present on both projections.  Does anybody see a downside
to that?  If we put API points just where they appear in the classfile,
then people have to hunt around, which is bad, since it’s a translation
strategy option which might conceivably change.  If we put API points
only on the val projection, legacy code will fail for migrated classes.
If we put API points only on the ref projection, then users of val-default
classes will be always fumbling around to fetch the ref projection when
they reflect API points.  So reflecting everything in both places looks
OK to me.  If we support non-sealed abstract supers of inlines (records!)
then the hack on core reflection should copy the API points from the
inline *only* if the super is sealed uniquely to the inline.

> #### Translation -- uses 
> 
> Variables of type `C.ref` are translated as L types (`LC` or `LC$ref`, depending); variables of type `C.val` are translated as Q types (`QC` or `QC$val`, depending.)

The Q-descriptor gives a necessary and sufficient signal to the JVM to load
the inline class and determine its layout.  The JVM is free to reject `QR;`
where R fails to be an inline class, and the JVM is free to treat `LV;`, where
V is an inline class, as an ill-defined descriptor, like `L__noSuchClass;`.
(Note that the JVM does not *reject* ill-defined descriptors; it’s physically
impossible, except in special cases like the resolution of C_MethodType.
Resolving a C_MT of ()LV; should fail, though, if V is an inline.  It was
the compiler’s responsibility to say ()QV; in such a case.)
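
The difference between carrying a descriptor around and resolving it can be seen with `java.lang.invoke.MethodType` on a stock JDK.  This is only an analogy for C_MethodType resolution: a stock JDK knows nothing about Q-descriptors, and `DescriptorDemo` is a name invented for this sketch.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class; uses only the standard MethodType API.
final class DescriptorDemo {
    private DescriptorDemo() {}

    // Builds the descriptor string for a no-arg method returning `ret`.
    static String describe(Class<?> ret) {
        return MethodType.methodType(ret).toMethodDescriptorString();
    }

    // Eagerly resolves every type named in the descriptor, much as
    // resolving a method-type constant would.
    static boolean resolves(String descriptor) {
        try {
            MethodType.fromMethodDescriptorString(
                    descriptor, DescriptorDemo.class.getClassLoader());
            return true;
        } catch (TypeNotPresentException | IllegalArgumentException e) {
            // TypeNotPresentException: a named class could not be found.
            // IllegalArgumentException: the string is not a well-formed descriptor.
            return false;
        }
    }
}
```

A string such as `()L__noSuchClass;` is perfectly fine as a string; it only fails when something tries to resolve it.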

> `C.val` is widened to `C.ref` by direct assignment, since in the VM, an inline class is related to its supertypes by subtyping.  `C.ref` is narrowed to `C.val` by casting, which the VM can optimize to a null check.  

+1

> Instance field accesses on `C.val` are translated to `getfield`; field accesses on `C.ref` are translated by casting to `C.val` and `getfield`.  

+1

Construction requests on either type are translated to calls to a factory
C.val::<init>.

> Method invocation on `C.val` or `C.ref` can be translated directly, except for private methods, which would require casting `C.val` to `C.ref` first (not because they are inaccessible, but because they are not inherited.)  Same for static fields. 

+0.5; I see this is a place where consistency pays off, just a little, but I’m
still annoyed that the ref class gets all the members except fields and
constructors.  If we flip the other way, then it’s like this:

Method invocation on `C.val` or `C.ref` can be translated directly, except for private methods, which may require casting first *to the class holding the method* (not because they are inaccessible, but because they are not inherited.)  Same for static fields.

By saying “the class holding the method” instead of “the C.ref”, we preserve
the immediate goal of supporting migration of Optional etc., but we incur
some migration debt, because it’s harder to move from val-default to ref-default.
This, I think, is best fixed by adding auto-bridging of some sort later, rather
than over-rotating towards the migration case right now.

(Did I miss some other reason for putting everything on C.ref?)
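
The "not inherited, though accessible" point about private methods already shows up in today's Java with nested classes.  In this small sketch (all names hypothetical), the private method lives on the ref-like holder, and a call through the subclass type only compiles after a cast to the class holding the method.

```java
final class PrivateDemo {
    private PrivateDemo() {}

    // Plays the role of the class holding the method.
    static abstract class HolderRef {
        private int secret() { return 42; }
    }

    // Plays the role of the other projection; it does not inherit secret().
    static final class HolderVal extends HolderRef { }

    static int call(HolderVal v) {
        // return v.secret();  // does not compile: private members are not inherited
        return ((HolderRef) v).secret();  // accessible within the nest once the static type is the holder
    }
}
```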

> Conversion of `C.ref` to supertypes is ordinary subtyping; conversion of `C.val` goes through widening to `C.ref`.  Similarly, `instanceof` on an operand of type `C.val` goes through casting to `C.ref`.

Casting (actually, unboxing) conversion of C.ref to C.val is a regular checkcast.
Conversion (via cast or anything else) of C.val to C.ref is a no-op.

Instanceof never needs a checkcast, because the JVM treats the operand of
instanceof as an untyped reference; there’s nothing new here for instanceof.
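
Here is how those conversions look if approximated in today's Java, with `BoxRef` and `BoxVal` as made-up stand-ins for the two projections.  The explicit null check models the part of narrowing that an ordinary Java cast does not do; in the real design the Q-typed checkcast would supply it.

```java
import java.util.Objects;

// Stand-ins for the ref and val projections (ordinary classes here).
abstract class BoxRef { }

final class BoxVal extends BoxRef {
    final int payload;
    BoxVal(int payload) { this.payload = payload; }
}

final class Conversions {
    private Conversions() {}

    // val -> ref: plain subtyping, no bytecode needed beyond the assignment.
    static BoxRef widen(BoxVal v) {
        return v;
    }

    // ref -> val: a checkcast that also excludes null.
    static BoxVal narrow(BoxRef r) {
        return (BoxVal) Objects.requireNonNull(r, "null has no val projection");
    }

    // instanceof takes any reference as-is; no cast is needed first.
    static boolean isBox(Object o) {
        return o instanceof BoxRef;
    }
}
```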

> There are other stackings, of course, but this is a starting point, chosen for simplicity and compatibility.

I like it, very very much, with the one reservation harped on above.

— John