Collapsing the requirements

Brian Goetz brian.goetz at oracle.com
Tue Aug 6 16:50:26 UTC 2019


> So, legal signatures will be:
>  - QV;
>  - LI;
> and that’s it, right?
> 
> Q will continue to have its current semantics (flattenable, non-nullable, triggers pre/eager-loading).
> L will continue to have its legacy semantics (indirection, nullable, no new loading rules).

Correct.  Nice and simple!  

> 
>> 
>> Note that the VM can optimize eclairs about as well as it could for LV; it knows that I is the adjunction of null to V, so that all non-null values of I are identity free and must be of type V.
> 
> Optimizing I might require some knowledge about V, but because V <: I, I could be loaded while V is not loaded yet.

If the rule is “always preload Q” (which I think is what John is suggesting), then this case cannot come up, because I’s class file will mention QV.  Similarly, the opposite case does not happen either, as we load super types first, so loading V will trigger loading I.  

Of course, we can twiddle these rules and get different answers, but this is my understanding based on the rules I have heard for load order.

> 
>> 
>> What we lose relative to V? is access to fields; it was possible to do `getfield` on a LV, but not on I.  If this is important (and maybe it’s not), we can handle this in other ways.
> 
> This is related to an open question that shows up in many places in this document.
> What should be the nature of V’s super type? An interface or an abstract class?
> If it is an abstract class, it could declare and access the fields.
> The question extends beyond fields: what about method bodies?
> Should they be in V or in I? This has an impact on the type of ’this’ in these
> methods, even if this model has the nice property that ’this’ will always point
> to an instance of V (as long as the JVM protects the model, and prevents external
> forces (JVMTI, Unsafe, etc.) from breaking the special and unique relationship
> between I and V). And the type of ’this’ will also impact the way methods are
> invoked (invokevirtual vs invokeinterface).

There’s a longer discussion to be had about bringing abstract classes and interfaces closer together, or allowing abstract class super types of values, and if so, how.  I have some vague ideas of how the VM and language could handle this combination; rather than dive into that now, I’ll just say that here are the places where the concept of inline-extends-abstract-class has come up:

 - Migrating VBC to inline classes
 - Inline records (as there is an abstract Record super type)
 - Whether ValObject is an interface or an abstract class

Which is to say, we should untangle this knot, which I think is pretty closely related to the RefObject/ValObject knot, so I would think it is best to untangle them together.

> 
>> 
>> #### With sugar on top, please
>> 
>> We can provide syntax sugar (please, let’s not bike shed it now) so that an inline class _automatically_ acquires a corresponding interface (if one is not explicitly provided), onto which the public members (and type variables, and other super types) of C are lifted.
> 
> Does the interface only declare public methods, or does it also provide the implementation (default method)?

If we extract an interface from the class mechanically, we would lift the public methods, the super types, and the type variables to the interface.  If the user writes the interface by hand, they will do what they’re going to do.
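
As a rough illustration of the mechanical extraction described above, here is a hand-written approximation of what a compiler might produce. Everything in it is an assumption for the sake of exposition: the names `Pair` and `PairBox` are hypothetical, an ordinary final class stands in for an inline class, the interface is top-level rather than nested, and the lifted methods are shown as abstract declarations only.

```java
import java.io.Serializable;

// Hypothetical lifted interface: it carries the class's type variables (A, B),
// its other super type (Serializable), and its public methods.
interface PairBox<A, B> extends Serializable {
    A first();
    B second();
}

// Hypothetical stand-in for the inline class; an ordinary final class is used
// because this sketch targets today's Java.
final class Pair<A, B> implements PairBox<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    @Override public A first() { return first; }
    @Override public B second() { return second; }

    // Non-public member: under the rule above it would not be lifted.
    String debugString() { return "(" + first + ", " + second + ")"; }
}
```

Clients that only need the public surface can then be written against `PairBox<A, B>`, while code that needs the concrete type keeps using `Pair<A, B>`.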

> 
>> For sake of exposition, let’s say this is called `C.Box` — and is a legitimate inner class of C (which can be generated by the compiler as an ordinary classfile.)  
> 
> Is it a new feature? Or just an idea how it could be implemented in the future?
> 
> Because I’ve tried to compile this:
> 
> public class C implements C.Box {
>    static public interface Box {
> 
>    }
> }

Yes, we would have to address this.  The cycle here is not a real cycle, in that Box does not depend on C for anything, except it happens to live there.  

>> #### Boxing conversion
>> 
>> Given the constraints of the eclair relationship, it would be reasonable for the compiler to derive from this that there is a boxing conversion between C and I (I is just the value set of C, plus null — which is the relationship boxes have with their corresponding primitives.)  The boxing operation is a no-op (since C <: I) and the unboxing operation is a null checking cast.
> 
> Could we assume that boxing/unboxing would be handled by the static compiler (like primitive boxing today),
> and there’s no expectation that the JVM will do magic boxing when needed? (Not considering auto-bridges yet).

Yes.  In fact, we only need this in one direction.  Since C <: I, the conversion C -> I comes for free (subtyping); only the conversion I -> C would require an unboxing conversion.  The compiler would introduce the necessary casts (which the VM can optimize to null checks.)
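
A minimal sketch of the two directions, written in today's Java with an ordinary final class standing in for the inline class. The names `Point`, `PointBox`, and `BoxingSketch` are hypothetical, and the explicit `unbox` helper only spells out the cast that the compiler would presumably insert on its own.

```java
import java.util.Objects;

// Hypothetical stand-in for the interface half of the eclair (the "I" above).
interface PointBox {
    int x();
    int y();
}

// Hypothetical stand-in for the inline class "C"; an ordinary final class is
// used here because today's Java has no inline classes.
final class Point implements PointBox {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override public int x() { return x; }
    @Override public int y() { return y; }
}

final class BoxingSketch {
    // C -> I: plain subtyping, so no conversion code is needed.
    static PointBox box(Point p) {
        return p;
    }

    // I -> C: a null-checking cast. Null is the only value of I that is not a C.
    static Point unbox(PointBox b) {
        return (Point) Objects.requireNonNull(b, "cannot unbox null");
    }
}
```

Boxing returns the same object it was given, and unboxing fails with a `NullPointerException` only for `null`.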

>> 
>> The world is indeed full of existing utterances of `LOptional`, and they will still want to work.  Fortunately, Optional follows the rules for being a value-based class.  We start with migrating Optional from a reference class to an eclair with a public abstract class and a private value implementation.  Now, existing code just works (source and binary) — and optionals are values.  But, this isn’t good enough; existing variables of type Optional are not flattened.
> 
> Notable difference with previous statements: here the eclair is made of an inline class and an abstract class
> (instead of an inline class and an interface). I assume this is for backward compatibility (Optional’s methods
> are currently invoked using invokevirtual and not invokeinterface).

Correct.  There are multiple ways to handle this.  One is to allow eclairs with abstract classes; another is to blur the distinction between abstract class and interface so that we can make Optional an interface and still support the invokevirtual call sites in the wild.  I think I prefer the former, but once we start to untangle the ValObject/RefObject knot, I suspect we’ll know more.
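
To make the first option concrete, here is a miniature of the "public abstract class plus private implementation" shape, in today's Java. It is only a sketch under assumptions: `Maybe` and `Val` are hypothetical names, this is not the real `java.util.Optional`, and the private nested class is an ordinary final class standing in for the inline implementation.

```java
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical miniature of an Optional-like eclair: clients see only the
// abstract class, so existing call sites keep using invokevirtual on it.
abstract class Maybe<T> {
    // Private constructor: only the nested implementation can extend Maybe.
    private Maybe() {}

    public static <T> Maybe<T> of(T value) {
        return new Val<>(Objects.requireNonNull(value));
    }

    public static <T> Maybe<T> empty() {
        return new Val<>(null);
    }

    public abstract boolean isPresent();

    public abstract T get();

    // Stand-in for the private value implementation.
    private static final class Val<T> extends Maybe<T> {
        private final T value;

        private Val(T value) {
            this.value = value;
        }

        @Override public boolean isPresent() {
            return value != null;
        }

        @Override public T get() {
            if (value == null) {
                throw new NoSuchElementException("No value present");
            }
            return value;
        }
    }
}
```

Callers such as `Maybe.of("x").get()` compile against the abstract class alone, which is the source and binary compatibility property the migration relies on.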

> Having V’s super type be an abstract class, some additional issues have to be considered.
> If both V and V’s super are classes (abstract or not), they both can declare fields, so they
> could end up having different layouts. Even if javac checks against that, manually crafted
> class files and instrumentation frameworks injecting fields (with redefineClass) could create
> situations where a mismatch exists between V and V’s super.
> Would this cause issues? Should the JVM guard against that? To be investigated.

It would have to be worked out.  I think John said something like “let the language guard against inline value classes extending inappropriate abstract classes, and if the VM sees an inline class extend an abstract class, ignore the fields and the ctors.”  This is probably a reasonable first-order approximation if we decide to go this route.

> 
>> There are a few ways to get there.  One is to treat this problem as protecting such classes from uninitialized fields or array elements; another is to ensure that such classes (a) have no public fields and (b) perform the correct check at the top of each method (which can be injected by the compiler.)  I don’t want to solve that problem right here, but I think there are enough ways to get there that we can assume this isn’t a hard requirement.
> 
> Would (b) be applied to non-static inner inline classes, or are they definitively considered as a lost cause?
> Currently they can throw an NPE, which is not so bad after all.

Depends on how early we can guarantee that NPE.  If the class might do a bunch of side effects before hitting the dereference of the outer pointer, then we might leave things in an inconsistent state.  If we can fail faster, that is good.  This area definitely needs investigation.
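
The fail-fast idea behind option (b) can be sketched in today's Java as follows. This is an illustration under assumptions, not a proposed design: `Label` is a hypothetical class, `zeroValue()` merely simulates the all-zero value an inline class would have in a fresh field or array element, and the choice of `IllegalStateException` is arbitrary.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical class whose all-zero state (text == null) is not a valid value.
final class Label {
    private final String text; // null only in the simulated all-zero value

    private Label(String text) {
        this.text = text;
    }

    static Label of(String text) {
        return new Label(Objects.requireNonNull(text));
    }

    // Simulates the default value of an uninitialized field or array element.
    static Label zeroValue() {
        return new Label(null);
    }

    // The check a compiler could inject at the top of each method.
    private void checkInitialized() {
        if (text == null) {
            throw new IllegalStateException("uninitialized Label");
        }
    }

    void appendTo(List<String> sink) {
        checkInitialized(); // fails before any side effect happens
        sink.add("begin");
        sink.add(text);
    }
}
```

Because the check runs first, calling `appendTo` on the zero value throws without leaving a half-written `sink` behind, which is the inconsistency a late failure would risk.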

> The model looks promising, but a more precise specification of eclairs would be helpful
> to estimate the impact on the JVM:
> 
>  - What is the nature of V’s super?
>  - How are fields/methods declared/implemented between V and V’s super?
>  - Are there any special requirements regarding static members between V and V’s super?
>  - Is there a requirement that V and V’s super share the bodies of their non-static public methods?

Good questions, I hope to have answers eventually.  If you have preferred answers, please share your thinking!



