Limitations of the Calling Convention Optimization

Wed Oct 21 17:14:04 UTC 2020

What I am hoping to get out of this discussion is a better understanding 
of what API design guidelines we have to give to developers.  The 
standard advice is "use interfaces in external APIs", and there are very 
good reasons for this. Interfaces make no commitment as to 
implementation, allowing implementations to evolve.  Maybe there's one 
implementation now and multiple later; maybe there are multiple now and 
one later; interfaces give us this freedom (and sealing further gives us 
the freedom to control this destiny.)

If we have to tell people "if you ever might want to use a primitive 
class to implement X, you must do that up-front, burn that into your 
APIs, and permanently give up on ever having multiple implementations", 
this is a different story, and obviously more limiting.

I think Maurizio had been (reasonably) assuming that, if there was just 
one inline implementation, we would be able to get some reasonable 
baseline of performance expectations in the sealed interface + primitive 
class implementation world, allowing us to continue with the API design 
principles that have served us well for decades.  But it sounds like 
we're not sure about that now.
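
For concreteness, here is a minimal sketch of the API shape in question: 
a sealed interface with exactly one implementation, where clients only 
ever name the interface.  All names are hypothetical (FooRef and FooVal 
simply echo the naming in the quoted reply below), and a record stands 
in for the primitive class, since primitive classes cannot be written in 
released Java.  It says nothing about what calling convention the VM 
would actually pick.

```java
// Hypothetical sketch: sealed interface plus a single implementation.
// A record is only a stand-in for a primitive class here.
sealed interface FooRef permits FooVal {
    int x();
    int y();

    // Clients obtain instances through the interface and never name FooVal.
    static FooRef of(int x, int y) {
        return new FooVal(x, y);
    }
}

// The one permitted implementation; records are implicitly final.
record FooVal(int x, int y) implements FooRef {}

final class SealedApiSketch {
    // The external API mentions only the interface type, so its method
    // descriptor carries LFooRef; and never mentions FooVal.
    static int sum(FooRef p) {
        return p.x() + p.y();
    }

    public static void main(String[] args) {
        System.out.println(sum(FooRef.of(3, 4)));
    }
}
```

The design question is whether a signature like sum(FooRef) can get a 
reasonable performance baseline when FooVal is the only implementation, 
or whether the API would have to name FooVal directly to get it.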

To be clear, this is bigger than calling-convention optimizations; it's 
about our goals for what language we want to build.  It seems like we 
have been assuming some things that are not quite valid, and it's 
important to understand the implications and whether the requirements 
need to be adjusted.

On 10/21/2020 12:34 PM, John Rose wrote:
> On Oct 21, 2020, at 5:02 AM, Tobias Hartmann <tobias.hartmann at oracle.com> wrote:
>> Also, the VM would need to ensure that the argument/return type is eagerly loaded when the adapters
>> are created at method link time (how do we even know eager loading is required for these L-types?).
> This is the hard kernel of the problem.  The L in LFoo; means
> “Lazy Loading” (among other things); you simply cannot assume
> that Foo.class will be loaded when you are setting up layouts and
> calling sequences for an LFoo;.  To get that information (and more)
> we will always need a signal somewhere in the call site which says,
> “do the preload” (which is what I call the “go and look” signal).
> The Q in QFooVal; performs this duty.
>
> That said, we might have wiggle room with sealed interfaces.
> If there is a “go and look” signal (QFooVal;) nearby that has
> forced the loading of an inline type FooVal which has an
> interface type FooRef as a super, *and* if FooRef is sealed,
> then we can go to work on FooRef calling sequences, if we
> think it’s profitable.  (We could substitute in a null flag plus
> a FooVal instance, and we can try to merge the null flag into
> the “guts” of the FooVal registers, if there is slack.  Or pass
> the FooVal on stack by reference.  Or other tricks like that.)
>
> But, back to the hard kernel of the problem, if there is no
> mention of QFooVal; near the LFooRef; descriptor, all bets
> are off; FooRef.class must be assumed to be unavailable in
> the general case when forming calling sequences involving
> LFooRef;.
>
> The laziness of LFoo; gives it opaqueness, abstraction.  All you
> know is it’s a pointer; you can’t know (in general) what it
> points to until Foo.class is loaded and there’s an instance
> of Foo running around.  Before that it’s just nulls.  The
> opaqueness of LFoo; in turn means that you don’t know
> whether it is just a GC heap reference or something special
> (a disguised value).  When we try to do special side-heap
> treatment for LFoo; values that we discover are disguised
> values, we find that it’s difficult to contain the uses of
> those values; they are so nondescript they mix with
> Object pointers.  This creates a kind of pollution where
> suddenly every pointer that might be handed to the GC
> has to be checked to see if it’s a side-heap pointer (e.g.,
> thread local or on-stack). This possibility of pollution
> is introduced by the abstractness of LFoo; and the
> result of it is hard-to-control extra expenses wherever
> an opaque pointer must be stored in the real GC heap.
>
> Maybe we can slim those expenses down in the future,
> using ZGC-like colored pointers.  For now, I think the
> good thing to do is buffer on the real GC heap, as we
> do now.  Maybe the performance can be increased in
> the future by using side-heap tricks, after we figure
> out how to prevent side-heap pointers from polluting
> the real GC heap.
>
> It’s an odd linkage:  the Lazy Loading LFoo;, by its
> nature, has surprising connections to the GC.
>
> — John