Limitations of the Calling Convention Optimization

Wed Oct 21 17:40:57 UTC 2020

On 21/10/2020 18:14, Brian Goetz wrote:
> What I am hoping to get out of this discussion is a better 
> understanding of what API design guidelines we have to give to 
> developers.  The standard advice is "use interfaces in external APIs", 
> and there are very good reasons for this. Interfaces make no 
> commitment as to implementation, allowing implementations to evolve.  
> Maybe there's one implementation now and multiple later; maybe there 
> are multiple now and one later; interfaces give us this freedom (and 
> sealing further gives us the freedom to control this destiny.)
>
> If we have to tell people "if you ever might want to use a primitive 
> class to implement X, you must do that up-front, burn that into your 
> APIs, and permanently give up on ever having multiple 
> implementations", this is a different story, and obviously more limiting.
>
> I think Maurizio had been (reasonably) assuming that, if there was 
> just one inline implementation, we would be able to get some 
> reasonable baseline of performance expectations in the sealed 
> interface + primitive class implementation world, allowing us to 
> continue with the API design principles that have stood us well for 
> decades.  But it sounds like we're not sure about that now.

Yes - here's some context to what I'm doing. There are some interfaces 
in Panama, such as MemorySegment and MemoryAddress which "scream" to be 
implemented by an inline class:

https://docs.oracle.com/en/java/javase/15/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/MemoryAddress.html

https://docs.oracle.com/en/java/javase/15/docs/api/jdk.incubator.foreign/jdk/incubator/foreign/MemorySegment.html

My mental model all along has been that, if we design things carefully, 
we can switch from a world where these interfaces are implemented by a 
regular class to a world where these interfaces are implemented by a 
(single) inline class. Note that the javadoc already states that these 
interfaces are value-based, and that in the future they will become 
sealed - that is, no 3rd party implementation of them is possible (since 
their implementations are so tied to the Panama runtime).

Unfortunately, the problematic idioms referred to in this thread show 
up quite a bit throughout the API; the methods listed below are only 
some of those which return new instances of these classes.

For instance, in MemoryAddress we find the following methods:

// returns _new_ address with given offset from existing one
MemoryAddress addOffset(long offset);
// create _new_ address with given long value
static MemoryAddress ofLong(long value);

And, in MemorySegment, we have many more examples, for instance:

// create _new_ segment with given access modes
MemorySegment withAccessModes(int modes);
// create _new_ segment with given offset and size
MemorySegment asSlice(long offset, long size);
// create _new_ segment from byte heap array
static MemorySegment ofArray(byte[]);
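
A minimal sketch of the idiom in question, to make the shape of the 
problem concrete. The names Addr, AddrImpl and AddrDemo are hypothetical 
stand-ins, not the Panama types, and the sketch uses a plain final class 
because `sealed` and inline/primitive classes are not assumed to be 
available. The point is only that every method returns the interface 
type, so a caller sees a nullable reference to an implementation it 
cannot name:

```java
// Hypothetical stand-ins for illustration; these are NOT the Panama types.
final class AddrDemo {
    public static void main(String[] args) {
        Addr base = Addr.ofLong(0x1000L);
        Addr next = base.addOffset(16);   // a new instance; base is unchanged
        System.out.println(base.value()); // prints 4096
        System.out.println(next.value()); // prints 4112
    }
}

// The public API type: callers only ever see this interface.
interface Addr {
    long value();

    // Returns a _new_ address at the given offset from this one.
    Addr addOffset(long offset);

    // Creates a _new_ address from a raw long value.
    static Addr ofLong(long value) {
        return new AddrImpl(value);
    }
}

// The single implementation; the candidate for becoming an inline class.
final class AddrImpl implements Addr {
    private final long value;

    AddrImpl(long value) {
        this.value = value;
    }

    @Override
    public long value() {
        return value;
    }

    @Override
    public Addr addOffset(long offset) {
        return new AddrImpl(value + offset);
    }
}
```

Because the return type of addOffset and ofLong is Addr rather than 
AddrImpl, the method descriptors mention only the interface, which is 
the situation discussed in the quoted messages below.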

The limitations described in this thread mean that, as an API 
developer, the only way for me to guarantee a predictable performance 
model is to replace the MemoryAddress/MemorySegment interfaces with 
inline classes. Which I can do, no problem (after all the API is still 
incubating). But I'm worried, as Brian says clearly above, as to where 
this leaves API developers in general, where, sometimes, binary 
compatibility might put additional constraints on which refactorings 
are possible. In other words, having an interface implemented by an 
inline/primitive class is a powerful concept on paper - as it allows 
_existing_ APIs to take advantage of (some of?) the performance 
optimizations promised by Valhalla. So, while I understand why things 
work the way they do (in hindsight the nullability problem is rather 
obvious), this affair leaves me a bit sad :-)

Cheers
Maurizio

>
> To be clear, this is bigger than calling-convention optimizations; 
> it's about our goals for what language we want to build; it seems like 
> we have been assuming some things that are not quite valid, and it's 
> important to understand the implications and whether the requirements 
> need to be adjusted.
>
>
> On 10/21/2020 12:34 PM, John Rose wrote:
>> On Oct 21, 2020, at 5:02 AM, Tobias Hartmann 
>> <tobias.hartmann at oracle.com> wrote:
>>> Also, the VM would need to ensure that the argument/return type is 
>>> eagerly loaded when the adapters
>>> are created at method link time (how do we even know eager loading 
>>> is required for these L-types?).
>> This is the hard kernel of the problem.  The L in LFoo; means
>> “Lazy Loading” (among other things); you simply cannot assume
>> that Foo.class will be loaded when you are setting up layouts and
>> calling sequences for an LFoo;.  To get that information (and more)
>> we will always need a signal somewhere in the call site which says,
>> “do the preload” (which is what I call the “go and look” signal).
>> The Q in QFooVal; performs this duty.
>>
>> That said, we might have wiggle room with sealed interfaces.
>> If there is a “go and look” signal (QFooVal;) nearby that has
>> forced the loading of an inline type FooVal which has an
>> interface type FooRef as a super, *and* if FooRef is sealed,
>> then we can go to work on FooRef calling sequences, if we
>> think it’s profitable.  (We could substitute in a null flag plus
>> a FooVal instance, and we can try to merge the null flag into
>> the “guts” of the FooVal registers, if there is slack.  Or pass
>> the FooVal on stack by reference.  Or other tricks like that.)
>>
>> But, back to the hard kernel of the problem, if there is no
>> mention of QFooVal; near the LFooRef; descriptor, all bets
>> are off; FooRef.class must be assumed to be unavailable in
>> the general case when forming calling sequences involving
>> LFooRef;.
>>
>> The laziness of LFoo; gives it opaqueness, abstraction.  All you
>> know is it’s a pointer; you can’t know (in general) what it
>> points to until Foo.class is loaded and there’s an instance
>> of Foo running around.  Before that it’s just nulls.  The
>> opaqueness of LFoo; in turn means that you don’t know
>> whether it is just a GC heap reference or something special
>> (a disguised value).  When we try to do special side-heap
>> treatment for LFoo; values that we discover are disguised
>> values, we find that it’s difficult to contain the uses of
>> those values; they are so nondescript they mix with
>> Object pointers.  This creates a kind of pollution where
>> suddenly every pointer that might be handed to the GC
>> has to be checked to see if it’s a side-heap pointer (e.g.,
>> thread local or on-stack). This possibility of pollution
>> is introduced by the abstractness of LFoo; and the
>> result of it is hard-to-control extra expenses wherever
>> an opaque pointer must be stored in the real GC heap.
>>
>> Maybe we can slim those expenses down in the future,
>> using ZGC-like colored pointers.  For now, I think the
>> good thing to do is buffer on the real GC heap, as we
>> do now.  Maybe the performance can be increased in
>> the future by using side-heap tricks, after we figure
>> out how to prevent side-heap pointers from polluting
>> the real GC heap.
>>
>> It’s an odd linkage:  the Lazy Loading LFoo;, by its
>> nature, has surprising connections to the GC.
>>
>> — John
>
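
For readers less familiar with the LFoo; notation used in John's 
message, the following sketch prints the descriptor of a method whose 
return type is an interface. FooRef here is a hypothetical stand-in 
borrowed from the message, and DescriptorDemo and descriptorOf are 
illustrative names. MethodType.toMethodDescriptorString() is a standard 
java.lang.invoke call; a standard JDK only emits L-descriptors for class 
and interface types, and the Q-descriptors mentioned above are not 
assumed to be available.

```java
import java.lang.invoke.MethodType;

// Hypothetical demo class; prints a JVM method descriptor.
final class DescriptorDemo {
    // Builds the descriptor for a method with the given return and parameter types.
    static String descriptorOf(Class<?> ret, Class<?>... params) {
        return MethodType.methodType(ret, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Shape of "FooRef addOffset(long)": one long parameter, FooRef return.
        // When compiled in the unnamed package this prints (J)LFooRef;
        System.out.println(descriptorOf(FooRef.class, long.class));
    }
}

// Hypothetical interface standing in for the FooRef of the message above.
interface FooRef {}
```

The descriptor names FooRef and nothing else: it carries no hint of 
which class implements the interface, which matches the point above 
that an LFooRef; on its own gives no signal to "go and look".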