[foreign-memaccess+abi] RFR: 8255903: Enable multi-register return values for native invokers [v3]

Wed Oct 27 11:01:34 UTC 2021

On Fri, 22 Oct 2021 13:50:47 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:

>> Hi,
>> 
>> This patch implements support for multi-register returns in native invokers, and removes the buffered invocation strategy for downcalls. This is achieved, essentially, by using an in memory return: the caller allocates a bit of memory, and the native invoker stub writes the values of the return register to that memory. Then, the post processing code reads the register values back from there.
>> 
>> Currently, the target address of a downcall is handled separately from the other arguments. I initially implemented passing the IMR address the same way, but I realized the removal of the buffered invocation strategy affords us a better way of doing things: we can just make the target address and the IMR address part of the normal calling sequence, and remove a bunch of special-casing code to handle these. We now just use a binding recipe to unbox these arguments and shuffle them into registers, and the ABIDescriptor tells the native invoker stub which registers to pick up these arguments from.
>> 
>> Thanks,
>> Jorn
>
> Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixup all platform stubs

The changes look good and I like the various simplification. The MH adaptation logic went up in complexity, but this is transient since we're looking into replace combinators with bytecode generation anyway. Overall a good change.

When building and testing the patch I noted one test failure:

TestAarch64CallArranger

It seems varargs related, so I wonder if this patch undoes the fix for 8275332.

src/jdk.incubator.foreign/share/classes/jdk/incubator/foreign/FunctionDescriptor.java line 126:

> 124:      * @return the new function descriptor.
> 125:      */
> 126:     public FunctionDescriptor withAppendedArgumentLayouts(MemoryLayout... addedLayouts) {

I think these should just be called appendArgumentLayouts/insertArgumentLayouts - or, appendArguments/insertArguments - no need for all this verbosity (but not related to your patch).

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/CallingSequenceBuilder.java line 91:

> 89:     }
> 90: 
> 91:     private boolean isImr() {

I have to say I'm a bit confused by this IMR term. As I'm sure you are aware, we use this term also in the call arrangers, to denote real ABI in-memory-return (e.g. when a struct is too big to be returned in register). The meaning of IMR here is different - and it essentially means: returned in more than one register. I don't think we will be able to merge the two IMR concepts, given one is an ABI concept (e.g. the ABI requires an extra pointer to be passed) and the other is a CLinker specific concept (e.g. an extra segment we pass to store multi-register returns). Because of that, I think we probably need a different name here, to avoid confusion (e.g. `needsReturnBuffer` etc.)

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/CallingSequenceBuilder.java line 134:

> 132:     private long computeImrSize() {
> 133:         return outputBindings.stream()
> 134:                 .filter(Binding.Move.class ::isInstance)

extra space here before `::`

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/ProgrammableInvoker.java line 217:

> 215:             for (int j = bindings.size() - 1; j >= 0; j--) {
> 216:                 Binding binding = bindings.get(j);
> 217:                 if (callingSequence.isImr() && binding.tag() == Binding.Tag.VM_LOAD) {

The logic is convoluted (not the fault of this patch), but I believe after reading the code many times, that I'm relatively convinced that it does the same thing as the interpreted path :-)

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/ProgrammableInvoker.java line 232:

> 230:                     // to (... MemorySegment, MemorySegment, <primitive>, ...)
> 231:                     // from (... MemorySegment, MemorySegment, ...)
> 232:                     retInsertPos -= 2; // set insert pos back the the first MS (later DUP binding will merge the 2 MS)

typo "the the"

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/ProgrammableInvoker.java line 277:

> 275:                 : Binding.Context.DUMMY;
> 276:         try (unboxContext) {
> 277:             MemorySegment imrSegment = null;

The logic in this method seems fine. I have a question: since IMR is only used when we need to return a segment, and since in these cases the client has to provide a SegmentAllocator, why can't we use the client allocator to create the IMR (and then tell the VM where to store the register values into the user-allocated segment) ? The code seems to allocate a temp segment on the linker scope, copy the multi-reg return in there, then, copy again the data from the temp segment back to the user segment. Which seems odd given that the user is already giving us a segment where to store the results?

src/jdk.incubator.foreign/share/classes/jdk/internal/foreign/abi/SharedUtils.java line 540:

> 538:     }
> 539: 
> 540:     public static MethodHandle maybeInsertAllocator(MethodHandle handle) {

Thanks for adding this and avoid repetition across linker impls

-------------

PR: https://git.openjdk.java.net/panama-foreign/pull/603