methods with scalarized arguments

Tue May 22 16:03:31 UTC 2018

> Suppose An are argument registers.  We can neglect FP and vector
> regs for now.  For some n, An is in a special stack location, not really
> a register, but that doesn't change the logic of what I'm talking about.
> Then the buffered calling sequence would probably be:
>
> m(A0=v1, A1=v2, A2=v3, A3=v4, A4=v5)
>
> The scalarized calling sequence could be:
>
> m(A0=v1.f1, A1=v1.f2, A2=v1.f3, A3=v1.f4, 
>  A4=v2.f1, A5=v2.f2, A6=v2.f3, A7=v2.f4, 
>  A8=v3.f1, A9=v3.f2, A10=v3.f3, A11=v3.f4, 
>  A12=v4.f1, A13=v4.f2, A14=v4.f3, A15=v4.f4, 
>  A16=v5.f1, A17=v5.f2, A18=v5.f3, A19=v5.f4)
>
> Clearly many of those An will be in the stack.
> Also, it is clear that there is a need for more stack
> here than for the previous calling sequence.
> Is this close to what you are describing?

Yes.
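
To put numbers on the extra stack need, here is a minimal sketch in
Java. It assumes five value arguments with four fields each, as in the
example above, and a hypothetical count of six integer argument
registers (the real number depends on the platform ABI). ArgSlots and
its methods are illustrative names, not HotSpot code.

```java
// Illustrative only: counts argument slots for the two calling sequences.
final class ArgSlots {
    // Buffered: one reference per value argument.
    static int bufferedSlots(int values) {
        return values;
    }

    // Scalarized: one slot per field of each value argument.
    static int scalarizedSlots(int values, int fieldsPerValue) {
        return values * fieldsPerValue;
    }

    // Slots that do not fit in argument registers spill to the stack.
    static int stackSlots(int slots, int argRegisters) {
        return Math.max(0, slots - argRegisters);
    }

    public static void main(String[] args) {
        int regs = 6; // assumption: six integer argument registers
        int buffered = bufferedSlots(5);
        int scalarized = scalarizedSlots(5, 4);
        System.out.println("buffered:   " + buffered + " slots, "
                + stackSlots(buffered, regs) + " on stack");
        System.out.println("scalarized: " + scalarized + " slots, "
                + stackSlots(scalarized, regs) + " on stack");
    }
}
```

Under these assumptions the buffered call needs 5 slots, none of them
on the stack, while the scalarized call needs 20 slots, 14 of them on
the stack.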

> I think it would be possible (not necessarily desirable—just
> brainstorming here) for compiled-code callers which pass buffered
> value types to *also* allocate enough outgoing argument space in
> their stack frame to allow the caller to de-buffer everything.
> That would give us frameless adapters, wouldn't it?
>
> There would have to be some bookkeeping to remember which
> items are value types and which aren't, and calling sequences
> couldn't be invalidated by suddenly loading new value types
> that were (up until now) just unknown types.  But that's not
> a practical problem in the JIT, I think.  Value types are loaded
> and known, mostly, by the time the JIT sets up calls.  There are
> corner cases where nothing is known; in those cases there
> should be a slower handshake of some sort which prevents
> reformatting of arguments.  Idea:  Just like the Linux ABI passes
> a vector count in rax (low byte), we could contrive to pass an
> indication of how prepared the caller is for the callee to unpack
> the arguments.   We would only want to do that for calls which
> are potentially problematic, not all calls, unless the indication
> could be smuggled into the code stream of the caller.  (SPARC
> V8 ABI does the code stream trick also, for struct returns, but
> it's ugly.)

Isn't the problem here, again, that in an LF (LambdaForm), at a method
handle linker call, the argument types are Object but can carry any
value type (already loaded or not)? Or would we have to limit how many
fields a scalarized value argument can have, so that there is an upper
bound on the space a caller must reserve?
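
A rough sketch of what such an upper bound would imply, in Java. The
cap of four fields, the "caller prepared" flag, and all names here are
hypothetical; this only models the bookkeeping, not an actual calling
convention.

```java
// Illustrative model of a field-count cap plus a "caller is prepared" flag.
final class UnpackHandshake {
    // Hypothetical cap on the number of fields of a scalarized argument.
    static final int MAX_FIELDS = 4;

    // Outgoing space a caller would reserve if every Object-typed
    // argument might turn out to be a value type.
    static int reservedSlots(int objectArgs) {
        return objectArgs * MAX_FIELDS;
    }

    // Callee side: unpack in the caller's frame only if the caller said it
    // reserved space and the value actually fits under the cap.
    static boolean canUnpackInPlace(boolean callerPrepared, int fieldCount) {
        return callerPrepared && fieldCount <= MAX_FIELDS;
    }

    public static void main(String[] args) {
        System.out.println("reserved slots for 5 Object args: " + reservedSlots(5));
        System.out.println("prepared, 4 fields: " + canUnpackInPlace(true, 4));
        System.out.println("prepared, 5 fields: " + canUnpackInPlace(true, 5));
        System.out.println("unprepared, 2 fields: " + canUnpackInPlace(false, 2));
    }
}
```

The cost is visible in reservedSlots: every Object-typed argument pays
for the worst case, whether or not it ever carries a value.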

>> Is the buffered entry point apparent in C2 IR or
>> is it a custom generated blob of assembly?
>
> IR, I suppose.  There's already C2 code for converting between buffered
> and scalarized views of values.

Wouldn't we have two Start nodes in the C2 IR then? I wonder how
disruptive that would be: I suppose it's a common assumption in C2 that
there's only one Start node and that it dominates everything else in
the method.

Another tricky corner case for the calling convention is an interface
call. We call method m() on an interface with a value as the receiver,
but at the call site we can't tell it's a value, so we pass it
buffered. On the callee side, m() is a method of a value type and as
such expects its receiver scalarized. So here again there's a mismatch.
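
The mismatch can be modelled in plain Java. This is only an analogy:
Shape, Point and mScalarized are made-up names, an ordinary final class
stands in for a value type, and a static method taking the fields
stands in for the scalarized entry point.

```java
// Illustrative analogy for the buffered vs. scalarized receiver mismatch.
interface Shape {
    int m();
}

// Ordinary class standing in for a value type with two fields.
final class Point implements Shape {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Stand-in for the scalarized entry: the receiver arrives as its fields.
    static int mScalarized(int x, int y) {
        return x + y;
    }

    // Stand-in for the buffered entry: the receiver arrives as a reference,
    // so it has to be unpacked before reaching the scalarized body.
    @Override
    public int m() {
        return mScalarized(this.x, this.y);
    }
}

final class InterfaceCallDemo {
    // The call site only sees Shape, so it can only pass a reference.
    static int callThroughInterface(Shape s) {
        return s.m();
    }

    public static void main(String[] args) {
        System.out.println(callThroughInterface(new Point(3, 4))); // prints 7
    }
}
```

In the sketch the unpacking step lives in m(); in compiled code that
work would fall to whatever adapter or extra entry point sits between
the interface call site and the scalarized method body.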

Roland.