[foreign-memaccess] RFR 8228447: Remove kinds from value layouts

Mon Jul 22 15:50:45 UTC 2019

> Where I see it adding value is when a developer
> would like to model a custom ABI - but this is a very advanced use
> case, and I could not find a single framework able to cope with such
> generality (see LLVM discussion linked previously).

The question for me is: how many ABIs do we want to support directly?
Most native languages seem to support some form of C interop, so we 
could _just_ support C, and if people want to use other languages they 
will have to go through C first. Note that this is also what JNI does 
(it supports C++ only through C++'s ability to use the C ABI to link).

But, if we only support C, we will likely get into a situation where we 
need to do a 'double shuffle' to get all the arguments into the right 
locations, additional register spill/fill, stack re-alignment, basically 
adding an additional call frame. First going from the HotSpot Java ABI 
to C, and then from C to the target ABI (or the reverse for upcalls). 
And at that point we're back to needing glue code to call anything that
doesn't use a C ABI.

The ideal picture would seem to be some way to encode the conversion from the
Java ABI to the target ABI, preferably in a place that the JIT can see, 
and use things like register aliasing for the incoming Java arguments. 
I.e. we end up doing the minimal amount of work possible (of course 
we're still doing an opaque call into some DLL though).
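
To illustrate what I mean by doing the minimal amount of work, here is a
rough sketch (not code from the branch; the class name, the map-based
representation and the register assignments are all made up for
illustration). It folds a Java-to-C shuffle and a C-to-target shuffle into
one direct mapping and drops the moves where the value is already in the
right register:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a shuffle is modelled as a map from source
// location to destination location. This is not the real VM data structure.
class ShuffleCompositionSketch {

    // Compose two shuffles into one, dropping moves whose source and
    // final destination coincide (the register can simply be aliased).
    static Map<String, String> compose(Map<String, String> first,
                                       Map<String, String> second) {
        Map<String, String> direct = new LinkedHashMap<>();
        for (Map.Entry<String, String> move : first.entrySet()) {
            // If the second shuffle does not touch the intermediate
            // location, the value stays where the first shuffle put it.
            String finalDest = second.getOrDefault(move.getValue(), move.getValue());
            if (!move.getKey().equals(finalDest)) {
                direct.put(move.getKey(), finalDest);
            }
        }
        return direct;
    }

    public static void main(String[] args) {
        // Register names are assumed purely for illustration.
        Map<String, String> javaToC = new LinkedHashMap<>();
        javaToC.put("rsi", "rdi");
        javaToC.put("rdx", "rsi");
        Map<String, String> cToTarget = Map.of("rdi", "rcx", "rsi", "rdx");

        // The second argument ends up where it started, so only one move remains.
        System.out.println(compose(javaToC, cToTarget)); // prints {rsi=rcx}
    }
}
```

A real implementation would also have to order the remaining moves and
break cycles between registers, which this sketch ignores.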

I think that, if we want to support a decent set of different ABIs, 
we'll pretty much need internal support for 'custom' ABIs anyway. Note
that we already have such a middle layer in the case of 
UniversalNativeInvoker, where we use a combination of ArgumentBindings 
to encode a CallingSequence from which we derive a ShuffleRecipe. We 
just have to extend this mechanism to cover the other aspects of the ABI 
that are currently cased off with compiler switches in the VM code.

But, beyond this middle layer, and for usability's sake, we need a way 
of encoding a C function descriptor to then translate this, by the rules 
of the chosen C ABI, into the low-level CallingSequence representation 
which is then interpreted (or translated into a forwarding stub) inside 
the VM.
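
As a minimal sketch of that translation step, assuming only scalar
arguments and the System V x86-64 register order (the class and method
names here are hypothetical and do not match the actual ArgumentBinding or
CallingSequence classes), the high-level description could be lowered to
argument locations like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of lowering a scalar-only function descriptor to
// argument locations. Aggregates, varargs and return values are ignored.
class CallingSequenceSketch {

    // Simplified classification: INTEGER covers ints, longs and pointers,
    // SSE covers float and double.
    enum ArgClass { INTEGER, SSE }

    // System V x86-64 argument registers, in assignment order.
    static final String[] INT_REGS = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static final String[] SSE_REGS = {
        "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7"
    };

    // Assign each argument to the next free register of its class, or to
    // an 8-byte stack slot once that class of registers is exhausted.
    static List<String> assign(List<ArgClass> args) {
        List<String> locations = new ArrayList<>();
        int nextInt = 0;
        int nextSse = 0;
        int stackOffset = 0;
        for (ArgClass arg : args) {
            if (arg == ArgClass.INTEGER && nextInt < INT_REGS.length) {
                locations.add(INT_REGS[nextInt++]);
            } else if (arg == ArgClass.SSE && nextSse < SSE_REGS.length) {
                locations.add(SSE_REGS[nextSse++]);
            } else {
                locations.add("stack+" + stackOffset);
                stackOffset += 8;
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        // Roughly: void f(int, double, long)
        System.out.println(assign(List.of(
            ArgClass.INTEGER, ArgClass.SSE, ArgClass.INTEGER))); // prints [rdi, xmm0, rsi]
    }
}
```

The output list plays the role of the low-level representation here; a
different C ABI would only swap out the register tables and the
classification rules, not the consumer of the list.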

So, it seems the problem we're trying to solve right now is figuring out
what this high-level, public-facing, possibly ABI-specific
representation should look like.

> We need to
> invent a description that is general enough to capture all the details
> the ABI cares about. But if we have layouts, why not double down and
> reuse those?

Depending on layout annotations seems like a good choice to me, at least
to get going. We can always evaluate and replace it with a more
ABI-specific abstraction later.

Whether we eventually end up exposing our middle layer or not seems
like another matter.

---

On a side note: I've been thinking a lot lately about how we could go
about porting the L2N backend, as well as looking into the current 
implementation. Especially if (and I think this is the assumption at 
this point) we want to support more than just the C ABIs, this seems by 
far the biggest challenge for the foreign-abi branch. I'll try to create 
a thread soon to discuss some of the challenges I've discovered so far.

Jorn

On 2019-07-22 13:45, Maurizio Cimadamore wrote:
> On 22/07/2019 10:30, Maurizio Cimadamore wrote:
>> Or we could enhance the carrier system to cover aggregates - e.g. 
>> StructWithOneInt.class - but if you look at the ABI (esp. SysV) there 
>> are so many combinations for passing small structs in registers that
>> having one single carrier for each combination is just overkill. And,
>> needless to say, such a carrier would end up replicating some of the
>> structural info already available in the layout.
> 
> Btw, here I considered four alternatives:
> 
> 1) keep using float, int carriers for scalars; aggregates are
> represented using new carriers e.g. STRUCT_INT_FLOAT - where we have
> one carrier for each classification combo
> 
> 1b) As (1), but let the client manually and recursively classify
> aggregates and pass them as a sequence of 64-bit words (e.g. longs, or
> doubles)
> 
> 2) use carriers to model ABI register families - so, floats will be
> passed using some SSE carrier; the number of combinations of carriers
> for aggregates is greatly reduced, but all primitives have to be
> 'boxed'
> 
> 2b) As (2), but let the client manually and recursively classify
> aggregates and pass them as sequences of register carrier values
> 
> No doubt something like (1b), (2) or (2b) could work. But I highly
> doubt that developers (even low level ones) will want to interact with
> such flattened low-level method handles. Even existing FFI libraries
> (e.g. libFFI) don't go this far.
> 
> So, if we went down this path, we would _still_ need something that
> bridges the gap between the developer's understanding of a signature
> and the ABI's understanding of the same signature (e.g. a MH adapter).
> How do we adapt? Here is where I feel we're back to square one. We need
> to invent a description that is general enough to capture all the
> details the ABI cares about. But if we have layouts, why not double
> down and reuse those?
> 
> In other words, I totally buy that using layouts to convey
> information about ABI classification might or might not be a
> primitive, and that there is a more low-level way to think about these
> things. At the same time I'm playing a lumping move here, and saying
> that thinking about the uber FFI low-level in isolation doesn't seem
> to carry much value. Where I see it adding value is when a developer
> would like to model a custom ABI - but this is a very advanced use
> case, and I could not find a single framework able to cope with such
> generality (see LLVM discussion linked previously). So, maybe one day
> we'll want to go down that path - fine; using layouts as a source of
> truth for ABI classification is (i) a forced move in order to provide
> a slightly more friendly FFI integration and (ii) doesn't preclude us
> from adding an intermediate layer down the road.
> 
> Maurizio