[foreign-abi] On invokers

Tue Sep 24 09:42:26 UTC 2019

Yes, you're right that this is not the ultimate solution. It's 
'programmable' in the sense that what was previously hard-coded inside 
the VM code is now passed in dynamically from Java through the API, and 
defined by a particular ABIDescriptor.

These changes are meant to clean up some of the existing technical 
debt we have in the UNI API. The 'programmability' is merely improved to 
be able to describe the quirks in the existing ABIs using more general, 
and hopefully broadly applicable concepts.

I noticed that there is a mismatch between a C-level function descriptor 
and an ABI-level function descriptor, for instance in the case of 
meta-arguments. The current problem is that we give an un-altered 
descriptor to UNI to work with, and then expect the invoker to work out 
the mapping to an ABI-level call, which is inherently ABI-specific. This 
patch adds another layer in between; it's the job of the ABI 
implementation to adapt from the C-level descriptor to the ABI-level 
descriptor. We need to extend the invoker API to provide the right 
primitives for the ABI impl to do that. The API supports creating a 
MethodHandle for an ABI-level call, which is then adapted to C-level by 
the particular ABI impl.
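
As a rough illustration of the adaptation described above, the sketch 
below uses only java.lang.invoke. It is not the code from the patch: 
the class InMemoryReturnSketch, the methods abiLevelCall and 
allocateReturnBuffer, and the use of long[] as the return buffer are 
all hypothetical stand-ins. The ABI-level handle takes the return 
buffer as an extra first argument (a meta-argument); the adapted 
C-level handle allocates that buffer itself and returns it.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class InMemoryReturnSketch {
    // Stand-in for an ABI-level call: the callee writes its result into a
    // return buffer that the caller passes as an extra first argument.
    static void abiLevelCall(long[] returnBuffer, long a, long b) {
        returnBuffer[0] = a + b;
        returnBuffer[1] = a * b;
    }

    // Hypothetical allocator for the return buffer (two 64-bit slots).
    static long[] allocateReturnBuffer() {
        return new long[2];
    }

    // Adapts (long[], long, long)void to the C-level shape (long, long)long[].
    static MethodHandle adaptToCLevel() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle abiLevel = lookup.findStatic(InMemoryReturnSketch.class,
                "abiLevelCall",
                MethodType.methodType(void.class, long[].class, long.class, long.class));
        MethodHandle allocator = lookup.findStatic(InMemoryReturnSketch.class,
                "allocateReturnBuffer", MethodType.methodType(long[].class));

        // (long[], long, long)long[]: ignores the two longs, returns the buffer.
        MethodHandle returnBuffer = MethodHandles.dropArguments(
                MethodHandles.identity(long[].class), 1, long.class, long.class);
        // Run the ABI-level call first (void combiner), then return the buffer.
        MethodHandle callAndReturn = MethodHandles.foldArguments(returnBuffer, abiLevel);
        // Fill the buffer argument from the allocator, hiding it from the caller.
        return MethodHandles.collectArguments(callAndReturn, 0, allocator);
    }

    // Convenience wrapper that invokes the adapted, C-level handle.
    static long[] call(long a, long b) {
        try {
            return (long[]) adaptToCLevel().invokeExact(a, b);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        long[] result = call(3L, 4L);
        System.out.println(result[0] + " " + result[1]);
    }
}
```

The caller only ever sees the (long, long)long[] shape; the extra 
buffer argument exists only in the ABI-level signature.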

If more programmability is needed in the future this can be added. As 
you say, likely in the areas of Bindings and ABIDescriptor + stub 
generation :). This is just intended to be the next iteration in the 
process.

If you want to do a SysV port that would be much appreciated :). In 
principle everything in the `programmable` package, except for 
CallArranger, is part of the API, though there should be no need to 
interact directly with BindingInterpreter and BufferLayout. So, if you 
want to do a port, the only thing you should have to do is add a SysV 
version of CallArranger (which I based on CallingSequenceBuilderImpl, 
so that's probably a good place to start) and then call that from 
SysVx64ABI.

The things to do are:

1.) Create a SysV ABIDescriptor instance (make sure to add RAX as an 
input storage). The x86_64Architecture class has the `abiFor` method to 
do this.

2.) Add a method that takes a C-level FunctionDescriptor + MethodType 
and creates an ABI-level MethodHandle for this using the invoker API, 
and then adapts this back to a C-level MethodHandle. You have to add 
meta-arguments manually (see CallArranger::arrangeDowncall for an 
example with in-memory return).

3.) When generating bindings, map storage indices to VMStorage instances 
using the ABIDescriptor instance you created + StorageClasses defined in 
x86_64Architecture (to get the right VMStorage[] index).

4.) Add any additional Binding operations that are needed when 
generating the bindings, to replace things handled by the 
box-/unbox-Value methods (this was needed for Windows, but maybe not 
for SysV).
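
To make steps 1 and 3 a bit more concrete, here is a minimal sketch of 
what a SysV descriptor and the index-to-storage lookup could look like. 
None of these types are the real ones: Storage and Descriptor are 
hypothetical stand-ins for VMStorage and ABIDescriptor, the INTEGER and 
VECTOR constants stand in for the storage classes in 
x86_64Architecture, and placing RAX last in the integer input array is 
an assumption made only for this example. The register lists, the 
16-byte stack alignment and the absence of shadow space follow the 
SysV x86-64 calling convention.

```java
class SysVDescriptorSketch {
    // Hypothetical storage classes, used as the first index into the tables.
    static final int INTEGER = 0;
    static final int VECTOR = 1;

    // Hypothetical stand-in for a VMStorage instance.
    static final class Storage {
        final int storageClass;
        final String name;

        Storage(int storageClass, String name) {
            this.storageClass = storageClass;
            this.name = name;
        }
    }

    // Hypothetical stand-in for ABIDescriptor.
    static final class Descriptor {
        final Storage[][] inputStorage;   // indexed by [storage class][index]
        final Storage[][] outputStorage;  // indexed by [storage class][index]
        final int stackAlignment;         // in bytes
        final int shadowSpace;            // in bytes

        Descriptor(Storage[][] inputStorage, Storage[][] outputStorage,
                   int stackAlignment, int shadowSpace) {
            this.inputStorage = inputStorage;
            this.outputStorage = outputStorage;
            this.stackAlignment = stackAlignment;
            this.shadowSpace = shadowSpace;
        }

        // Maps a (storage class, index) pair to a concrete storage.
        Storage input(int storageClass, int index) {
            return inputStorage[storageClass][index];
        }
    }

    static Storage[] storages(int storageClass, String... names) {
        Storage[] result = new Storage[names.length];
        for (int i = 0; i < names.length; i++) {
            result[i] = new Storage(storageClass, names[i]);
        }
        return result;
    }

    static Descriptor sysV() {
        return new Descriptor(
            new Storage[][] {
                // Six integer argument registers, plus rax as an extra input
                // storage (vector argument count for variadic calls).
                storages(INTEGER, "rdi", "rsi", "rdx", "rcx", "r8", "r9", "rax"),
                storages(VECTOR, "xmm0", "xmm1", "xmm2", "xmm3",
                                 "xmm4", "xmm5", "xmm6", "xmm7")
            },
            new Storage[][] {
                storages(INTEGER, "rax", "rdx"),
                storages(VECTOR, "xmm0", "xmm1")
            },
            16,  // stack alignment
            0);  // SysV has no shadow space
    }

    public static void main(String[] args) {
        Descriptor abi = sysV();
        System.out.println(abi.input(INTEGER, 0).name);
        System.out.println(abi.input(VECTOR, 7).name);
    }
}
```

With a table like this, binding generation only deals in (storage 
class, index) pairs and leaves the choice of concrete registers to the 
descriptor.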

Cheers,
Jorn

On 23/09/2019 22:56, Maurizio Cimadamore wrote:
> Hi Jorn,
> this is a very solid piece of work, and I like how it makes it a lot 
> saner to reason about what's going on in the universal invoker - the 
> entire logic is now more scrutable, so well done!
>
> I think that, in order to understand better how things are glued, I'd 
> like to try to port SysV and see how that goes; from that perspective, 
> I think it's not immediately clear which parts of the Java code are 
> meant to be truly reusable across ABIs/platforms and which ones are fixed.
>
> For instance: the ABIDescriptor class looks very general - and it is 
> probably enough to cover the ABIs we have now. But, in a way, I see 
> with it the same problems I see in the current universal invoker - 
> that is, the ABIDescriptor class has to be a 'union' of all the 
> things used in all possible ABIs. That is, a descriptor has to have 
> input storage, output storage, (arguably very general) then more 
> exotic info such as shadow space etc. The fact that - e.g. when using 
> SysV we'd be forced to use an ABIDescriptor that speaks about 'shadow 
> space' (ok we can set it to zero, no big deal) feels wrong in a way, 
> for all the same reason as to why it was wrong in the old universal 
> invoker to speak about x87 registers (which are pretty SysV specific).
>
> Let's imagine adding a very weird ABI which requires a more complex 
> stack alignment scheme (e.g. stack alignment not constant across all 
> args). How would we encode such a hypothetical ABI using the classes 
> provided? Am I right that it would not be possible to do that using 
> the out of the box ABIDescriptor, but that we would have to extend the 
> ABIDescriptor to cover the requirements of the exotic ABI (e.g. have 
> an extra array for the alignments for each argument), and then leave 
> those bits unused in most cases?
>
> The same goes, I think, for the set of operations supported by bindings 
> and the binding interpreter, which are, I think, supposed to be shared 
> across ABIs/platform. For now we have moves, buffer copy - which are, 
> again, probably ok to cover the waterfront (especially since moves 
> involve arbitrary VM storage, which makes them quite flexible and 
> general); at the same time if a new operation is required by some 
> exotic ABI (e.g. set overflow flag), maybe we need to add a new 
> binding - which would mean updating the single shared binding 
> interpreter (and then maybe start thinking about which interaction the 
> new binding operation could have with the existing ones, even though, 
> maybe some combinations are not even possible, by ABI design).
>
> In other words, while I can see this being much more general and 
> maintainable than what we have now, I'm not sure I would still call it 
> "pluggable" or "programmable", if you get what I'm saying. That is, at 
> the end of the day the capabilities of ProgrammableInvoker are fixed by 
> two factors: (i) the ABI descriptor (which seems fixed) and (ii) the 
> set of supported binding operations (again fixed). So, even assuming 
> we had an ultra-low-level API to allow an advanced developer to define 
> his/her own ABI, I could still see ways in which certain ABIs could 
> not be modeled w/o going deeper into the API internals (e.g. tweak ABI 
> descriptor or bindings) or the VM.
>
> And that's ok - as long as we're honest about the goal here: we're not 
> after a single API to rule'em'all (which I think, as attractive as it 
> is, it might be a siren song) - we're after a way to put some method 
> into the ABI madness, so that less work will be required when new ABIs 
> will need to be defined.
>
> Or, did I take a wrong turn somewhere when going through the code?
>
> Maurizio
>
>
> On 23/09/2019 13:32, Jorn Vernee wrote:
>> Here is a webrev version of the changes as well: 
>> http://cr.openjdk.java.net/~jvernee/prog-back/webrev.00/
>>
>> Jorn
>>
>> On 23/09/2019 13:43, Jorn Vernee wrote:
>>> Hi,
>>>
>>> I've been looking into the current set of invokers we have on the 
>>> foreign-abi branch for the past few weeks. There is still work to be 
>>> done in this area, both in terms of performance, and in terms of 
>>> programmability. In this email I will focus on the latter.
>>>
>>> The UniversalNativeInvoker (UNI) API is currently the most 
>>> programmable invoker that we have, so if we want to increase the 
>>> programmability of our backend to cover more and more ABIs, this 
>>> seems like a good place to start. UNI goes a ways in being 
>>> programmable with the CallingSequence, ShuffleRecipe and 
>>> ArgumentBinding APIs, being able to select in which registers to 
>>> pass values, but there are still some aspects that could be polished:
>>>
>>> 1.) If you look into the VM code that processes the shuffle recipe, 
>>> you'll notice that the eventual argument buffer that's being fed to 
>>> the stub has a fixed set of registers it can work with on a given 
>>> platform [1], namely the ones that are used by the C ABI. This works 
>>> when we have only one ABI (C), but for different ABIs we'd probably 
>>> want a different set of registers. We can change the stub generation 
>>> code to take an 'ABIDescriptor' from which we derive the stub and 
>>> argument buffer layout instead. This will also provide a place to 
>>> put other ABI details that need to be customized, like stack 
>>> alignment, and argument shadow space (Windows), as well as a set of 
>>> volatile registers, which will be a super set of the argument 
>>> registers. We would end up generating 1 generic downcall stub for 
>>> each ABI. Also, note that we would need to create architecture 
>>> definitions on the Java side to be able to specify the 
>>> ABIDescriptors there (since ABIs are defined in terms of architecture).
>>>
>>> 2.) There is a need to pass meta arguments to a function sometimes. 
>>> For instance, we need to pass in a pointer to a return buffer for 
>>> in-memory-returns, and e.g. on SysV we need to pass in the number of 
>>> float arguments in RAX (or rather AL) for variadic functions. The 
>>> former is handled automatically by CallingSequenceBuilder, and the 
>>> latter is hard-coded in the VM code. Since these are both ABI 
>>> details, I believe they should be handled by the ABI 
>>> implementations. Ideally we'd have an invoker API that lets us say: 
>>> "add a Java argument with this carrier type, and this MemoryLayout, 
>>> and then shuffle it into this register.", and then the ABI 
>>> implementation can handle the further adaptation from the ABI-level 
>>> signature (e.g. an additional MemoryAddress passed in as first 
>>> argument), to the C-level signature (allocate a buffer as first 
>>> argument and also return it). This is mostly a refactoring move in 
>>> UNI::invoke and CallingSequenceBuilder that removes the handling for 
>>> in memory returns, and replaces it with a more general way of 
>>> passing those kinds of arguments.
>>>
>>> 3.) The unboxing/boxing is currently handled by calling into the 
>>> various ABI implementations. We can make this code shared by 
>>> extending the current ArgumentBinding 'recipe' to include other 
>>> operations, besides moving from a pointer to a register, that cover 
>>> the things that are currently handled by the ABI boxing/unboxing 
>>> implementations. The various CallingSequenceBuilder implementations 
>>> can then specify these additional binding operations when generating 
>>> bindings. This means that we only need one shared piece of code that 
>>> interprets this 'binding recipe'. The other advantage of doing this 
>>> is that we would eventually be able to use these binding recipes + 
>>> ABIDescriptor to generate a specialized stub for a particular call 
>>> site.
>>>
>>> 4.) We are currently shuffling the arguments for a down call into a 
>>> long[], and then in the VM we shuffle the arguments from the long[] 
>>> into an argument buffer (ShuffleDowncallContext). We can merge these 
>>> steps together, by directly shuffling the arguments into an argument 
>>> buffer on the Java side (since we have an off-heap API). This 
>>> decreases the overall complexity of the invoker implementation 
>>> significantly, since we can drop all the code relating to shuffle 
>>> recipes.
>>>
>>> I've been experimenting with these ideas, and have a prototype for 
>>> downcalls on Windows [2]. For this I copied the relevant UNI classes 
>>> to a separate `programmable` package and made the relevant changes 
>>> there, since some of the code was shared with 
>>> UniversalUpcallHandler. I've also preemptively removed the old UNI 
>>> code (for x86) to show roughly how much code would be removed by 
>>> switching to the new invoker API. I want to continue the experiment 
>>> for upcalls as well, after which more old code could be removed; 
>>> namely Argument, ArgumentBinding, CallingSequence (old), 
>>> CallingSequenceBuilder (old), Storage, StorageClass, SharedUtils 
>>> (mostly) and UniversalAdapter.
>>>
>>> How do these ideas sound? I'm mostly interested if this is flexible 
>>> enough to support AArch64 and SysV. After the upcall support, I can 
>>> look into porting the other 2 ABIs as well.
>>>
>>> Thanks,
>>> Jorn
>>>
>>> [1] : 
>>> https://github.com/openjdk/panama/blob/foreign-abi/src/hotspot/cpu/x86/universalNativeInvoker_x86.cpp#L73
>>> [2] : 
>>> https://github.com/openjdk/panama/compare/foreign-abi...JornVernee:prog-back-no-old
>>>