[foreign-abi] On invokers

Mon Sep 23 20:56:23 UTC 2019

Hi Jorn,
this is a very solid piece of work, and I like how it makes it a lot 
saner to reason about what's going on in the universal invoker - the 
entire logic is now more scrutable, so well done!

I think that, in order to understand better how things are glued, I'd 
like to try to port SysV and see how that goes; from that perspective, I 
think it's not immediately clear which parts of the Java code are meant 
to be truly reusable across ABIs/platforms and which ones are fixed.

For instance: the ABIDescriptor class looks very general - and it is 
probably enough to cover the ABIs we have now. But, in a way, I see with 
it the same problems I see in the current universal invoker - that is, 
the ABIDescriptor class has to be a 'union' of all the things used in 
all possible ABIs. That is, a descriptor has to have input storage, 
output storage, (arguably very general) then more exotic info such as 
shadow space etc. The fact that, e.g., when using SysV we'd be forced to 
use an ABIDescriptor that speaks about 'shadow space' (ok, we can set it 
to zero, no big deal) feels wrong in a way, for the same reason it was 
wrong in the old universal invoker to speak about x87 registers (which 
are pretty SysV specific).

Let's imagine adding a very weird ABI which requires a more complex 
stack alignment scheme (e.g. stack alignment not constant across all 
args). How would we encode such a hypothetical ABI using the classes 
provided? Am I right that it would not be possible to do that using the 
out-of-the-box ABIDescriptor, but that we would have to extend the 
ABIDescriptor to cover the requirements of the exotic ABI (e.g. have an 
extra array for the alignment of each argument), and then leave those 
bits unused in most cases?
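
As a rough illustration of the 'union' concern, here is a minimal Java
sketch. Everything in it is hypothetical: the class name, the field
names, the exotic() factory and its register names are invented for
illustration and are not taken from the prototype. Only integer
registers are shown.

```java
// Hypothetical sketch: names are illustrative, not the prototype's API.
final class AbiDescriptorSketch {
    final String[] inputStorage;   // registers usable for arguments
    final String[] outputStorage;  // registers usable for return values
    final int stackAlignment;      // stack alignment in bytes
    final int shadowSpace;         // caller-reserved bytes (Windows x64 only)
    final int[] perArgAlignment;   // needed only by the imagined exotic ABI

    AbiDescriptorSketch(String[] inputStorage, String[] outputStorage,
                        int stackAlignment, int shadowSpace,
                        int[] perArgAlignment) {
        this.inputStorage = inputStorage;
        this.outputStorage = outputStorage;
        this.stackAlignment = stackAlignment;
        this.shadowSpace = shadowSpace;
        this.perArgAlignment = perArgAlignment;
    }

    static AbiDescriptorSketch windowsX64() {
        return new AbiDescriptorSketch(
            new String[] {"rcx", "rdx", "r8", "r9"},
            new String[] {"rax"}, 16, 32, null);
    }

    // SysV still has to say something about shadow space: zero.
    static AbiDescriptorSketch sysV() {
        return new AbiDescriptorSketch(
            new String[] {"rdi", "rsi", "rdx", "rcx", "r8", "r9"},
            new String[] {"rax", "rdx"}, 16, 0, null);
    }

    // Made-up ABI with one alignment per argument; register names invented.
    static AbiDescriptorSketch exotic(int[] perArgAlignment) {
        return new AbiDescriptorSketch(
            new String[] {"r0", "r1"}, new String[] {"r0"},
            16, 0, perArgAlignment);
    }

    // Counts the ABI-specific fields that carry no information here.
    int unusedFieldCount() {
        int unused = 0;
        if (shadowSpace == 0) unused++;
        if (perArgAlignment == null) unused++;
        return unused;
    }
}
```

In this sketch, each exotic ABI adds a field that every other descriptor
must then fill with a "not applicable" value, so the number of unused
fields per descriptor grows with the number of ABIs supported.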

The same goes, I think, for the set of operations supported by bindings 
and the binding interpreter, which are, I think, supposed to be shared 
across ABIs/platforms. For now we have moves and buffer copies, which 
are, again, probably ok to cover the waterfront (especially since moves 
involve arbitrary VM storage, which makes them quite flexible and 
general); at the same time, if a new operation is required by some exotic 
ABI (e.g. set overflow flag), maybe we need to add a new binding - which 
would mean updating the single shared binding interpreter (and then 
maybe start thinking about which interactions the new binding operation 
could have with the existing ones, even though maybe some combinations 
are not even possible, by ABI design).
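
The shape of that shared interpreter can be sketched as follows. This is
a hypothetical Java sketch, not the prototype's code: the Kind enum, the
Binding class and the use of a Map as a stand-in for the argument buffer
are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not the prototype's API.
final class BindingInterpreterSketch {
    // The fixed set of operations every ABI has to express itself in.
    enum Kind { MOVE, COPY_BUFFER }

    static final class Binding {
        final Kind kind;
        final String storage; // target storage for MOVE; unused otherwise

        Binding(Kind kind, String storage) {
            this.kind = kind;
            this.storage = storage;
        }
    }

    // Applies the bindings of one argument; the returned map stands in
    // for the argument buffer (storage name -> value).
    static Map<String, Object> interpret(Object value, List<Binding> bindings) {
        Map<String, Object> buffer = new HashMap<>();
        Object current = value;
        for (Binding b : bindings) {
            switch (b.kind) {
                case MOVE:
                    buffer.put(b.storage, current);
                    break;
                case COPY_BUFFER:
                    // Pass a private copy instead of the caller's array.
                    current = ((byte[]) current).clone();
                    break;
                default:
                    // A new operation (e.g. "set overflow flag") would need
                    // a new enum constant and a new case in this switch.
                    throw new IllegalStateException("Unknown binding: " + b.kind);
            }
        }
        return buffer;
    }
}
```

In this sketch an ABI implementation can only combine the existing
constants; anything beyond them requires editing the enum and the
switch, which is the shared code.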

In other words, while I can see this being much more general and 
maintainable than what we have now, I'm not sure I would still call it 
"pluggable" or "programmable", if you get what I'm saying. That is, at 
the end of the day the capabilities of ProgrammableInvoker are fixed by 
two factors: (i) the ABI descriptor (which seems fixed) and (ii) the set 
of supported binding operations (again fixed). So, even assuming we had 
an ultra-low-level API to allow an advanced developer to define his/her 
own ABI, I could still see ways in which certain ABIs could not be 
modeled w/o going deeper into the API internals (e.g. tweak ABI 
descriptor or bindings) or the VM.

And that's ok - as long as we're honest about the goal here: we're not 
after a single API to rule them all (which, attractive as it is, might 
be a siren song) - we're after a way to put some method into the ABI 
madness, so that less work will be required when new ABIs need to be 
defined.

Or, did I take a wrong turn somewhere when going through the code?

Maurizio

On 23/09/2019 13:32, Jorn Vernee wrote:
> Here is a webrev version of the changes as well: 
> http://cr.openjdk.java.net/~jvernee/prog-back/webrev.00/
>
> Jorn
>
> On 23/09/2019 13:43, Jorn Vernee wrote:
>> Hi,
>>
>> I've been looking into the current set of invokers we have on the 
>> foreign-abi branch for the past few weeks. There is still work to be 
>> done in this area, both in terms of performance, and in terms of 
>> programmability. In this email I will focus on the latter.
>>
>> The UniversalNativeInvoker (UNI) API is currently the most 
>> programmable invoker that we have, so if we want to increase the 
>> programmability of our backend to cover more and more ABIs, this 
>> seems like a good place to start. UNI goes a ways in being 
>> programmable with the CallingSequence, ShuffleRecipe and 
>> ArgumentBinding APIs, being able to select in which registers to pass 
>> values, but there are still some aspects that could be polished:
>>
>> 1.) If you look into the VM code that processes the shuffle recipe, 
>> you'll notice that the eventual argument buffer that's being fed to 
>> the stub has a fixed set of registers it can work with on a given 
>> platform [1], namely the ones that are used by the C ABI. This works 
>> when we have only one ABI (C), but for different ABIs we'd probably 
>> want a different set of registers. We can change the stub generation 
>> code to take an 'ABIDescriptor' from which we derive the stub and 
>> argument buffer layout instead. This will also provide a place to put 
>> other ABI details that need to be customized, like stack alignment, 
>> and argument shadow space (Windows), as well as a set of volatile 
>> registers, which will be a superset of the argument registers. We 
>> would end up generating 1 generic downcall stub for each ABI. Also, 
>> note that we would need to create architecture definitions on the 
>> Java side to be able to specify the ABIDescriptors there (since ABIs 
>> are defined in terms of architecture).
>>
>> 2.) There is a need to pass meta arguments to a function sometimes. 
>> For instance, we need to pass in a pointer to a return buffer for 
>> in-memory-returns, and e.g. on SysV we need to pass in the number of 
>> float arguments in RAX (or rather AL) for variadic functions. The 
>> former is handled automatically by CallingSequenceBuilder, and the 
>> latter is hard-coded in the VM code. Since these are both ABI 
>> details, I believe they should be handled by the ABI implementations. 
>> Ideally we'd have an invoker API that lets us say: "add a Java 
>> argument with this carrier type, and this MemoryLayout, and then 
>> shuffle it into this register.", and then the ABI implementation can 
>> handle the further adaptation from the ABI-level signature (e.g. an 
>> additional MemoryAddress passed in as first argument), to the C-level 
>> signature (allocate a buffer as first argument and also return it). 
>> This is mostly a refactoring move in UNI::invoke and 
>> CallingSequenceBuilder that removes the handling for in memory 
>> returns, and replaces it with a more general way of passing those 
>> kinds of arguments.
>>
>> 3.) The unboxing/boxing is currently handled by calling into the 
>> various ABI implementations. We can make this code shared by 
>> extending the current ArgumentBinding 'recipe' to include other 
>> operations, besides moving from a pointer to a register, that cover 
>> the things that are currently handled by the ABI boxing/unboxing 
>> implementations. The various CallingSequenceBuilder implementations 
>> can then specify these additional binding operations when generating 
>> bindings. This means that we only need one shared piece of code that 
>> interprets this 'binding recipe'. The other advantage of doing this 
>> is that we would eventually be able to use these binding recipes + 
>> ABIDescriptor to generate a specialized stub for a particular call site.
>>
>> 4.) We are currently shuffling the arguments for a down call into a 
>> long[], and then in the VM we shuffle the arguments from the long[] 
>> into an argument buffer (ShuffleDowncallContext). We can merge these 
>> steps together, by directly shuffling the arguments into an argument 
>> buffer on the Java side (since we have an off-heap API). This 
>> decreases the overall complexity of the invoker implementation 
>> significantly, since we can drop all the code relating to shuffle 
>> recipes.
>>
>> I've been experimenting with these ideas, and have a prototype for 
>> downcalls on Windows [2]. For this I copied the relevant UNI classes 
>> to a separate `programmable` package and made the relevant changes 
>> there, since some of the code was shared with UniversalUpcallHandler. 
>> I've also preemptively removed the old UNI code (for x86) to show 
>> roughly how much code would be removed by switching to the new 
>> invoker API. I want to continue the experiment for upcalls as well, 
>> after which more old code could be removed; namely Argument, 
>> ArgumentBinding, CallingSequence (old), CallingSequenceBuilder (old), 
>> Storage, StorageClass, SharedUtils (mostly) and UniversalAdapter.
>>
>> How do these ideas sound? I'm mostly interested if this is flexible 
>> enough to support AArch64 and SysV. After the upcall support, I can 
>> look into porting the other 2 ABIs as well.
>>
>> Thanks,
>> Jorn
>>
>> [1] : 
>> https://github.com/openjdk/panama/blob/foreign-abi/src/hotspot/cpu/x86/universalNativeInvoker_x86.cpp#L73
>> [2] : 
>> https://github.com/openjdk/panama/compare/foreign-abi...JornVernee:prog-back-no-old
>>