[foreign-abi] On invokers

Thu Oct 3 12:30:17 UTC 2019

Hi,

I've finished the work for upcall support. Here is the patch: 
http://cr.openjdk.java.net/~jvernee/prog-back/webrev.02/

This includes both the downcall support I did earlier and now also 
upcall support for Windows, and should give a good picture of the final 
state of things we want to get to, namely a single backend. The patch 
therefore also removes the code for the other backends.

I've also improved the testing for upcalls, by now saving each argument 
that is passed to the upcall into an array, and then checking that array 
against the input arguments at the end of the test (that is what the 
changes to TestUpcall.java are about). Sadly, the old 
CallingSequenceBuilderTests were not compatible with the new API, so 
I've removed them for now, but maybe we can bring these back again later on.

The API is pretty much finalized. The only thing I foresee changing is 
the Binding IR, with things like different operators where they are 
needed to support new ABIs. So, now is a good time to start looking at 
porting to other ABIs.

For x86 this is a little simpler since the support for that particular 
architecture is already there. The only thing that would have to be done 
is taking the old CallingSequenceBuilderImpl and having it generate 
the new type of Bindings, as well as handle in-memory returns. A good 
example of the latter would be 
CallArranger::arrangeDowncall/arrangeUpcall in the new patch.

For adding support for another architecture there are a few extra steps:
- Add an architecture descriptor, like X86_64Architecture.java does. 
This is basically a set of VMStorage constants that map the different 
registers defined in register_x86.hpp, as well as a set of integer 
constants defining the different storage types (see 
X86_64Architecture::StorageClasses).
- In foreign_globals_XXX.hpp/cpp add an architecture specific version of 
the Java ABIDescriptor/BufferLayout, and a way to translate from the 
Java version to the native version using JNI (see 
foreign_globals_x86.cpp/hpp).
- Change the stub generation code for upcalls and downcalls to use the 
registers and offsets defined by this ABIDescriptor/BufferLayout, 
instead of the ones defined globally for the C ABI (see 
universalNativeInvoker_x86.cpp/universalUpcallHandler_x86.cpp).
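
To make the first step a bit more concrete, here is a rough sketch of
what an architecture descriptor could look like. This is not the code
from the patch: StorageSketch and ToyX64Architecture are made-up names,
and the storage class numbering is an assumption. Only the integer
register numbers follow the usual x86-64 hardware encoding.

```java
// Hypothetical sketch, not the VMStorage/X86_64Architecture code from the patch.
final class StorageSketch {
    final int type;     // storage class, one of the integer constants below
    final int index;    // register number (or stack slot) within that class
    final String name;  // only used for debugging output

    StorageSketch(int type, int index, String name) {
        this.type = type;
        this.index = index;
        this.name = name;
    }

    @Override
    public String toString() {
        return name + "(type=" + type + ", index=" + index + ")";
    }
}

final class ToyX64Architecture {
    // Storage classes as plain integer constants (the numbering is an assumption).
    static final int INTEGER = 0;
    static final int VECTOR = 1;
    static final int STACK = 2;

    // Integer registers, numbered by their x86-64 hardware encoding.
    static final StorageSketch RAX = integer(0, "rax");
    static final StorageSketch RCX = integer(1, "rcx");
    static final StorageSketch RDX = integer(2, "rdx");
    static final StorageSketch R8 = integer(8, "r8");
    static final StorageSketch R9 = integer(9, "r9");

    // Vector registers.
    static final StorageSketch XMM0 = vector(0, "xmm0");
    static final StorageSketch XMM1 = vector(1, "xmm1");

    static StorageSketch integer(int index, String name) {
        return new StorageSketch(INTEGER, index, name);
    }

    static StorageSketch vector(int index, String name) {
        return new StorageSketch(VECTOR, index, name);
    }

    // Stack slots are created on demand instead of being predefined constants.
    static StorageSketch stack(int slot) {
        return new StorageSketch(STACK, slot, "stack");
    }

    public static void main(String[] args) {
        System.out.println(RCX);
        System.out.println(stack(2));
    }
}
```

An ABI implementation would then refer to these constants when it
decides where each argument goes.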

Jorn

On 23/09/2019 13:43, Jorn Vernee wrote:
> Hi,
>
> I've been looking into the current set of invokers we have on the 
> foreign-abi branch for the past few weeks. There is still work to be 
> done in this area, both in terms of performance, and in terms of 
> programmability. In this email I will focus on the latter.
>
> The UniversalNativeInvoker (UNI) API is currently the most programmable 
> invoker that we have, so if we want to increase the programmability of 
> our backend to cover more and more ABIs, this seems like a good place 
> to start. UNI goes a ways in being programmable with the 
> CallingSequence, ShuffleRecipe and ArgumentBinding APIs, being able to 
> select in which registers to pass values, but there are still some 
> aspects that could be polished:
>
> 1.) If you look into the VM code that processes the shuffle recipe, 
> you'll notice that the eventual argument buffer that's being fed to 
> the stub has a fixed set of registers it can work with on a given 
> platform [1], namely the ones that are used by the C ABI. This works 
> when we have only one ABI (C), but for different ABIs we'd probably 
> want a different set of registers. We can change the stub generation 
> code to take an 'ABIDescriptor' from which we derive the stub and 
> argument buffer layout instead. This will also provide a place to put 
> other ABI details that need to be customized, like stack alignment, 
> and argument shadow space (Windows), as well as a set of volatile 
> registers, which will be a superset of the argument registers. We 
> would end up generating 1 generic downcall stub for each ABI. Also, 
> note that we would need to create architecture definitions on the Java 
> side to be able to specify the ABIDescriptors there (since ABIs are 
> defined in terms of architecture).
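
As an illustration of point 1, this is roughly the kind of information
an ABIDescriptor would carry. The class and field names below are
hypothetical and registers are plain strings to keep the sketch
self-contained; the register sets, alignment and shadow space values
are the ones from the Windows x64 and SysV x86-64 calling conventions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and fields are illustrative, not the prototype's API.
final class AbiDescriptorSketch {
    final List<String> integerArgumentRegs;
    final List<String> vectorArgumentRegs;
    final List<String> volatileRegs; // caller-saved, superset of the argument registers
    final int stackAlignment;        // in bytes
    final int shadowSpace;           // in bytes, reserved by the caller (Windows)

    AbiDescriptorSketch(List<String> integerArgumentRegs, List<String> vectorArgumentRegs,
                        List<String> volatileRegs, int stackAlignment, int shadowSpace) {
        Set<String> vol = new HashSet<>(volatileRegs);
        if (!vol.containsAll(integerArgumentRegs) || !vol.containsAll(vectorArgumentRegs)) {
            throw new IllegalArgumentException(
                "volatile registers must include all argument registers");
        }
        this.integerArgumentRegs = integerArgumentRegs;
        this.vectorArgumentRegs = vectorArgumentRegs;
        this.volatileRegs = volatileRegs;
        this.stackAlignment = stackAlignment;
        this.shadowSpace = shadowSpace;
    }

    // Builds the list xmm0 .. xmm(count - 1).
    static List<String> xmm(int count) {
        List<String> regs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            regs.add("xmm" + i);
        }
        return regs;
    }

    static List<String> concat(List<String> a, List<String> b) {
        List<String> all = new ArrayList<>(a);
        all.addAll(b);
        return all;
    }

    static AbiDescriptorSketch windowsX64() {
        return new AbiDescriptorSketch(
            Arrays.asList("rcx", "rdx", "r8", "r9"),
            xmm(4),
            concat(Arrays.asList("rax", "rcx", "rdx", "r8", "r9", "r10", "r11"), xmm(6)),
            16, 32);
    }

    static AbiDescriptorSketch sysV() {
        return new AbiDescriptorSketch(
            Arrays.asList("rdi", "rsi", "rdx", "rcx", "r8", "r9"),
            xmm(8),
            concat(Arrays.asList("rax", "rcx", "rdx", "rsi", "rdi",
                                 "r8", "r9", "r10", "r11"), xmm(16)),
            16, 0);
    }
}
```

The stub generator would then read the register lists and sizes from
such a descriptor instead of from the globally defined C ABI values.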
>
> 2.) There is a need to pass meta arguments to a function sometimes. 
> For instance, we need to pass in a pointer to a return buffer for 
> in-memory-returns, and e.g. on SysV we need to pass in the number of 
> float arguments in RAX (or rather AL) for variadic functions. The 
> former is handled automatically by CallingSequenceBuilder, and the 
> latter is hard-coded in the VM code. Since these are both ABI details, 
> I believe they should be handled by the ABI implementations. Ideally 
> we'd have an invoker API that lets us say: "add a Java argument with 
> this carrier type, and this MemoryLayout, and then shuffle it into 
> this register.", and then the ABI implementation can handle the 
> further adaptation from the ABI-level signature (e.g. an additional 
> MemoryAddress passed in as first argument), to the C-level signature 
> (allocate a buffer as first argument and also return it). This is 
> mostly a refactoring move in UNI::invoke and CallingSequenceBuilder 
> that removes the handling for in-memory returns, and replaces it with 
> a more general way of passing those kinds of arguments.
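
The adaptation in point 2 can be pictured with plain Java interfaces.
This is only a toy model: AbiLevelCall and CLevelCall are hypothetical
names, a heap ByteBuffer stands in for the MemoryAddress/return buffer,
and the real adaptation would presumably work on method handles rather
than lambdas.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of adapting an ABI-level signature to a C-level one.
final class ReturnBufferSketch {
    // ABI-level shape: the return buffer is passed in as an extra first argument.
    interface AbiLevelCall {
        void invoke(ByteBuffer returnBuffer, long a, long b);
    }

    // C-level shape: the struct is simply returned to the caller.
    interface CLevelCall {
        ByteBuffer invoke(long a, long b);
    }

    // Allocates the return buffer, passes it as first argument, and also returns it.
    static CLevelCall adapt(AbiLevelCall target, int returnSize) {
        return (a, b) -> {
            ByteBuffer buffer = ByteBuffer.allocate(returnSize);
            target.invoke(buffer, a, b);
            return buffer;
        };
    }

    public static void main(String[] args) {
        // Stand-in for a native function returning struct { long sum; long product; }.
        AbiLevelCall nativeLike = (buf, a, b) -> {
            buf.putLong(0, a + b);
            buf.putLong(8, a * b);
        };
        ByteBuffer result = adapt(nativeLike, 16).invoke(3, 4);
        System.out.println(result.getLong(0) + " " + result.getLong(8));
    }
}
```

The point is that the invoker only sees "one more argument that goes
into a register", while the ABI implementation owns the buffer handling.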
>
> 3.) The unboxing/boxing is currently handled by calling into the 
> various ABI implementations. We can make this code shared by extending 
> the current ArgumentBinding 'recipe' to include other operations, 
> besides moving from a pointer to a register, that cover the things 
> that are currently handled by the ABI boxing/unboxing implementations. 
> The various CallingSequenceBuilder implementations can then specify 
> these additional binding operations when generating bindings. This 
> means that we only need one shared piece of code that interprets this 
> 'binding recipe'. The other advantage of doing this is that we would 
> eventually be able to use these binding recipes + ABIDescriptor to 
> generate a specialized stub for a particular call site.
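
A very small model of the shared 'binding recipe' interpreter from
point 3 might look like the sketch below. The operator names (MOVE,
DEREFERENCE) and the Binding class are assumptions made for this
example, not the operators of the prototype, and a map from register
name to value stands in for the real argument buffer.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of one shared interpreter for ABI-provided binding recipes.
final class BindingRecipeSketch {
    enum Operator { MOVE, DEREFERENCE }

    static final class Binding {
        final Operator op;
        final int offset;     // only used by DEREFERENCE
        final String storage; // target register name

        private Binding(Operator op, int offset, String storage) {
            this.op = op;
            this.offset = offset;
            this.storage = storage;
        }

        // Move a primitive argument into a storage as-is.
        static Binding move(String storage) {
            return new Binding(Operator.MOVE, 0, storage);
        }

        // Load 8 bytes at the given offset of a struct and move them into a storage.
        static Binding dereference(int offset, String storage) {
            return new Binding(Operator.DEREFERENCE, offset, storage);
        }
    }

    // Shared code: interprets whatever recipe the ABI implementation produced.
    static Map<String, Long> unbox(Object argument, List<Binding> recipe) {
        Map<String, Long> storages = new LinkedHashMap<>();
        for (Binding binding : recipe) {
            switch (binding.op) {
                case MOVE:
                    storages.put(binding.storage, ((Number) argument).longValue());
                    break;
                case DEREFERENCE:
                    storages.put(binding.storage,
                                 ((ByteBuffer) argument).getLong(binding.offset));
                    break;
                default:
                    throw new IllegalStateException("unknown operator: " + binding.op);
            }
        }
        return storages;
    }

    public static void main(String[] args) {
        // A 16-byte struct of two longs, which SysV passes in two integer registers.
        ByteBuffer struct = ByteBuffer.allocate(16);
        struct.putLong(0, 42L);
        struct.putLong(8, 7L);
        List<Binding> recipe = Arrays.asList(
            Binding.dereference(0, "rdi"), Binding.dereference(8, "rsi"));
        System.out.println(unbox(struct, recipe));
    }
}
```

Each CallingSequenceBuilder would only produce such a list of bindings;
the interpreter loop itself is shared between ABIs.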
>
> 4.) We are currently shuffling the arguments for a downcall into a 
> long[], and then in the VM we shuffle the arguments from the long[] 
> into an argument buffer (ShuffleDowncallContext). We can merge these 
> steps together, by directly shuffling the arguments into an argument 
> buffer on the Java side (since we have an off-heap API). This 
> decreases the overall complexity of the invoker implementation 
> significantly, since we can drop all the code relating to shuffle 
> recipes.
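
For point 4, shuffling directly into the argument buffer on the Java
side could be sketched as follows. A direct ByteBuffer is used here as
a stand-in for the off-heap API, and the layout (8-byte slots for the
integer argument registers followed by 16-byte slots for the vector
registers) is an assumption for this example, not the actual
BufferLayout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: write arguments straight into an off-heap argument buffer.
final class ArgumentBufferSketch {
    static final List<String> INTEGER_REGS = Arrays.asList("rcx", "rdx", "r8", "r9");
    static final List<String> VECTOR_REGS = Arrays.asList("xmm0", "xmm1", "xmm2", "xmm3");
    static final int INTEGER_SLOT = 8;  // bytes per integer register slot
    static final int VECTOR_SLOT = 16;  // bytes per vector register slot
    static final int VECTOR_BASE = INTEGER_REGS.size() * INTEGER_SLOT;
    static final int SIZE = VECTOR_BASE + VECTOR_REGS.size() * VECTOR_SLOT;

    // Offset of the slot that the stub would load into the given register.
    static int offsetOf(String register) {
        int i = INTEGER_REGS.indexOf(register);
        if (i >= 0) {
            return i * INTEGER_SLOT;
        }
        int v = VECTOR_REGS.indexOf(register);
        if (v >= 0) {
            return VECTOR_BASE + v * VECTOR_SLOT;
        }
        throw new IllegalArgumentException("unknown register: " + register);
    }

    static ByteBuffer newBuffer() {
        return ByteBuffer.allocateDirect(SIZE).order(ByteOrder.nativeOrder());
    }

    static void moveLong(ByteBuffer buffer, String register, long value) {
        buffer.putLong(offsetOf(register), value);
    }

    static void moveDouble(ByteBuffer buffer, String register, double value) {
        buffer.putDouble(offsetOf(register), value);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = newBuffer();
        moveLong(buffer, "rcx", 1L);      // first integer argument
        moveDouble(buffer, "xmm1", 2.5);  // second argument, a double
        System.out.println(buffer.getLong(0) + " " + buffer.getDouble(offsetOf("xmm1")));
    }
}
```

With this shape there is no intermediate long[] and no shuffle recipe
for the VM to interpret; the stub just loads registers from fixed
offsets.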
>
> I've been experimenting with these ideas, and have a prototype for 
> downcalls on Windows [2]. For this I copied the relevant UNI classes 
> to a separate `programmable` package and made the relevant changes 
> there, since some of the code was shared with UniversalUpcallHandler. 
> I've also preemptively removed the old UNI code (for x86) to show 
> roughly how much code would be removed by switching to the new invoker 
> API. I want to continue the experiment for upcalls as well, after 
> which more old code could be removed; namely Argument, 
> ArgumentBinding, CallingSequence (old), CallingSequenceBuilder (old), 
> Storage, StorageClass, SharedUtils (mostly) and UniversalAdapter.
>
> How do these ideas sound? I'm mostly interested in whether this is flexible 
> enough to support AArch64 and SysV. After the upcall support, I can 
> look into porting the other 2 ABIs as well.
>
> Thanks,
> Jorn
>
> [1] : 
> https://github.com/openjdk/panama/blob/foreign-abi/src/hotspot/cpu/x86/universalNativeInvoker_x86.cpp#L73
> [2] : 
> https://github.com/openjdk/panama/compare/foreign-abi...JornVernee:prog-back-no-old
>