[foreign-abi] On invokers
Jorn Vernee
jorn.vernee at oracle.com
Mon Sep 23 11:43:30 UTC 2019
Hi,
I've been looking into the current set of invokers we have on the
foreign-abi branch for the past few weeks. There is still work to be
done in this area, both in terms of performance, and in terms of
programmability. In this email I will focus on the latter.
The UniversalNativeInvoker (UNI) API is currently the most programmable
invoker that we have, so if we want to increase the programmability of
our backend to cover more and more ABIs, this seems like a good place to
start. UNI already goes some way towards being programmable with the
CallingSequence, ShuffleRecipe and ArgumentBinding APIs, which let us
select the registers in which values are passed, but there are still
some aspects that could be polished:
1.) If you look into the VM code that processes the shuffle recipe,
you'll notice that the eventual argument buffer that's being fed to the
stub has a fixed set of registers it can work with on a given platform
[1], namely the ones that are used by the C ABI. This works when we have
only one ABI (C), but for different ABIs we'd probably want a different
set of registers. We can change the stub generation code to take an
'ABIDescriptor' from which we derive the stub and argument buffer layout
instead. This will also provide a place to put other ABI details that
need to be customized, like stack alignment and argument shadow space
(Windows), as well as a set of volatile registers, which will be a
superset of the argument registers. We would end up generating one generic
downcall stub for each ABI. Also, note that we would need to create
architecture definitions on the Java side to be able to specify the
ABIDescriptors there (since ABIs are defined in terms of architecture).
2.) There is a need to pass meta arguments to a function sometimes. For
instance, we need to pass in a pointer to a return buffer for
in-memory returns, and e.g. on SysV we need to pass in the number of
float arguments in RAX (or rather AL) for variadic functions. The former
is handled automatically by CallingSequenceBuilder, and the latter is
hard-coded in the VM code. Since these are both ABI details, I believe
they should be handled by the ABI implementations. Ideally we'd have an
invoker API that lets us say: "add a Java argument with this carrier
type and this MemoryLayout, and then shuffle it into this register",
and then the ABI implementation can handle the further adaptation from
the ABI-level signature (e.g. an additional MemoryAddress passed in as
first argument), to the C-level signature (allocate a buffer as first
argument and also return it). This is mostly a refactoring move in
UNI::invoke and CallingSequenceBuilder that removes the handling for
in-memory returns, and replaces it with a more general way of passing
those kinds of arguments.
3.) The unboxing/boxing is currently handled by calling into the various
ABI implementations. We can make this code shared by extending the
current ArgumentBinding 'recipe' to include other operations, besides
moving from a pointer to a register, that cover the things that are
currently handled by the ABI boxing/unboxing implementations. The
various CallingSequenceBuilder implementations can then specify these
additional binding operations when generating bindings. This means that
we only need one shared piece of code that interprets this 'binding
recipe'. The other advantage of doing this is that we would eventually
be able to use these binding recipes + ABIDescriptor to generate a
specialized stub for a particular call site.
4.) We are currently shuffling the arguments for a downcall into a
long[], and then in the VM we shuffle the arguments from the long[] into
an argument buffer (ShuffleDowncallContext). We can merge these steps
together, by directly shuffling the arguments into an argument buffer on
the Java side (since we have an off-heap API). This decreases the
overall complexity of the invoker implementation significantly, since we
can drop all the code relating to shuffle recipes.
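To make points 1, 3 and 4 a bit more concrete, here is a minimal Java
sketch. It is not the code from the prototype: the ABIDescriptor here is
reduced to two register lists plus stack alignment and shadow space, Move
is a hypothetical stand-in for a single binding operation, and a direct
ByteBuffer stands in for the off-heap argument buffer. Giving every
register one 8-byte slot is also a simplification (vector registers are
wider in reality). The Windows x64 register names and stack parameters
are the only part taken from a real ABI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;

public class InvokerSketch {

    /** Hypothetical ABI description from which the argument buffer layout is derived. */
    public static final class ABIDescriptor {
        public final List<String> intArgRegs;
        public final List<String> floatArgRegs;
        public final int stackAlignment; // in bytes
        public final int shadowSpace;    // in bytes

        public ABIDescriptor(List<String> intArgRegs, List<String> floatArgRegs,
                             int stackAlignment, int shadowSpace) {
            this.intArgRegs = intArgRegs;
            this.floatArgRegs = floatArgRegs;
            this.stackAlignment = stackAlignment;
            this.shadowSpace = shadowSpace;
        }

        /** Assumed layout: integer register slots first, then float register slots, 8 bytes each. */
        public int slotOffset(String reg) {
            int i = intArgRegs.indexOf(reg);
            if (i >= 0) return i * 8;
            int f = floatArgRegs.indexOf(reg);
            if (f >= 0) return (intArgRegs.size() + f) * 8;
            throw new IllegalArgumentException("not an argument register: " + reg);
        }

        public int bufferSize() {
            return (intArgRegs.size() + floatArgRegs.size()) * 8;
        }
    }

    /** Hypothetical binding operation: move Java argument 'argIndex' into register 'reg'. */
    public static final class Move {
        public final int argIndex;
        public final String reg;

        public Move(int argIndex, String reg) {
            this.argIndex = argIndex;
            this.reg = reg;
        }
    }

    /** Argument registers, stack alignment and shadow space of the Windows x64 C ABI. */
    public static ABIDescriptor windowsX64() {
        return new ABIDescriptor(
                Arrays.asList("rcx", "rdx", "r8", "r9"),
                Arrays.asList("xmm0", "xmm1", "xmm2", "xmm3"),
                16, 32);
    }

    /** Shared interpreter: writes the arguments straight into an off-heap buffer, no long[] step. */
    public static ByteBuffer shuffle(ABIDescriptor abi, List<Move> recipe, Object... args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(abi.bufferSize()).order(ByteOrder.nativeOrder());
        for (Move m : recipe) {
            Object arg = args[m.argIndex];
            int offset = abi.slotOffset(m.reg);
            if (arg instanceof Double) {
                buf.putDouble(offset, (Double) arg);
            } else if (arg instanceof Long) {
                buf.putLong(offset, (Long) arg);
            } else if (arg instanceof Integer) {
                buf.putLong(offset, ((Integer) arg).longValue());
            } else {
                throw new IllegalArgumentException("unsupported carrier: " + arg.getClass());
            }
        }
        return buf;
    }

    public static void main(String[] args) {
        ABIDescriptor abi = windowsX64();
        // void f(long, double): Windows assigns registers by position, so the double goes in xmm1.
        List<Move> recipe = Arrays.asList(new Move(0, "rcx"), new Move(1, "xmm1"));
        ByteBuffer buf = shuffle(abi, recipe, 42L, 1.5);
        System.out.println("rcx slot  = " + buf.getLong(abi.slotOffset("rcx")));
        System.out.println("xmm1 slot = " + buf.getDouble(abi.slotOffset("xmm1")));
    }
}
```

The idea is that a different ABI would only supply a different
ABIDescriptor and a different recipe, while the shuffle loop stays
shared. A real binding recipe would need more operations than Move, to
cover what the ABI-specific boxing/unboxing code does today.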
I've been experimenting with these ideas, and have a prototype for
downcalls on Windows [2]. For this I copied the relevant UNI classes to
a separate `programmable` package and made the relevant changes there,
since some of the code was shared with UniversalUpcallHandler. I've also
preemptively removed the old UNI code (for x86) to show roughly how much
code would be removed by switching to the new invoker API. I want to
continue the experiment for upcalls as well, after which more old code
could be removed, namely Argument, ArgumentBinding, CallingSequence
(old), CallingSequenceBuilder (old), Storage, StorageClass, SharedUtils
(mostly) and UniversalAdapter.
How do these ideas sound? I'm mostly interested in whether this is
flexible enough to support AArch64 and SysV. After the upcall support, I
can look into porting the other two ABIs as well.
Thanks,
Jorn
[1] :
https://github.com/openjdk/panama/blob/foreign-abi/src/hotspot/cpu/x86/universalNativeInvoker_x86.cpp#L73
[2] :
https://github.com/openjdk/panama/compare/foreign-abi...JornVernee:prog-back-no-old