Call foreign member functions using Panama
John Rose
john.r.rose at oracle.com
Wed Nov 30 00:38:09 UTC 2022
On 28 Nov 2022, at 3:11, Maurizio Cimadamore wrote:
> Hi,
> as others have pointed out in this thread, while supporting C++ from
> jextract/Panama is not impossible, there are several obstacles to get
> there. The biggest one is that the C++ platform compiler plays a very
> important role in how things get compiled (e.g. which mangled name an
> overloaded member function will get, among other things). Often,
> high-level parsing libraries such as libclang do not offer enough
> information for jextract to be able to reconstruct the full picture.
> Even ignoring low-level details such as vtables and inline functions,
> when we have tried to create a small wrapper around jextract to
> "lower" C++ code into C we ran into several problems esp. around
> templates, as the clang API was not exposing information about
> template instantiation correctly.
Thanks, Maurizio. This is a good discussion to revisit from time to
time. I’d like to give some more details here about our thinking
about addressing API points beyond the system ABI. Please view it as a
very speculative roadmap of possible futures, not as a description of
the present state of Panama technology currently released in the
OpenJDK.
A basic tactic that would serve many (not all) use cases would be to
have jextract emit bindings for C++ API points, with arbitrarily
generated names and plain-C ABI linkage. They would be coded in C++
source code as tiny functions with C linkage, one function per C++ API
point. They would have to be separately compiled by the C++ compiler on
each platform. The jextract tool and Panama loader would have to manage
synthetic DLL assets containing the code for these.
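As a minimal sketch of that tactic, assume a hypothetical C++ class `Counter` with an overloaded member function `add`. The class and the stub names (`jx_stub_0001` and so on) are invented here for illustration; they stand in for whatever arbitrary names a tool might generate, and nothing like this is emitted by jextract today.

```cpp
// Hypothetical C++ API that a binding tool would be asked to expose.
class Counter {
    long total_ = 0;
public:
    void add(int n)    { total_ += n; }
    void add(double d) { total_ += static_cast<long>(d); }  // truncates toward zero
    long total() const { return total_; }
};

// One tiny stub per C++ API point, each with C linkage and an arbitrary name.
// Only the C++ compiler needs to know about mangling, object layout, and
// overload resolution; a client sees three ordinary C symbols.
extern "C" {
    void jx_stub_0001(Counter* self, int n)    { self->add(n); }
    void jx_stub_0002(Counter* self, double d) { self->add(d); }
    long jx_stub_0003(const Counter* self)     { return self->total(); }
}
```

The overloads of `add` are told apart by the C++ compiler when the stub file is compiled, so the client never has to reconstruct a mangled name.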
(High performance users will balk at having everything go through tiny
stub functions. For many users, though, it would be helpful to remove
the barriers, even if speedbumps are still present. And progress in
this direction could be followed by work to remove the speedbumps,
pleasing more users.)
I think eventually we will try such a thing, again. (Again, since we
have tried it already, which is why we know what the problems are.) As
Maurizio says, there are at least two or three independent issues here:
1. Getting clang to cough up the details of the C++ API points to
jextract. Apparently this is difficult, because while the C metadata is
stable and complete, the C++ metadata is not. (Last I heard.) This
seems to require some love from the core developers of clang.
2. Getting the user to give enough guidance at jextract time to tell the
system which template instances to build. (Hard problem, perhaps not
possible to solve in complete generality.)
3. Having jextract and the Panama library (and JVM) agree on how to
package the synthetic DLL assets created for C++ API points, especially
if they don’t have stable names in some expanded ABI.
By “expanded ABI” I mean something akin to the system C ABI, but
with extra features, such as extra register usage. Some work on Panama
has prototyped mechanisms for building method handles to any “roll
your own” ABI; this hasn’t been productized because it is a niche
functionality. For example, such mechanisms would be useful for API
points that make heavy use of SIMD registers, heavier use than is
contemplated by the standard ABI.
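Returning to issue 2 in the list above, here is a rough sketch of what "telling the system which template instances to build" could boil down to on the C++ side. The template `clamp_to` and the stub names are hypothetical, and the way the user would actually spell this advice is left open.

```cpp
// Hypothetical function template from a library header.
template <typename T>
T clamp_to(T value, T lo, T hi) {
    return value < lo ? lo : (hi < value ? hi : value);
}

// The user's advice, expressed as explicit instantiation definitions:
// these are the only instances the synthetic DLL will contain.
template int    clamp_to<int>(int, int, int);
template double clamp_to<double>(double, double, double);

// One C-linkage stub per requested instance (names invented for illustration).
extern "C" int jx_clamp_to_int(int v, int lo, int hi) {
    return clamp_to<int>(v, lo, hi);
}
extern "C" double jx_clamp_to_double(double v, double lo, double hi) {
    return clamp_to<double>(v, lo, hi);
}
```

An instance the user did not ask for (say `clamp_to<long>`) simply has no binding, which is one reason the problem may not be solvable in complete generality.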
An alternative to 3 and its synthetic DLL assets would be to deliver
some other form of asset, such as C++ source code, assembly code,
bitcode, etc., which the JVM could load and transform on the fly to
something it can hook a method handle into. I pick on synthetic DLLs
because that seems the most straightforward way to go, and other
projects have done similar things. But recent very cool work makes it
clear there may be a role, in Panama, for load-time assembly of
callable resources, perhaps as an alternative to loading synthetic DLLs
created by jextract.
Notice that, even if we had a stable C++ ABI to work with, all three
challenges would still be plenty hard.
Here is a sketch of one way to boil down C++ API points (given
sufficient metadata) into plain-C API points, which I made in 2016,
shortly after Panama started. It doesn’t require anybody but the C++
compiler to know about object layouts or v-tables or calling sequences:
http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt
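The following does not reproduce the linked file; it is an independent, minimal sketch of the same general idea, under the assumption of a hypothetical class hierarchy `Shape`/`Rect`. Construction, a virtual call, and destruction are each lowered to a plain-C function operating on an opaque handle, so layout and v-table details stay inside the C++ compiler's output.

```cpp
// Hypothetical C++ API with a virtual member function.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rect : public Shape {
    double w_, h_;
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
};

extern "C" {
    // Opaque handle type: the client only ever holds an address.
    typedef struct jx_Shape jx_Shape;

    // Constructor lowered to a C function returning a handle.
    jx_Shape* jx_Rect_new(double w, double h) {
        return reinterpret_cast<jx_Shape*>(static_cast<Shape*>(new Rect(w, h)));
    }
    // Virtual call: dispatch through the v-table happens inside the stub.
    double jx_Shape_area(const jx_Shape* s) {
        return reinterpret_cast<const Shape*>(s)->area();
    }
    // Destructor lowered to a C function.
    void jx_Shape_delete(jx_Shape* s) {
        delete reinterpret_cast<Shape*>(s);
    }
}
```

This sketch ignores exceptions; a real stub would presumably need to catch them at the C boundary, since letting a C++ exception propagate into a foreign caller is not safe.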
Regarding Problem 2 (user advice on templates), notice that a solution
to that problem would also quickly give rise to a solution for
delivering API points for C inline functions, C function-like macros,
and C object-like macros. The point is that if the user is giving
advice to jextract about template instances, the user can also give
advice about those other constructs (perhaps with a bit more type
information), and the whole mess can be funneled down into a synthetic
DLL with C API points. Just imagine something like a template-instance
declaration, but which names a macro or C inline function, with any
required non-obvious type information. It could be spelled something
like `__JXBIND FILE* __JXCMACRO_stdin;` or `__JXBIND int
__JXCMACRO_max(int,int);`, and the latter could be instantiated as
several overloaded bindings. A small header file, stirred into the
cauldron of jextract input, would be a natural place to position such ad
hoc advice from the user.
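To make that concrete, here is a sketch of the stubs such advice might funnel down to. It does not implement the speculative `__JXBIND` syntax; it only shows the kind of C API points that could result. The macros `MY_MAX` and `MY_BUFSZ`, the inline function `my_square`, and all `jx_` names are hypothetical stand-ins.

```cpp
#include <cstdio>

// Stand-ins for header constructs that have no symbol in any library:
#define MY_MAX(a, b) ((a) > (b) ? (a) : (b))           // function-like macro
#define MY_BUFSZ 4096                                   // object-like macro
static inline int my_square(int x) { return x * x; }   // inline function

// Stubs that could be generated once the user supplies the missing types.
extern "C" {
    // One function-like macro, instantiated as two typed bindings.
    int    jx_macro_MY_MAX_int(int a, int b)          { return MY_MAX(a, b); }
    double jx_macro_MY_MAX_double(double a, double b) { return MY_MAX(a, b); }

    // Object-like macros become zero-argument accessors.
    int    jx_macro_MY_BUFSZ(void) { return MY_BUFSZ; }
    FILE*  jx_macro_stdin(void)    { return stdin; }  // stdin is a macro on many platforms

    // An inline function gets an out-of-line entry point.
    int    jx_inline_my_square(int x) { return my_square(x); }
}
```

The type information in each stub signature is exactly the "non-obvious" part the user would have to supply, since a macro carries none of its own.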
I do hope the three hard challenges mentioned above will eventually be
solved at some stage in Panama. Probably one at a time, and probably in
the order given, although Problem 2 could be started on without delving
into clang internals very deeply. Problem 1 is the big blocker, and
requires clang expertise we don’t have in Panama. Perhaps someone in
the community will take a run at one or more?
— John
P.S. I know all three problems are soluble because in the ‘90s I wrote
a C++ binding generator that solved them, for a single platform at a
time, for an embedded Scheme interpreter I was working on. (Instead of
clang, I wrote my own C++ grammar and analyzer. That was a nightmare,
but it opened up some other nice daydreams.) Like Panama, this system
doubled down on dynamic generation of call sites to ABI points, as
directed by metadata either extracted from header files or hand-coded
PINVOKE-style by the user, and it’s not a coincidence that Panama
resembles it in those ways. Making C++ subclasses dynamically from
Scheme code, with lambdas for overrides, and then feeding them to
unsuspecting C++ libraries, was fun and useful. It was only an internal
technology at Sun; the tiny amount of public information about it can be
found at http://rosehome.org/esh/. I moved away from it to Java
quickly.
P.P.S. In view of the above facts, Samuel Audet’s simple statement
that “anything outside platform ABIs for C is out of scope for
Panama” appears misleadingly absolute. The C system ABIs are a sweet
spot, not a limitation, and Panama can today, or may tomorrow, do lots
of non-obvious tricks with ABIs that let us reach many families of API
points beyond C.