Call foreign member functions using Panama

John Rose john.r.rose at oracle.com
Wed Nov 30 00:38:09 UTC 2022


On 28 Nov 2022, at 3:11, Maurizio Cimadamore wrote:

> Hi,
> as others have pointed out in this thread, while supporting C++ from 
> jextract/Panama is not impossible, there are several obstacles to get 
> there. The biggest one is that the C++ platform compiler plays a very 
> important role in how things get compiled (e.g. which mangled name an 
> overloaded member function will get, among other things). Often, 
> high-level parsing libraries such as libclang do not offer enough 
> information for jextract to be able to reconstruct the full picture. 
> Even ignoring low-level details such as vtables and inline functions, 
> when we have tried to create a small wrapper around jextract to 
> "lower" C++ code into C we ran into several problems esp. around 
> templates, as the clang API was not exposing information about 
> template instantiation correctly.

Thanks, Maurizio.  This is a good discussion to revisit from time to 
time.  I’d like to give some more details here about our thinking 
about addressing API points beyond the system ABI.  Please view it as a 
very speculative roadmap of possible futures, not as a description of 
the present state of Panama technology currently released in the 
OpenJDK.

A basic tactic that would serve many (not all) use cases would be to 
have jextract emit bindings for C++ API points, with arbitrarily 
generated names and plain-C ABI linkage.  They would be coded in C++ 
source code as tiny functions with C linkage, one function per C++ API 
point.  They would have to be separately compiled by the C++ compiler on 
each platform.  The jextract tool and Panama loader would have to manage 
synthetic DLL assets containing the code for these.
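
As a rough illustration of this tactic, here is a minimal sketch of what such generated stubs might look like. The `Counter` class and the `jx_NNNN_...` naming scheme are invented for this example; they are assumptions, not anything jextract emits today. Each stub has C linkage, so an overloaded member function gets two distinct, unmangled symbols.

```cpp
#include <cassert>

// Hypothetical C++ API to be bound; it stands in for a real library header.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    int add(int n) { value_ += n; return value_; }
    int add(int a, int b) { return add(a + b); }  // overload of add
    int value() const { return value_; }
private:
    int value_;
};

// One tiny stub per C++ API point. The names are arbitrary (an assumed
// "jx_NNNN_" scheme); C linkage means no mangling, so a plain-C downcall
// can find each symbol by name.
extern "C" {

void* jx_0001_Counter_new(int start) {
    return new Counter(start);
}

void jx_0002_Counter_delete(void* self) {
    delete static_cast<Counter*>(self);
}

int jx_0003_Counter_add_i(void* self, int n) {
    assert(self != nullptr);
    return static_cast<Counter*>(self)->add(n);
}

int jx_0004_Counter_add_ii(void* self, int a, int b) {
    assert(self != nullptr);
    return static_cast<Counter*>(self)->add(a, b);
}

int jx_0005_Counter_value(const void* self) {
    assert(self != nullptr);
    return static_cast<const Counter*>(self)->value();
}

}  // extern "C"
```

Compiled into a shared library by the platform C++ compiler, these five symbols could then be looked up and called like any other C function, which is the part Panama already handles.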

(High performance users will balk at having everything go through tiny 
stub functions.  For many users, though, it would be helpful to remove 
the barriers, even if speedbumps are still present.  And progress in 
this direction could be followed by work to remove the speedbumps, 
pleasing more users.)

I think eventually we will try such a thing, again.  (Again, since we 
have tried it already, which is why we know what the problems are.)  As 
Maurizio says, there are at least two or three independent issues here:

1. Getting clang to cough up the details of the C++ API points to 
jextract.  Apparently this is difficult, because while the C metadata is 
stable and complete, the C++ metadata is not.  (Last I heard.)  This 
seems to require some love from the core developers of clang.

2. Getting the user to give enough guidance at jextract time to tell the 
system which template instances to build.  (Hard problem, perhaps not 
possible to solve in complete generality.)

3. Having jextract and the Panama library (and JVM) agree on how to 
package the synthetic DLL assets created for C++ API points, especially 
if they don’t have stable names in some expanded ABI.

By “expanded ABI” I mean something akin to the system C ABI, but 
with extra features, such as extra register usage.  Some work on Panama 
has prototyped mechanisms for building method handles to any “roll 
your own” ABI; this hasn’t been productized because it is a niche 
functionality.  For example, such mechanisms would be useful for API 
points that make heavy use of SIMD registers, heavier use than is 
contemplated by the standard ABI.

An alternative to 3 and its synthetic DLL assets would be to deliver 
some other form of asset, such as C++ source code, assembly code, 
bitcode, etc., which the JVM could load and transform on the fly to 
something it can hook a method handle into.  I pick on synthetic DLLs 
because that seems the most straightforward way to go, and other 
projects have done similar things.  But recent very cool work makes it 
clear there may be a role, in Panama, for load-time assembly of 
callable resources, perhaps as an alternative to loading synthetic DLLs 
created by jextract.

Notice that, even if we had a stable C++ ABI to work with, all three 
challenges would still be plenty hard.

Here is a sketch of one way to boil down C++ API points (given 
sufficient metadata) into plain-C API points, which I made in 2016, 
shortly after Panama started.  It doesn’t require anybody but the C++ 
compiler to know about object layouts or v-tables or calling sequences:

http://cr.openjdk.java.net/~jrose/panama/cppapi.cpp.txt
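
The linked file is not reproduced here. The following is an independent, minimal sketch of the same general idea, with invented names (`Shape`, `Rect`, `jx_Shape`): the caller holds only an opaque handle, and every operation that depends on object layout, v-tables, or calling sequences stays inside code compiled by the C++ compiler.

```cpp
#include <cassert>

// Hypothetical C++ library types with virtual dispatch.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

class Rect : public Shape {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
private:
    double w_, h_;
};

extern "C" {

// Opaque handle type: declared but never defined, so callers cannot
// depend on the object's size or layout.
typedef struct jx_Shape jx_Shape;

jx_Shape* jx_Rect_new(double w, double h) {
    Shape* s = new Rect(w, h);  // derived-to-base conversion done by the C++ compiler
    return reinterpret_cast<jx_Shape*>(s);
}

double jx_Shape_area(const jx_Shape* self) {
    assert(self != nullptr);
    // The v-table lookup happens here, inside compiled C++ code.
    return reinterpret_cast<const Shape*>(self)->area();
}

void jx_Shape_delete(jx_Shape* self) {
    // The virtual destructor picks the right derived destructor.
    delete reinterpret_cast<Shape*>(self);
}

}  // extern "C"
```

On the Java side such a handle would be just an address passed back into the stubs; nothing there would need to model the class.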

Regarding Problem 2 (user advice on templates), notice that a solution 
to that problem would also quickly give rise to a solution for 
delivering API points for C inline functions, C function-like macros, 
and C object-like macros.  The point is that if the user is giving 
advice to jextract about template instances, the user can also give 
advice about those other constructs (perhaps with a bit more type 
information), and the whole mess can be funneled down into a synthetic 
DLL with C API points.  Just imagine something like a template-instance 
declaration, but which names a macro or C inline function, with any 
required non-obvious type information.  It could be spelled something 
like `__JXBIND FILE* __JXCMACRO_stdin;` or `__JXBIND int 
__JXCMACRO_max(int,int);`, and the latter could be instantiated as 
several overloaded bindings.  A small header file, stirred into the 
cauldron of jextract input, would be a natural place to position such ad 
hoc advice from the user.
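
To make this concrete, here is a minimal sketch of the stub source that such advice might expand into. Everything named here is an assumption for illustration: `DEMO_MAX` stands in for a `max` macro, `demo_clamp` for a C inline function, `demo_square` for a template, and the `jx_...` symbol names replace the `__JX...` spellings above, since identifiers containing a double underscore are reserved in C++.

```cpp
#include <cassert>
#include <cstdio>

// Stand-ins for header constructs that have no linkable symbol of their own.
#define DEMO_MAX(a, b) ((a) > (b) ? (a) : (b))           // function-like macro

static inline int demo_clamp(int x, int lo, int hi) {     // C-style inline function
    return x < lo ? lo : (x > hi ? hi : x);
}

template <typename T>
T demo_square(T x) { return x * x; }                      // C++ template

// Stubs that user advice might request, one per binding.
extern "C" {

// Object-like macro: the standard specifies stdin as a macro.
std::FILE* jx_macro_stdin(void) { return stdin; }

// One function-like macro instantiated as two typed bindings.
int jx_macro_DEMO_MAX_int(int a, int b) { return DEMO_MAX(a, b); }
double jx_macro_DEMO_MAX_double(double a, double b) { return DEMO_MAX(a, b); }

// Inline function given an out-of-line, linkable entry point.
int jx_inline_demo_clamp(int x, int lo, int hi) { return demo_clamp(x, lo, hi); }

// Template instance chosen by the user.
double jx_tmpl_demo_square_double(double x) { return demo_square<double>(x); }

}  // extern "C"
```

The type information the user supplies (here `int` and `double`) is exactly what the macro itself lacks, which is why some advice at jextract time seems unavoidable.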

I do hope the three hard challenges mentioned above will eventually be 
solved at some stage in Panama.  Probably one at a time, and probably in 
the order given, although work on Problem 2 could start without delving 
into clang internals very deeply.  Problem 1 is the big blocker, and 
requires clang expertise we don’t have in Panama.  Perhaps someone in 
the community will take a run at one or more?

— John

P.S. I know all three problems are soluble because in the ‘90s I wrote 
a C++ binding generator that solved them, for a single platform at a 
time, for an embedded Scheme interpreter I was working on.  (Instead of 
clang, I wrote my own C++ grammar and analyzer.  That was a nightmare, 
but it opened up some other nice daydreams.)  Like Panama, this system 
doubled down on dynamic generation of call sites to ABI points, as 
directed by metadata either extracted from header files or hand-coded 
PINVOKE-style by the user, and it’s not a coincidence that Panama 
resembles it in those ways.  Making C++ subclasses dynamically from 
Scheme code, with lambdas for overrides, and then feeding them to 
unsuspecting C++ libraries, was fun and useful.  It was only an internal 
technology at Sun; the tiny amount of public information about it can be 
found at http://rosehome.org/esh/.  I moved away from it to Java 
quickly.

P.P.S. In view of the above facts, Samuel Audet’s simple statement 
that “anything outside platform ABIs for C is out of scope for 
Panama” appears misleadingly absolute.  The C system ABIs are a sweet 
spot, not a limitation, and Panama can today, or may tomorrow, do lots 
of non-obvious tricks with ABIs that let us reach many families of API 
points beyond C.