jextract C++ support

Rel enatai at proton.me
Mon May 22 03:12:49 UTC 2023


> But I believe some more robust
> analysis should be made to understand exactly how many APIs can be
> supported in this "simple" fashion.

Yes, I have started gathering such an analysis here: https://github.com/enatai/panamaexperiments
Currently there is only one happy case [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp], which is the Point2d class from your foo.hpp file.
With your changes in jextract (cxx branch), the bindings were generated properly and HappyTests [https://github.com/enatai/panamaexperiments/blob/main/cppexperiments/src/test/java/cppexperiments/HappyTests.java] passes.

Next, I plan to add some "dynamic dispatch" use cases, in particular to experiment with:

> and you want to call a virtual method, how is that
> supposed to work?
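
A minimal sketch of what such a use case could look like on the C++ side. The names Shape, Square, Rect and the shim_* functions are hypothetical; they are not part of happy.hpp or of the cxx branch. The point it tries to illustrate: a binding that targets the mangled symbol of one concrete method always calls exactly that function, and a pure virtual method without a definition typically has no symbol to bind to at all, whereas an extern "C" wrapper compiled as C++ lets the compiler emit the vtable lookup.

```cpp
#include <cassert>

// Hypothetical class tree, used for illustration only.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;  // resolved through the vtable at run time
};

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }
private:
    double side_;
};

class Rect : public Shape {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }
private:
    double w_;
    double h_;
};

// Plain C entry points of a hypothetical shim. Their symbol names are not
// mangled, so a C-level binding can look them up directly.
extern "C" {
    Shape* shim_new_square(double side) { return new Square(side); }
    Shape* shim_new_rect(double w, double h) { return new Rect(w, h); }

    double shim_shape_area(const Shape* self) {
        assert(self != nullptr);
        return self->area();  // the C++ compiler performs the virtual call here
    }

    void shim_shape_delete(Shape* self) { delete self; }
}
```

With this shape of API the caller only ever holds an opaque Shape pointer: shim_shape_area returns 4.0 for a pointer obtained from shim_new_square(2.0) and 6.0 for one obtained from shim_new_rect(2.0, 3.0), without the caller knowing the concrete type. Doing the same without a shim would mean reading the vtable pointer from the object and following the ABI-specific slot layout, which is the part I want to experiment with.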


------- Original Message -------
On Wednesday, May 17th, 2023 at 8:36 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:


> I don't disagree with any of your points. But I believe some more robust
> analysis should be made to understand exactly how many APIs can be
> supported in this "simple" fashion.
> 
> While it's true that templates and inline functions are just "more code"
> that is generated at compile-time which the shared library knows nothing
> about (we have this issue even for function-like macros in C), it is
> also true that some C++ libraries do tend to use these features somewhat
> heavily.
> 
> Putting these aside, I think the lack of dynamic dispatch is, on the
> whole, the thing that worries me the most. If a library defines a complex
> tree of classes, and you want to call a virtual method, how is that
> supposed to work? Other tools share similar issues:
> 
> https://github.com/rust-lang/rust-bindgen/issues/1309
> 
> Which then leads to open-ended issues like this:
> 
> https://github.com/rust-lang/rust-bindgen/issues/27
> 
> So, while I'm sympathetic with what you say, I think we have to also be
> realistic about what we can achieve with this approach. That said, if it
> turns out that a non-trivial number of C++ libraries are just C
> libraries in disguise, something like this might "work".
> 
> Maurizio
> 
> 
> On 17/05/2023 03:29, Rel wrote:
> 
> > Yes, a shim lib will help to overcome many issues and create "decent" support,
> > but let's be honest, it has issues:
> > 
> > 1. Users (jextract) need to compile it
> > 2. Users (jextract) need to package the shim lib together with their application
> > 3. If they package it, then every time the application starts they will need to unpack it again, which will affect application startup time
> > 4. Users need to maintain shim libs for each platform
> > 
> > And now imagine a user who has a C++ library from which they want to call a single method X::f. It is not a template, it is not inline, it is a simple method whose symbol is present in the .so library itself.
> > 
> > The current FLOW using FFM for creating such an X.java would be like:
> > - define the X layout
> > - write a binding for the X ctor
> > - write a binding for X::f
> > - copy comments to the method, if any
> > 
> > What is good about this FLOW is that users don't need to deal with all the side effects of the shim lib listed above. And they don't have to, as long as they don't use any "static" features from that C++ library.
> > 
> > I thought that for C++, jextract can define bindings for what is possible (without any shim lib). Later, if users decide that they need some extra "static" features of C++ library, they can bring shim lib (or even use JavaCPP from the start).
> > 
> > Looking at what Maurizio shared, it seems that we can already let jextract automate the FLOW above and extract bindings for what is possible (just writing the X layout alone, manually, may not be an easy thing to do).
> > 
> > ------- Original Message -------
> > On Tuesday, May 16th, 2023 at 10:52 AM, Maurizio Cimadamore maurizio.cimadamore at oracle.com wrote:
> > 
> > > Hi
> > > I'd describe C++ more as a sort of ongoing exploration at the moment
> > > (but, our priorities lie in the finalization of the FFM API).
> > > 
> > > Adding some basic support for it is doable - name mangling isn't (as
> > > Manuel says) the biggest concern - after all, libclang gives us all the
> > > correct mangled names, so it's easy to generate a downcall method handle
> > > targeting a mangled symbol name, but expose it as a "nice-looking"
> > > source-like name.
> > > 
> > > A very basic PoC which adds some C++ support can be found here [1]. This
> > > is the result of half a day of hacking on the jextract code, so it is by
> > > no means complete. I'm sharing it here mostly for "educational
> > > purposes", so that I can talk about what I learned from it :-) While "it
> > > works", as noted, there are many things that leave much to be desired:
> > > 
> > > * templates do not work correctly
> > > * dynamic dispatch is not supported
> > > * everything that is "inline" doesn't work
> > > * (probably way more stuff, like exceptions, etc.)
> > > 
> > > Some (all?) of these limitations are shared across all the tools which
> > > share a similar approach - e.g. Rust's bindgen [2].
> > > 
> > > My personal feeling is that C++ is too much of a stretch for an approach
> > > that targets C++ directly (as done in my patch). As John has noted in
> > > this document [3], adding "decent" support for C++ would require
> > > jextract to generate a shim library on the side, which would help Java
> > > clients perform complex C++ operations which either rely on the
> > > compiler, or the runtime (or both).
> > > 
> > > There might be more than one way to emit this shim library - one would
> > > be to actually compile it and then add a dependency on it from the
> > > generated bindings (that's the JavaCPP [4] approach). Another approach
> > > could be to embed compiled code, in some way, directly into the bindings
> > > themselves - then at runtime turn the compiled code into a memory
> > > segment, and make it executable. That seems more complex, and I'm not
> > > sure if it's worth it (but wanted to list the option for completeness). In
> > > that spirit, I also note how there exist some macro assembler options
> > > written using FFM API [5] which might (or not!) play a role in the
> > > translation strategy. Again, mostly jotting some thoughts.
> > > 
> > > No matter which approach is chosen, I think one of the first problems
> > > we would need to address is some way to "lower" a C++ library into
> > > plain C, so as to automate the generation of this shim library (which we
> > > can then link against using FFM API). And, while there have been many
> > > experiments in this area over the years, I didn't come across anything
> > > that seemed "up to date", or directly usable by us. So perhaps I'd
> > > suggest to start from there? Note that that could even be a separate
> > > tool (which then you run jextract against, as usual).
> > > 
> > > [1] - https://github.com/openjdk/jextract/compare/panama...mcimadamore:jextract:cxx?expand=1
> > > [2] - https://rust-lang.github.io/rust-bindgen/cpp.html
> > > [3] - https://cr.openjdk.org/~jrose/panama/cppapi.cpp.txt
> > > [4] - https://github.com/bytedeco/javacpp
> > > [5] - https://github.com/YaSuenag/ffmasm
> > > 
> > > On 15/05/2023 02:25, Rel wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I would like to know how to participate in C++ support for jextract.
> > > > Watching the Project Panama video
> > > > (https://inside.java/2023/04/18/levelup-panama/), Paul mentioned that
> > > > C++ is in the plans.
> > > > Do we have someone working on it already, so I can sync up on what
> > > > the plan is and where I can help?
> > > > In particular:
> > > > - will it be part of jextract or maybe jextract++?
> > > > - will it use clang or something else? if clang, then which interface
> > > > https://clang.llvm.org/docs/Tooling.html
> > > > 
> > > > There are many things to be done for C++ support, but if I pick the
> > > > most basic one, like symbols: in C++ they are mangled, so the current jextract
> > > > linking logic will need to be changed. Do you think modifying
> > > > NameMangler to store those mangled C++ symbols would be the right approach?
> > > > 
> > > > Regards,

