From jvernee at openjdk.org  Mon May  1 13:29:24 2023
From: jvernee at openjdk.org (Jorn Vernee)
Date: Mon, 1 May 2023 13:29:24 GMT
Subject: RFR: 7903463: jextract generates empty padding layouts
In-Reply-To:
References:
Message-ID: <0-lAxdxJExfNdIakJGJC-IjKM01z8LVAvefsSeXeIv8=.be3ae790-ec6b-4cfd-bf04-2dc5c044dea4@github.com>

On Fri, 28 Apr 2023 15:14:33 GMT, Maurizio Cimadamore wrote:

> This simple patch fixes a bug where jextract would generate a zero-sized padding layout when handling bitfields. This behavior causes a test failure when running against JDK 21.

Marked as reviewed by jvernee (Committer).

-------------

PR Review: https://git.openjdk.org/jextract/pull/119#pullrequestreview-1407592310

From mcimadamore at openjdk.org  Tue May  2 10:47:38 2023
From: mcimadamore at openjdk.org (Maurizio Cimadamore)
Date: Tue, 2 May 2023 10:47:38 GMT
Subject: Integrated: 7903463: jextract generates empty padding layouts
In-Reply-To:
References:
Message-ID:

On Fri, 28 Apr 2023 15:14:33 GMT, Maurizio Cimadamore wrote:

> This simple patch fixes a bug where jextract would generate a zero-sized padding layout when handling bitfields. This behavior causes a test failure when running against JDK 21.

This pull request has now been integrated.
Changeset: 9feb0422
Author:    Maurizio Cimadamore
URL:       https://git.openjdk.org/jextract/commit/9feb0422b7751d0c2078b4b4255bf3d73b2ace66
Stats:     3 lines in 1 file changed: 2 ins; 0 del; 1 mod

7903463: jextract generates empty padding layouts

Reviewed-by: jvernee

-------------

PR: https://git.openjdk.org/jextract/pull/119

From maurizio.cimadamore at oracle.com  Mon May  8 18:26:04 2023
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 8 May 2023 19:26:04 +0100
Subject: GPL license
In-Reply-To: <5a5056ff-a079-8560-3562-b496ac2c926e@oracle.com>
References: <7D4F17C4-5E11-452D-AE55-D49CCA660A39@gmail.com>
 <32AAECAB-1887-4E3D-BCA4-DC01C1E11C32@pm.me>
 <5a5056ff-a079-8560-3562-b496ac2c926e@oracle.com>
Message-ID:

Hi,
we have updated the jextract download page [1] with some clarifications
regarding what is affected by the jextract license:

> * The output of jextract does not result in the generated output
>   being affected by the jextract license.

I hope this addresses the concerns.

Cheers
Maurizio

[1] - https://jdk.java.net/jextract/

On 21/03/2023 23:29, Maurizio Cimadamore wrote:
> Thanks for all the comments. It is good (and realistic) feedback,
> which I appreciate.
>
> As you can imagine, there will need to be some kind of internal
> discussion on this, to see what options we have available (if any).
>
> Maurizio
>
>
> On 21/03/2023 22:02, Shane Pearlman wrote:
>> Regarding Sebastian's suggestion, I'm not sure this approach is
>> compatible with the current, generic GPL license:
>>>> Thus I would support publishing the templates under a permissive
>>>> license.
>> Here's the GPL exception used by GNU's GCC, so that generated
>> binaries are not "infected" by the copyleft
>> [https://github.com/gcc-mirror/gcc/blob/fac64bf456cf56f0c6309d21286b7eaf170f668e/COPYING.RUNTIME]:
>>>> This GCC Runtime Library Exception ("Exception") is an additional
>>>> permission under section 7 of the GNU General Public License,
>>>> version 3 ("GPLv3"). It applies to a given file (the "Runtime
>>>> Library") that bears a notice placed by the copyright holder of the
>>>> file stating that the file is governed by GPLv3 along with this
>>>> Exception.
>>>> When you use GCC to compile a program, GCC may combine portions of
>>>> certain GCC header files and runtime libraries with the compiled
>>>> program. The purpose of this Exception is to allow compilation of
>>>> non-GPL (including proprietary) programs to use, in this way, the
>>>> header files and runtime libraries covered by this Exception.
>> An exception like that (translated for Java) could permit jextract
>> generated code to be used freely. The OpenJDK may not have such an
>> exception, due to the fact that prior to Substrate/Graal native
>> compilation, there was no static linking in Java which tends to mix
>> one's code altogether with the standard library and runtime itself
>> into a binary executable. I don't believe Oracle has yet addressed
>> that problem for the open source version of Graal native image.
>>
>> To retain compatibility with GPL (which would permit jextract code to
>> be integrated or distributed with OpenJDK in the future) and allow
>> free use of jextract code elsewhere, I suggest adopting a dual
>> license such as those used by javacpp and jna.
>>
>> javacpp
>> [https://github.com/bytedeco/javacpp/blob/ea4e5f7ca6556455f8e1117ae369f33ca92cd6ca/LICENSE.txt]:
>>>> You may use this work under the terms of either the Apache License,
>>>> Version 2.0, or the GNU General Public License (GPL), either
>>>> version 2, or any later version, with "Classpath" exception
>>>> (details below).
>> jna
>> [https://github.com/java-native-access/jna/blob/e96f30192e9455e7cc4117cce06fc3fa80bead55/LICENSE]:
>>>> Java Native Access (JNA) is licensed under the LGPL, version 2.1 or
>>>> later, or (from version 4.0 onward) the Apache License, version 2.0.
>> Under a dual license, users would be permitted to license jextract
>> under the more permissive Apache license rather than the GPL. Apache
>> still requires distributors of compiled, generated jextract bindings
>> in proprietary projects to provide attribution and a copy of the
>> Apache license required by clause 4A:
>>>> You must give any other recipients of the Work or Derivative Works
>>>> a copy of this License
>> In my opinion, the ideal solution is to adopt a dual license for the
>> entire jextract repo, possibly with an exception allowing generated
>> bindings to be used freely without attribution or distribution of the
>> Apache license, however that burden is small enough that it may not
>> warrant invention of a novel, non-standard licensing clause.
>>
>> - Shane
>>
>>> On Mar 21, 2023, at 1:02 PM, Shane Pearlman
>>> wrote:
>>>
>>> Yes Sebastian, that is my point precisely.
>>>
>>> As far as following OpenJDK license, the use-case is a bit different
>>> for the runtime versus a development tool like jextract.
>>>
>>> I expect most large binding projects will want to modify this code
>>> base to do things that are too frail or particular to their C or
>>> Java library to do in a general purpose CLI tool, i.e. generating
>>> Javadocs by extracting them from the target project sources,
>>> renaming functions based on patterns, generating higher level
>>> wrappers around the straight bindings, binding implementation
>>> alternatives to trade off performance against safety, or automatic
>>> type conversion.
>>>
>>> Case-in-point, there was an earlier message on this list about
>>> generating common cross-platform interfaces for the concrete, C
>>> pre-processed, platform specific jextract bindings classes. I agree
>>> with the answer: that is a library specific problem which cannot be
>>> solved in the general case, but the jextract implementation classes
>>> and libclang bindings can help the author script a custom solution
>>> if his library is conventional and large enough to justify automation.
>>>
>>> The libclang C API itself, for instance, has some "object oriented"
>>> functions that follow a naming convention and pass the first
>>> argument as the receiver "object". You can imagine if the binding
>>> project had a target codebase with very strong conventions, the
>>> author will want to take advantage of them.
>>>
>>> For reference, the SkiaSharp project has built its own code
>>> generation tool according to the structure of the skia libraries.
>>> There are several huge bindings projects from Mono/.NET and they
>>> don't all use the same tools or even the same C++ parsing
>>> frameworks. Custom tools are the real world approach to this messy
>>> task of binding, even multiple custom tools within the same
>>> organization and client language.
>>>
>>> GNU themselves give the following caveat on GPL license trade-offs
>>> [https://www.gnu.org/licenses/why-not-lgpl.html]:
>>>>> Using the ordinary GPL is not advantageous for every library.
>>>>> There are reasons that can make it better to use the Lesser GPL in
>>>>> certain cases. The most common case is when a free library's
>>>>> features are readily available for proprietary software through
>>>>> other libraries. In that case, the library cannot give free
>>>>> software any particular advantage, so it is better to use the
>>>>> Lesser GPL for that library.
>>> (I'm not arguing in favor of LGPL, but the reasoning is the same)
>>>
>>> There are some projects from various places that can substitute
>>> parts of OpenJDK under more permissive licenses, but nobody else has
>>> a complete "classpath" standard library. That's the main reason
>>> OpenJDK can get away with copyleft terms, and there's so much legacy
>>> it would probably be impossible to relicense it anyway. On the
>>> other hand, jextract is young and small and would be easy to
>>> relicense at this point. If there is a question about OpenJDK
>>> compatibility, I believe that can be solved by dual licensing under
>>> GPL *and* a permissive license.
>>>
>>> Licenses I would be happier with include: Eclipse EPL, Apache, MIT,
>>> BSD, CDDL, Mozilla MPL, etc.
>>>
>>>
>>> - Shane
>>>
>>>> On Mar 20, 2023, at 4:51 AM, Sebastian Stenzel
>>>> wrote:
>>>>
>>>>> While the template files are marked as GPLv2 (as they are checked
>>>>> into the repository), what comes out of jextract does not have any
>>>>> license header
>>>> One could argue that "what comes out of jextract" is a derivative
>>>> work of the templates. Thus I would support publishing the
>>>> templates under a permissive license.
>>>>
>>>> Cheers, Sebastian
>>>>
>>>>> On 20.03.2023 at 12:23, Maurizio Cimadamore wrote:
>>>>>
>>>>> Hi,
>>>>> jextract being GPLv2 just follows what the rest of OpenJDK is doing.
>>>>>
>>>>> I'd like to understand better what your use case is before
>>>>> commenting further: are you worried jextract will generate GPLv2
>>>>> code? Because that is NOT the case. While the template files are
>>>>> marked as GPLv2 (as they are checked into the repository), what
>>>>> comes out of jextract does not have any license header (or at
>>>>> least, that's the spirit, if you are experiencing otherwise, I'd
>>>>> say that's a bug).
>>>>>
>>>>> Does that address your concern?
>>>>>
>>>>> Regards
>>>>> Maurizio
>>>>>
>>>>>
>>>>>
>>>>> On 20/03/2023 02:11, Shane Pearlman wrote:
>>>>>> What's the reasoning for licensing a tool like this under the GPLv2?
>>>>>>
>>>>>> Code generators often need to be modified or adapted for large
>>>>>> bindings projects, and the classes in org.openjdk.jextract.impl
>>>>>> and org.openjdk.jextract.clang could be quite useful as a
>>>>>> starting point. Under the current license, however, I will
>>>>>> probably have to roll my own.
>>>>>>
>>>>>> Even the code generation template classes are under GPLv2, which
>>>>>> is enough to prevent me from using jextract generated bindings in
>>>>>> my non-GPL projects. Maybe someone's reading of the license is
>>>>>> that it is permissible, but is the uncertainty really necessary?
>>>>>>
>>>>>> That said, I am excited for the Panama project to deliver what
>>>>>> looks to be a very well designed solution to a major, decade-long
>>>>>> problem with Java.
>>>>>>
>>>>>> - Shane
>>>>
>>

From enatai at proton.me  Mon May 15 01:25:53 2023
From: enatai at proton.me (Rel)
Date: Mon, 15 May 2023 01:25:53 +0000
Subject: jextract C++ support
Message-ID:

Hi,

I would like to know how to participate in C++ support for jextract.
Watching the Project Panama video
(https://inside.java/2023/04/18/levelup-panama/), Paul mentioned that C++
is in the plans.
Do we have someone working on it already, so I can sync up on what the
plan is and where I can help?
In particular:
- will it be part of jextract or maybe jextract++?
- will it use clang or something else? If clang, then which interface:
  https://clang.llvm.org/docs/Tooling.html

There are many things to be done for C++ support, but if I pick the most
basic one, symbols: in C++ they are mangled, so the current jextract
linking logic will need to be changed. Do you think modifying NameMangler
to store those mangled C++ symbols would be the right approach?

Regards,

From manuel.bleichenbacher at gmail.com  Mon May 15 05:57:49 2023
From: manuel.bleichenbacher at gmail.com (Manuel Bleichenbacher)
Date: Mon, 15 May 2023 07:57:49 +0200
Subject: jextract C++ support
In-Reply-To:
References:
Message-ID:

FFM is designed to call code in an existing executable library file. But
many C++ libraries contain a lot of their code in source code form; just
think of inline functions, header-only classes and generics. Name mangling
is probably a small challenge compared to the other C++ challenges.

So I wonder: is there already a concept for C++ support?

On Mon, 15 May 2023 at 03:26, Rel wrote:

> Hi,
>
> I would like to know how to participate in C++ support for jextract.
> Watching Project Panama video (
> https://inside.java/2023/04/18/levelup-panama/), Paul mentioned that C++
> is in the plans.
> Do we have someone working on it already so I can sync up on what is the
> plan and where I can help?
> In particular:
> - will it be part of jextract or maybe jextract++?
> - will it use clang or something else?
> if clang then which interface
> https://clang.llvm.org/docs/Tooling.html
>
> There are many things to be done for C++ support but if I pick the most
> basic like symbols, in C++ they are mangled so current jextract linking
> logic will need to be changed. Do you think modifying NameMangler to store
> those mangled C++ symbols will be the right approach?
>
> Regards,
>

From maurizio.cimadamore at oracle.com  Tue May 16 10:52:33 2023
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Tue, 16 May 2023 11:52:33 +0100
Subject: jextract C++ support
In-Reply-To:
References:
Message-ID: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com>

Hi
I'd describe C++ support more as a sort of ongoing exploration at the
moment (but our priorities lie in the finalization of the FFM API).

Adding some basic support for it is doable - name mangling isn't (as
Manuel says) the biggest concern - after all, libclang gives us all the
correct mangled names, so it's easy to generate a downcall method handle
targeting a mangled symbol name, but expose it as a "nice-looking"
source-like name.

A very basic PoC which adds some C++ support can be found here [1]. This
is the result of half a day of hacking on the jextract code, so it is by
no means complete. I'm sharing it here mostly for "educational
purposes", so that I can talk about what I learned from it :-) While "it
works", as noted, there are many things that leave a lot to be desired:

* templates do not work correctly
* dynamic dispatch is not supported
* everything that is "inline" doesn't work
* (probably way more stuff, like exceptions, etc.)

Some (all?) of these limitations are shared across all the tools which
share a similar approach - e.g. Rust's bindgen [2].

My personal feeling is that C++ is too much of a stretch for an approach
that targets C++ directly (as done in my patch).
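
The mangled-name point made above can be sketched in plain Java. This is only an illustration and not jextract code: it assumes the mangled names have already been obtained elsewhere (for instance from libclang, as described above). The class `MangledSymbolTable`, the C++ class `X` and its two overloads of `f` are hypothetical, and the mangled strings are what an Itanium-ABI compiler would typically emit for them. Since C++ overloads share one source name but have distinct symbols, the table is keyed by source-level signature rather than by bare name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical helper: maps a source-like C++ signature to the mangled
 * symbol that a native symbol lookup would need. Not part of jextract.
 */
public final class MangledSymbolTable {
    // Insertion-ordered, so any listing of the table is stable.
    private final Map<String, String> bySignature = new LinkedHashMap<>();

    /** Records the mangled symbol for a source-level signature. */
    public void register(String signature, String mangledSymbol) {
        String previous = bySignature.putIfAbsent(signature, mangledSymbol);
        if (previous != null && !previous.equals(mangledSymbol)) {
            throw new IllegalArgumentException("conflicting symbol for " + signature);
        }
    }

    /** Returns the symbol to look up, or empty if the signature is unknown. */
    public Optional<String> symbolFor(String signature) {
        return Optional.ofNullable(bySignature.get(signature));
    }

    public static void main(String[] args) {
        MangledSymbolTable table = new MangledSymbolTable();
        // Assumed Itanium-ABI manglings for a hypothetical class X.
        table.register("X::f()", "_ZN1X1fEv");
        table.register("X::f(int)", "_ZN1X1fEi");

        // A binding would be exposed under the source-like name, while the
        // mangled string is what gets handed to the native symbol lookup.
        System.out.println(table.symbolFor("X::f(int)").orElse("<missing>"));
        System.out.println(table.symbolFor("X::g()").orElse("<missing>"));
    }
}
```

With the FFM API, the string returned by `symbolFor` is what would be passed to a `SymbolLookup` before a downcall handle is created; the Java method that wraps that handle can keep the readable name.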
As John has noted in this document [3], adding "decent" support for C++
would require jextract to generate a shim library on the side, which would
help Java clients perform complex C++ operations which either rely on the
compiler, or the runtime (or both).

There might be more than one way to emit this shim library - one would be
to actually compile it and then add a dependency on it from the generated
bindings (that's the JavaCPP [4] approach). Another approach could be to
embed compiled code, in some way, directly into the bindings themselves -
then at runtime turn the compiled code into a memory segment, and make it
executable. That seems more complex, and I'm not sure it is worth it (but
I wanted to list the option for completeness). In that spirit, I also note
that there exist some macro assembler options written using the FFM API [5]
which might (or not!) play a role in the translation strategy. Again,
mostly jotting down some thoughts.

No matter which approach is chosen, I think one of the first problems we
would need to address is some way to "lower" a C++ library into plain C,
so as to automate the generation of this shim library (which we can then
link against using the FFM API). And, while there have been many
experiments in this area over the years, I didn't come across anything
that seemed "up to date", or directly usable by us. So perhaps I'd suggest
starting from there? Note that this could even be a separate tool (which
you then run jextract against, as usual).

[1] - https://github.com/openjdk/jextract/compare/panama...mcimadamore:jextract:cxx?expand=1
[2] - https://rust-lang.github.io/rust-bindgen/cpp.html
[3] - https://cr.openjdk.org/~jrose/panama/cppapi.cpp.txt
[4] - https://github.com/bytedeco/javacpp
[5] - https://github.com/YaSuenag/ffmasm

On 15/05/2023 02:25, Rel wrote:
> Hi,
>
> I would like to know how to participate in C++ support for jextract.
> Watching Project Panama video
> (https://inside.java/2023/04/18/levelup-panama/), Paul mentioned that
> C++ is in the plans.
> Do we have someone working on it already so I can sync up on what is
> the plan and where I can help?
> In particular:
> - will it be part of jextract or maybe jextract++?
> - will it use clang or something else? if clang then which interface
> https://clang.llvm.org/docs/Tooling.html
>
> There are many things to be done for C++ support but if I pick the
> most basic like symbols, in C++ they are mangled so current jextract
> linking logic will need to be changed. Do you think modifying
> NameMangler to store those mangled C++ symbols will be the right approach?
>
> Regards,

From enatai at proton.me  Wed May 17 02:29:58 2023
From: enatai at proton.me (Rel)
Date: Wed, 17 May 2023 02:29:58 +0000
Subject: jextract C++ support
In-Reply-To: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com>
References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com>
Message-ID:

Yes, a shim lib will help to overcome many issues and create "decent"
support, but let's be honest, it has issues:

1. Users (jextract) need to compile it
2. Users (jextract) need to package the shim lib together with their
   application
   - and if they package it, then every time the application starts they
     will need to unpackage it again, and this will affect application
     startup time
4. Users need to maintain shim libs for each platform

And now imagine a user who has a C++ library from which they want to call
a single method X::f. It is not a template, it is not inline, it is a
simple method whose symbol is present in the .so library itself.

The current FLOW using FFM for creating such an X.java would be like:
- define X layout
- write a binding for X ctor
- write a binding for X::f
- copy comments to the method if any

What is good in this FLOW is that users don't need to deal with all the
side effects of the shim lib listed above.
And they don't have to, as long as they don't use any "static" features
from that C++ library.

I thought that for C++, jextract could define bindings for what is
possible (without any shim lib). Later, if users decide that they need
some extra "static" features of the C++ library, they can bring in a shim
lib (or even use JavaCPP from the start).

Looking at what Maurizio shared, it seems that we can already let jextract
automate the FLOW above and extract bindings for what is possible (just
writing the X layout alone, manually, may not be an easy thing to do).

------- Original Message -------
On Tuesday, May 16th, 2023 at 10:52 AM, Maurizio Cimadamore wrote:

> Hi
> I'd describe C++ support more as a sort of ongoing exploration at the
> moment (but our priorities lie in the finalization of the FFM API).
>
> Adding some basic support for it is doable - name mangling isn't (as
> Manuel says) the biggest concern - after all, libclang gives us all the
> correct mangled names, so it's easy to generate a downcall method handle
> targeting a mangled symbol name, but expose it as a "nice-looking"
> source-like name.
>
> A very basic PoC which adds some C++ support can be found here [1]. This
> is the result of half a day of hacking on the jextract code, so it is by
> no means complete. I'm sharing it here mostly for "educational
> purposes", so that I can talk about what I learned from it :-) While "it
> works", as noted, there are many things that leave a lot to be desired:
>
> * templates do not work correctly
> * dynamic dispatch is not supported
> * everything that is "inline" doesn't work
> * (probably way more stuff, like exceptions, etc.)
>
> Some (all?) of these limitations are shared across all the tools which
> share a similar approach - e.g. Rust's bindgen [2].
>
> My personal feeling is that C++ is too much of a stretch for an approach
> that targets C++ directly (as done in my patch).

From maurizio.cimadamore at oracle.com  Wed May 17 08:36:01 2023
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Wed, 17 May 2023 09:36:01 +0100
Subject: jextract C++ support
In-Reply-To:
References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com>
Message-ID: <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com>

I don't disagree with any of your points. But I believe some more robust
analysis should be made to understand exactly how many APIs can be
supported in this "simple" fashion.

While it's true that templates and inline functions are just "more code"
that is generated at compile-time which the shared library knows nothing
about (we have this issue even for function-like macros in C), it is also
true that some C++ libraries do tend to use these features somewhat
heavily.

Putting these aside, I think the lack of dynamic dispatch is, on the
whole, the thing that worries me the most. If a library defines a complex
tree of classes, and you want to call a virtual method, how is that
supposed to work?
Other tools share similar issues:

https://github.com/rust-lang/rust-bindgen/issues/1309

Which then leads to open-ended issues like this:

https://github.com/rust-lang/rust-bindgen/issues/27

So, while I'm sympathetic with what you say, I think we have to also be
realistic about what we can achieve with this approach. That said, if it
turns out that a non-trivial number of C++ libraries are just C libraries
in disguise, something like this might "work".

Maurizio

On 17/05/2023 03:29, Rel wrote:
> Yes, a shim lib will help to overcome many issues and create "decent"
> support, but let's be honest, it has issues:
>
> 1. Users (jextract) need to compile it
> 2. Users (jextract) need to package the shim lib together with their
>    application
>    - and if they package it, then every time the application starts they
>      will need to unpackage it again, and this will affect application
>      startup time
> 4. Users need to maintain shim libs for each platform
>
> And now imagine a user who has a C++ library from which they want to call
> a single method X::f. It is not a template, it is not inline, it is a
> simple method whose symbol is present in the .so library itself.
>
> The current FLOW using FFM for creating such an X.java would be like:
> - define X layout
> - write a binding for X ctor
> - write a binding for X::f
> - copy comments to the method if any
>
> What is good in this FLOW is that users don't need to deal with all the
> side effects of the shim lib listed above. And they don't have to, as
> long as they don't use any "static" features from that C++ library.
>
> I thought that for C++, jextract could define bindings for what is
> possible (without any shim lib). Later, if users decide that they need
> some extra "static" features of the C++ library, they can bring in a shim
> lib (or even use JavaCPP from the start).
>>> >>> Regards, From enatai at proton.me Mon May 22 03:12:49 2023 From: enatai at proton.me (Rel) Date: Mon, 22 May 2023 03:12:49 +0000 Subject: jextract C++ support In-Reply-To: <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com> References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com> Message-ID: > But I believe some more robust > analysis should be made to understand exactly how many APIs can be > supported in this "simple" fashion. Yes, I started to gather such analysis here https://github.com/enatai/panamaexperiments Currently there is only one happy case [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp] which is Point2d class from your foo.hpp file. With your changes in jextractor (cxx branch), it generated bindings properly and HappyTests [https://github.com/enatai/panamaexperiments/blob/main/cppexperiments/src/test/java/cppexperiments/HappyTests.java] passes. Next, I plan to add some "dynamic dispatch" use cases in particular to experiment with: > and you want to call a virtual method, how is that > supposed to work? ------- Original Message ------- On Wednesday, May 17th, 2023 at 8:36 AM, Maurizio Cimadamore wrote: > I don't disagree with any of your points. But I believe some more robust > analysis should be made to understand exactly how many APIs can be > supported in this "simple" fashion. > > While it's true that templates and inline function are just "more code" > that is generated at compile-time which the shared library knows nothing > about (we have this issue even for function-like macros in C), it is > also true that some C++ libraries do tend to use these features somewhat > heavily. > > Putting these aside, I think the lack of dynamic dispatch is, on the > whole, the thing the worries me the most. If a library defines a complex > tree of classes, and you want to call a virtual method, how is that > supposed to work? 
Other tools share similar issues: > > https://github.com/rust-lang/rust-bindgen/issues/1309 > > Which then leads to open-ended issues like this: > > https://github.com/rust-lang/rust-bindgen/issues/27 > > So, while I'm sympathetic with what you say, I think we have to also be > realistic about what we can achieve with this approach. That said, if it > turns out that a non-trivial number of C++ libraries are just C > libraries in disguise, something like this might "work". > > Maurizio > > > On 17/05/2023 03:29, Rel wrote: > > > Yes, shim lib will help to overcome many issues and create "decent" support > > > > but let's be honest it has issues: > > > > 1. Users (jextract) need to compile it > > 2. Users (jextract) need to package shim lib together with their application > > - and if they package it, then it means every time when application starts, they will need to unpackage it back and this will affect application startup time > > 4. Users need maintain shim libs for each platform > > > > And now imagine a user who have a C++ library from which they want to call single method X::f. It is not template, it is not inline, it is a simple method which symbol is present in the .so library itself. > > > > Current FLOW using FFM for creating such X.java would be like: > > - define X layout > > - write a binding for X ctor > > - write binding for X::f > > - copy comments to the method if any > > > > What is good in this FLOW is that users don't need to deal with all side effects of shim lib listed above. And they don't have to, as long as they don't use any "static" features from that C++ library. > > > > I thought that for C++, jextract can define bindings for what is possible (without any shim lib). Later, if users decide that they need some extra "static" features of C++ library, they can bring shim lib (or even use JavaCPP from the start). 
> > > > Looking at what Maurizio shared it seems that we can let jextract already to automate the FLOW above and extract bindings for what is possible (just writing X layout alone, manually, may not be an easy thing to do). > > > > ------- Original Message ------- > > On Tuesday, May 16th, 2023 at 10:52 AM, Maurizio Cimadamore maurizio.cimadamore at oracle.com wrote: > > > > > Hi > > > I'd describe more C++ as a sort of ongoing exploration at the moment > > > (but, our priorities lie in the finalization of the FFM API). > > > > > > Adding some basic support for it is doable - name mangling isn't (as > > > Manuel says) the biggest concern - after all, libclang gives us all the > > > correct mangled name, so it's easy to generate a downcall method handle > > > targeting a mangled symbol name, but expose it as a "nice-looking" > > > source-like name. > > > > > > A very basic PoC which adds some C++ support can be found here [1]. This > > > is the result of half a day of hacking on the jextract code, so it is by > > > no means complete. I'm sharing it here mostly for "educational > > > purposes", so that I can talk about what I learned from it :-) While "it > > > works", as noted, there are many things that leave to be desired: > > > > > > * templates do not work correctly > > > * dynamic dispatch is not supported > > > * everything that is "inline" doesn't work > > > * (probably way more stuff, like exceptions, etc.) > > > > > > Some (all?) these limitations are shared across all the tools which > > > share a similar approach - e.g. Rust's bindgen [2]. > > > > > > My personal feeling is that C++ is too much of a stretch for an approach > > > that targets C++ directly (as done in my patch). As John has noted in > > > this document [3], adding "decent" support for C++ would require > > > jextract to generate a shim library on the side, which would help Java > > > clients perform complex C++ operations which either rely on the > > > compiler, or the runtime (or both). 
> > >
> > > There might be more than one way to emit this shim library - one would
> > > be to actually compile it and then add a dependency on it from the
> > > generated bindings (that's the JavaCPP [4] approach). Another approach
> > > could be to embed compiled code, in some way, directly into the bindings
> > > themselves - then at runtime turn the compiled code into a memory
> > > segment, and make it executable. That seems more complex, and I'm not
> > > sure it is worth it (but wanted to list the option for completeness). In
> > > that spirit, I also note how there exist some macro assembler options
> > > written using FFM API [5] which might (or not!) play a role in the
> > > translation strategy. Again, mostly jotting some thoughts.
> > >
> > > No matter which approach is chosen, I think one of the first problems
> > > we would need to address is some way to "lower" a C++ library into
> > > plain C, so as to automate the generation of this shim library (which we
> > > can then link against using FFM API). And, while there have been many
> > > experiments in this area over the years, I didn't come across anything
> > > that seemed "up to date", or directly usable by us. So perhaps I'd
> > > suggest to start from there? Note that that could even be a separate
> > > tool (which then you run jextract against, as usual).
> > >
> > > [1] - https://github.com/openjdk/jextract/compare/panama...mcimadamore:jextract:cxx?expand=1
> > > [2] - https://rust-lang.github.io/rust-bindgen/cpp.html
> > > [3] - https://cr.openjdk.org/~jrose/panama/cppapi.cpp.txt
> > > [4] - https://github.com/bytedeco/javacpp
> > > [5] - https://github.com/YaSuenag/ffmasm
> > >
> > > On 15/05/2023 02:25, Rel wrote:
> > >
> > > > Hi,
> > > >
> > > > I would like to know how to participate in C++ support for jextract.
> > > > Watching Project Panama video
> > > > (https://inside.java/2023/04/18/levelup-panama/), Paul mentioned that
> > > > C++ is in the plans.
> > > > Do we have someone working on it already so I can syncup on what is
> > > > the plan and where I can help?
> > > > In particular:
> > > > - will it be part of jextract or may be jextract++?
> > > > - will it use clang or something else? if clang then which interface
> > > > https://clang.llvm.org/docs/Tooling.html
> > > >
> > > > There are many things to be done for C++ support but if I pick the
> > > > most basic like symbols, in C++ they are mangled so current jextract
> > > > linking logic will need to be changed.
Do you think modifying
> > > > NameMangler to store those mangled C++ symbols will be the right approach?
> > > >
> > > > Regards,

From maurizio.cimadamore at oracle.com Mon May 22 09:14:00 2023
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Mon, 22 May 2023 10:14:00 +0100
Subject: jextract C++ support
In-Reply-To:
References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com>
Message-ID:

On 22/05/2023 04:12, Rel wrote:
>> But I believe some more robust
>> analysis should be made to understand exactly how many APIs can be
>> supported in this "simple" fashion.
> Yes, I started to gather such analysis here https://github.com/enatai/panamaexperiments
> Currently there is only one happy case [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp] which is Point2d class from your foo.hpp file.

This is not too surprising - after all the hacky changes I shared were built around that example.

What I meant by "robust analysis" was to try and establish how many _real-world_ C++ libraries can really be tackled in such a direct approach. My feeling is "not many" - but I don't have any hard data to back up this claim.

Maurizio

From enatai at proton.me Tue May 23 04:11:34 2023
From: enatai at proton.me (Rel)
Date: Tue, 23 May 2023 04:11:34 +0000
Subject: jextract C++ support
In-Reply-To:
References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com>
Message-ID: <1ru-2D5hsy2KC3KU3GHuxq3ST_SzlZfAuiMjdPjex4aOmVVAD8oiJWPJ2egkRChVNONNGGgGSEfJxGQ-ZAbQaJ_8OzP9_z6LtOExpVbvU-0=@proton.me>

> What I meant by "robust analysis" was to try and establish how many _real-world_ C++ libraries can really be tackled in such a direct approach.

Ohh I see now, I am afraid we know the answer for this :)

Let's imagine that the number of C++ libraries which can be covered end-to-end with the "simple" approach is 0. Does it mean that we should discard it and only focus on a shim for binding all kinds of APIs? What about those cases which can be easily extracted using the "simple" approach, like Point2d?

Because I thought that we would like to do an "analysis" of what C++ use cases can/cannot be covered with the "simple" approach. For example, from your previous message I see that we are not completely sure about exceptions:

> * (probably way more stuff, like exceptions, etc.)

Similarly, for myself I would like to see what the problems with "dynamic dispatch" are. I added it to "unhappy" for now, because we expect it not to work, but I plan to test it to see what the issues are and share them here. That way anyone would be able to reproduce it and see the same results.

I guess my question now is: do we think it may be useful to know exactly how many C++ use cases can be covered with the "simple" FFM approach? If the answer is yes, then we can use panamaexperiments as a playground where we can have tests for what is covered. This (possibly?) can give us more confidence about the limitations of the "simple" approach and how far we can go with it (and this can be easily demonstrated to everyone just by running those tests).

Ideas?
------- Original Message ------- On Monday, May 22nd, 2023 at 9:14 AM, Maurizio Cimadamore wrote: > On 22/05/2023 04:12, Rel wrote: > >>> But I believe some more robust >>> analysis should be made to understand exactly how many APIs can be >>> supported in this "simple" fashion. >> >> Yes, I started to gather such analysis here >> https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_KlWNYiU$ >> Currently there is only one happy case [ >> https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_K7Nl50c$ >> ] which is Point2d class from your foo.hpp file. > > This is not too surprising - after all the hacky changes I shared were built around that example. > > What I meant for "robust analysis" was to try and establish how many _real-world_ C++ library can really be tackled in such a direct approach. My feeling is "not many" - but I don't have any hard data to back up this claim. > > Maurizio -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From maurizio.cimadamore at oracle.com Tue May 23 08:58:19 2023 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Tue, 23 May 2023 09:58:19 +0100 Subject: jextract C++ support In-Reply-To: <1ru-2D5hsy2KC3KU3GHuxq3ST_SzlZfAuiMjdPjex4aOmVVAD8oiJWPJ2egkRChVNONNGGgGSEfJxGQ-ZAbQaJ_8OzP9_z6LtOExpVbvU-0=@proton.me> References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com> <1ru-2D5hsy2KC3KU3GHuxq3ST_SzlZfAuiMjdPjex4aOmVVAD8oiJWPJ2egkRChVNONNGGgGSEfJxGQ-ZAbQaJ_8OzP9_z6LtOExpVbvU-0=@proton.me> Message-ID: <1997fa0d-2e51-db7c-1cd3-b6898cf31b62@oracle.com> On 23/05/2023 05:11, Rel wrote: > > What I meant for "robust analysis" was to try and establish how many _real-world_ C++ library can really be tackled in > such a direct approach. > > Ohh I see now, I am affraid we know the answer for this :) > > Let's imagine if number of C++ libraries which can be covered > end-to-end with "simple" approach is 0, does it mean that we should > discard it and only focus on shim for binding all kinds of APIs? What > about those cases which can be easily extracted using "simple" > approach, like Point2d? Perhaps we should reach out to the Rust community? Their binding generator adopts the same simple approach as the one I showed in the patch. Given how hard it is to support C++ (because the underlying libclang C API is not very solid in that respect), I'd be surprised if they maintained all the necessary code just for stuff like Point2d? > > Because I thought that we would like to do "analysis" of what C++ use > cases can/cannot be covered with "simple" approach. For example from > your previous message I see that we are not completely sure about > exceptions: > > > * (probably way more stuff, like exceptions, etc.) > > Similarly for myself I would like to see what are the problems with > "dynamic dispatch". 
I added it to "unhappy" for now, because we expect > it not to work, but I plan to test it to see what are the issues there > and share here. Similarly with that anyone would be able to reproduce > and see same results. > > I guess my question now is: do we think it may be useful to know > exactly how many C++ use cases can be covered with "simple" FFM > approach. And if answer is yes, then we can use panamaexperiments as a > playground where we can have tests for what is covered. This > (possibly?) can give us more confidence in limitations of "simple" > approach and how far we can go with it (and this can be easily > demonstrated to everyone just by running those tests) I think it would be useful to know some answer to that question, yes. My intuition tells me that there are probably two kinds of C++ libraries: those who were born that way, and those that moved over from being simpler C libraries. One such example in the latter category is OpenCV [1]. While its "core" header [2] declares an exception, as well as a bunch of classes, eyeballing it, it doesn't seem "too" problematic? Perhaps that would be a good point where to start, and, if a library such as that can be used with some degree of success, perhaps we can expand the search to other similar libraries. [1] - https://opencv.org/ [2] - https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core.hpp > > Ideas? > > ------- Original Message ------- > On Monday, May 22nd, 2023 at 9:14 AM, Maurizio Cimadamore > wrote: > >> >> On 22/05/2023 04:12, Rel wrote: >>>> But I believe some more robust >>>> analysis should be made to understand exactly how many APIs can be >>>> supported in this "simple" fashion. 
>>> Yes, I started to gather such analysis herehttps://urldefense.com/v3/__https://github.com/enatai/panamaexperiments__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_KlWNYiU$ >>> Currently there is only one happy case [https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_K7Nl50c$ ] which is Point2d class from your foo.hpp file. >> >> This is not too surprising - after all the hacky changes I shared >> were built around that example. >> >> What I meant for "robust analysis" was to try and establish how many >> _real-world_ C++ library can really be tackled in such a direct >> approach. My feeling is "not many" - but I don't have any hard data >> to back up this claim. >> >> Maurizio >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcimadamore at openjdk.org Wed May 24 15:53:37 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 24 May 2023 15:53:37 GMT Subject: RFR: 7903475: Jextract should use new byte-based layout methods Message-ID: This PR tweaks jextract to work with the API changes in the layout API introduced in https://github.com/openjdk/jdk/pull/14013. 
------------- Commit messages: - Initial push Changes: https://git.openjdk.org/jextract/pull/120/files Webrev: https://webrevs.openjdk.org/?repo=jextract&pr=120&range=00 Issue: https://bugs.openjdk.org/browse/CODETOOLS-7903475 Stats: 97 lines in 28 files changed: 3 ins; 0 del; 94 mod Patch: https://git.openjdk.org/jextract/pull/120.diff Fetch: git fetch https://git.openjdk.org/jextract.git pull/120/head:pull/120 PR: https://git.openjdk.org/jextract/pull/120 From mcimadamore at openjdk.org Wed May 24 15:53:39 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Wed, 24 May 2023 15:53:39 GMT Subject: RFR: 7903475: Jextract should use new byte-based layout methods In-Reply-To: References: Message-ID: On Wed, 24 May 2023 15:45:03 GMT, Maurizio Cimadamore wrote: > This PR tweaks jextract to work with the API changes in the layout API introduced in https://github.com/openjdk/jdk/pull/14013. src/main/java/org/openjdk/jextract/clang/libclang/CXString.java line 48: > 46: Constants$root.C_POINTER$LAYOUT.withName("data"), > 47: Constants$root.C_INT$LAYOUT.withName("private_flags"), > 48: MemoryLayout.paddingLayout(4) Note: all the changes in this folder are to jextract generated files. 
------------- PR Review Comment: https://git.openjdk.org/jextract/pull/120#discussion_r1204422148 From enatai at proton.me Sun May 28 23:20:56 2023 From: enatai at proton.me (Rel) Date: Sun, 28 May 2023 23:20:56 +0000 Subject: jextract C++ support In-Reply-To: <1997fa0d-2e51-db7c-1cd3-b6898cf31b62@oracle.com> References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com> <1ru-2D5hsy2KC3KU3GHuxq3ST_SzlZfAuiMjdPjex4aOmVVAD8oiJWPJ2egkRChVNONNGGgGSEfJxGQ-ZAbQaJ_8OzP9_z6LtOExpVbvU-0=@proton.me> <1997fa0d-2e51-db7c-1cd3-b6898cf31b62@oracle.com> Message-ID: dynamic dispatch I tried following example [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp] and it works fine as long as we generate proper Java bindings. See test for it [https://github.com/enatai/panamaexperiments/blob/main/cppexperiments/src/test/java/cppexperiments/HappyTests.java#L36] Please let me know which dynamic dispatch use cases you are concerned with. Because this one seems works fine. std::string I totally forgot that such basic type like std::string in C++ is a template. But it seems possible to call functions which operate with string objects because symbols for them are present: 00000000000013ad T _ZN7unhappy10helloWorldB5cxx11Ev std::string helloWorld(); I guess it is possible to create/extract layout for std::string using FFM but: - how to initialize this layout from Java? we cannot just call std::string constructor for it, right? - this layout may differ between different C++ runtimes (libstdc++ etc). MS C++ may have not same std::string layout as GCC > Their binding generator adopts the same simple approach as the one I showed in the patch. 
I will take a look ------- Original Message ------- On Tuesday, May 23rd, 2023 at 8:58 AM, Maurizio Cimadamore wrote: > On 23/05/2023 05:11, Rel wrote: > >>> What I meant for "robust analysis" was to try and establish how many _real-world_ C++ library can really be tackled in such a direct approach. >> >> Ohh I see now, I am affraid we know the answer for this :) >> >> Let's imagine if number of C++ libraries which can be covered end-to-end with "simple" approach is 0, does it mean that we should discard it and only focus on shim for binding all kinds of APIs? What about those cases which can be easily extracted using "simple" approach, like Point2d? > > Perhaps we should reach out to the Rust community? Their binding generator adopts the same simple approach as the one I showed in the patch. Given how hard it is to support C++ (because the underlying libclang C API is not very solid in that respect), I'd be surprised if they maintained all the necessary code just for stuff like Point2d? > >> Because I thought that we would like to do "analysis" of what C++ use cases can/cannot be covered with "simple" approach. For example from your previous message I see that we are not completely sure about exceptions: >> >>> * (probably way more stuff, like exceptions, etc.) >> >> Similarly for myself I would like to see what are the problems with "dynamic dispatch". I added it to "unhappy" for now, because we expect it not to work, but I plan to test it to see what are the issues there and share here. Similarly with that anyone would be able to reproduce and see same results. >> >> I guess my question now is: do we think it may be useful to know exactly how many C++ use cases can be covered with "simple" FFM approach. And if answer is yes, then we can use panamaexperiments as a playground where we can have tests for what is covered. This (possibly?) 
can give us more confidence in limitations of "simple" approach and how far we can go with it (and this can be easily demonstrated to everyone just by running those tests) > > I think it would be useful to know some answer to that question, yes. My intuition tells me that there are probably two kinds of C++ libraries: those who were born that way, and those that moved over from being simpler C libraries. One such example in the latter category is OpenCV [1]. While its "core" header [2] declares an exception, as well as a bunch of classes, eyeballing it, it doesn't seem "too" problematic? Perhaps that would be a good point where to start, and, if a library such as that can be used with some degree of success, perhaps we can expand the search to other similar libraries. > > [1] - https://opencv.org/ > [2] - https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core.hpp > >> Ideas? >> >> ------- Original Message ------- >> On Monday, May 22nd, 2023 at 9:14 AM, Maurizio Cimadamore [](mailto:maurizio.cimadamore at oracle.com) wrote: >> >>> On 22/05/2023 04:12, Rel wrote: >>> >>>>> But I believe some more robust >>>>> analysis should be made to understand exactly how many APIs can be >>>>> supported in this "simple" fashion. >>>> >>>> Yes, I started to gather such analysis here >>>> https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_KlWNYiU$ >>>> Currently there is only one happy case [ >>>> https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_K7Nl50c$ >>>> ] which is Point2d class from your foo.hpp file. >>> >>> This is not too surprising - after all the hacky changes I shared were built around that example. 
>>> >>> What I meant for "robust analysis" was to try and establish how many _real-world_ C++ library can really be tackled in such a direct approach. My feeling is "not many" - but I don't have any hard data to back up this claim. >>> >>> Maurizio -------------- next part -------------- An HTML attachment was scrubbed... URL: From maurizio.cimadamore at oracle.com Mon May 29 09:00:34 2023 From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore) Date: Mon, 29 May 2023 10:00:34 +0100 Subject: jextract C++ support In-Reply-To: References: <837c7a24-4018-6d91-88f4-22312f19e2c3@oracle.com> <9997b556-0cd9-b390-e110-bf37cdddca39@oracle.com> <1ru-2D5hsy2KC3KU3GHuxq3ST_SzlZfAuiMjdPjex4aOmVVAD8oiJWPJ2egkRChVNONNGGgGSEfJxGQ-ZAbQaJ_8OzP9_z6LtOExpVbvU-0=@proton.me> <1997fa0d-2e51-db7c-1cd3-b6898cf31b62@oracle.com> Message-ID: <8797d776-b75b-e813-ad61-9b95dcf3b9dc@oracle.com> On 29/05/2023 00:20, Rel wrote: > dynamic dispatch > > I tried following example > [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp > ] > and it works fine as long as we generate proper Java bindings. > > See test for it > [https://github.com/enatai/panamaexperiments/blob/main/cppexperiments/src/test/java/cppexperiments/HappyTests.java#L36 > ] > > Please let me know which dynamic dispatch use cases you are concerned > with. Because this one seems works fine. Sorry, I can see that working fine because you declare a "static" function which accepts a point, so the vtable indirection is generated by the CPP compiler in that function. What I'm worried about is calling virtual methods on classes. E.g. calling your "distance" function directly. Jextract gives you two possibilities: Point2d::distance and Point3d::distance. If you pass a Point3d object to Point2d::distance you will "only" get Point2d::distance to be called (as if there was no dynamic dispatch). > > std::string > I totally forgot that such basic type like std::string in C++ is a > template. 
> But it seems possible to call functions which operate with string > objects because symbols for them are present: > > 00000000000013ad T _ZN7unhappy10helloWorldB5cxx11Ev > > std::string helloWorld(); > > I guess it is possible to create/extract layout for std::string using > FFM but: > - how to initialize this layout from Java? we cannot just call > std::string constructor for it, right? > - this layout may differ between different C++ runtimes (libstdc++ > etc). MS C++ may have not same std::string layout as GCC On the latter, e.g. layout difference, this is no different than anything else with jextract. E.g. each jextract run is platform-dependent, as it pulls in header files that are heavily influenced by the platform and OS you run on. If I understand correctly, "string" is the "instantiation" of a template in C++. (e.g. some basic_string). That instantiation is fully defined (e.g. not partial), and I believe it should be possible, with libclang, to obtain more information about it - such as the layout etc. (for partial template instantiation, my understanding, reading on what Rust bindgen does is that it is not possible to handle them with libclang). So, ideally, we should be able to construct a layout for basic_string, and then pass that to the constructor, yes. Maurizio > > > Their binding generator adopts the same simple approach as the one I showed in the patch. > > I will take a look > > > ------- Original Message ------- > On Tuesday, May 23rd, 2023 at 8:58 AM, Maurizio Cimadamore > wrote: > >> >> On 23/05/2023 05:11, Rel wrote: >>> > What I meant for "robust analysis" was to try and establish how many _real-world_ C++ >>> library can really be tackled in such a direct approach. >>> >>> Ohh I see now, I am affraid we know the answer for this :) >>> >>> Let's imagine if number of C++ libraries which can be covered >>> end-to-end with "simple" approach is 0, does it mean that we should >>> discard it and only focus on shim for binding all kinds of APIs? 
>>> What about those cases which can be easily extracted using "simple" >>> approach, like Point2d? >> Perhaps we should reach out to the Rust community? Their binding >> generator adopts the same simple approach as the one I showed in the >> patch. Given how hard it is to support C++ (because the underlying >> libclang C API is not very solid in that respect), I'd be surprised >> if they maintained all the necessary code just for stuff like Point2d? >>> >>> Because I thought that we would like to do "analysis" of what C++ >>> use cases can/cannot be covered with "simple" approach. For example >>> from your previous message I see that we are not completely sure >>> about exceptions: >>> >>> > * (probably way more stuff, like exceptions, etc.) >>> >>> Similarly for myself I would like to see what are the problems with >>> "dynamic dispatch". I added it to "unhappy" for now, because we >>> expect it not to work, but I plan to test it to see what are the >>> issues there and share here. Similarly with that anyone would be >>> able to reproduce and see same results. >>> >>> I guess my question now is: do we think it may be useful to know >>> exactly how many C++ use cases can be covered with "simple" FFM >>> approach. And if answer is yes, then we can use panamaexperiments as >>> a playground where we can have tests for what is covered. This >>> (possibly?) can give us more confidence in limitations of "simple" >>> approach and how far we can go with it (and this can be easily >>> demonstrated to everyone just by running those tests) >> >> I think it would be useful to know some answer to that question, yes. >> My intuition tells me that there are probably two kinds of C++ >> libraries: those who were born that way, and those that moved over >> from being simpler C libraries. One such example in the latter >> category is OpenCV [1]. While its "core" header [2] declares an >> exception, as well as a bunch of classes, eyeballing it, it doesn't >> seem "too" problematic? 
Perhaps that would be a good point where to >> start, and, if a library such as that can be used with some degree of >> success, perhaps we can expand the search to other similar libraries. >> >> [1] - https://opencv.org/ >> [2] - >> https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core.hpp >> >>> >>> Ideas? >>> >>> ------- Original Message ------- >>> On Monday, May 22nd, 2023 at 9:14 AM, Maurizio Cimadamore >>> wrote: >>> >>>> >>>> On 22/05/2023 04:12, Rel wrote: >>>>>> But I believe some more robust >>>>>> analysis should be made to understand exactly how many APIs can be >>>>>> supported in this "simple" fashion. >>>>> Yes, I started to gather such analysis herehttps://urldefense.com/v3/__https://github.com/enatai/panamaexperiments__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_KlWNYiU$ >>>>> Currently there is only one happy case [https://urldefense.com/v3/__https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp__;!!ACWV5N9M2RV99hQ!KOxdK2qmmSzlaaO3kSlSUG2-ifWAVRD6OHlz9NHYvuggmy7NnnNxWvHYcxDm0Vn4gPXlbasjfC-ehIx_K7Nl50c$ ] which is Point2d class from your foo.hpp file. >>>> >>>> This is not too surprising - after all the hacky changes I shared >>>> were built around that example. >>>> >>>> What I meant for "robust analysis" was to try and establish how >>>> many _real-world_ C++ library can really be tackled in such a >>>> direct approach. My feeling is "not many" - but I don't have any >>>> hard data to back up this claim. >>>> >>>> Maurizio >>>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... 
From jvernee at openjdk.org Tue May 30 11:44:19 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 30 May 2023 11:44:19 GMT Subject: RFR: 7903475: Jextract should use new byte-based layout methods In-Reply-To: References: Message-ID: On Wed, 24 May 2023 15:45:03 GMT, Maurizio Cimadamore wrote: > This PR tweaks jextract to work with the API changes in the layout API introduced in https://github.com/openjdk/jdk/pull/14013. Marked as reviewed by jvernee (Committer). ------------- PR Review: https://git.openjdk.org/jextract/pull/120#pullrequestreview-1450738329 From mcimadamore at openjdk.org Tue May 30 11:44:19 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Tue, 30 May 2023 11:44:19 GMT Subject: Integrated: 7903475: Jextract should use new byte-based layout methods In-Reply-To: References: Message-ID: On Wed, 24 May 2023 15:45:03 GMT, Maurizio Cimadamore wrote: > This PR tweaks jextract to work with the API changes in the layout API introduced in https://github.com/openjdk/jdk/pull/14013. This pull request has now been integrated. Changeset: 3f41267c Author: Maurizio Cimadamore URL: https://git.openjdk.org/jextract/commit/3f41267c654f3736b4595ab79f740d8b626869a9 Stats: 97 lines in 28 files changed: 3 ins; 0 del; 94 mod 7903475: Jextract should use new byte-based layout methods Reviewed-by: jvernee ------------- PR: https://git.openjdk.org/jextract/pull/120 From mcimadamore at openjdk.org Tue May 30 11:57:30 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Tue, 30 May 2023 11:57:30 GMT Subject: RFR: 7903481: Jextract doesn't enforce group layout alignment correctly in some cases Message-ID: This patch overhauls the treatment of pragma pack directives. The current logic tries to detect fields occurring at misaligned offsets, and relaxes alignment constraints for these fields. However, in cases like this: #pragma pack(push, 1) struct A { long long a; int b; } Each field is correctly aligned. 
But the struct size (12) is not a multiple of its natural alignment (8). As a result, we run into issues when building a sequence layout out of this struct, because of the eager checks added to the layout API. This patch fixes support for packed structs "the right way", that is, by asking clang for the struct/union alignment, and then making sure that any field is aligned accordingly before the group layout is created. ------------- Commit messages: - Initial push Changes: https://git.openjdk.org/jextract/pull/121/files Webrev: https://webrevs.openjdk.org/?repo=jextract&pr=121&range=00 Issue: https://bugs.openjdk.org/browse/CODETOOLS-7903481 Stats: 105 lines in 9 files changed: 79 ins; 6 del; 20 mod Patch: https://git.openjdk.org/jextract/pull/121.diff Fetch: git fetch https://git.openjdk.org/jextract.git pull/121/head:pull/121 PR: https://git.openjdk.org/jextract/pull/121 From mcimadamore at openjdk.org Tue May 30 11:57:30 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Tue, 30 May 2023 11:57:30 GMT Subject: RFR: 7903481: Jextract doesn't enforce group layout alignment correctly in some cases In-Reply-To: References: Message-ID: On Tue, 30 May 2023 11:49:05 GMT, Maurizio Cimadamore wrote: > This patch overhauls the treatment of pragma pack directives. > > The current logic tries to detect fields occurring at misaligned offsets, and relaxes alignment constraints for these fields. > However, in cases like this: > > > #pragma pack(push, 1) > struct A { > long long a; > int b; > } > > > Each field is correctly aligned. But the struct size (12) is not a multiple of its natural alignment (8). As a result, we run into issues when building a sequence layout out of this struct, because of the eager checks added to the layout API. > > This patch fixes support for packed structs "the right way", that is, by asking clang for the struct/union alignment, and then making sure that any field is aligned accordingly before the group layout is created. 
Note, even with this fix, windows.h does not extract correctly with the latest `panama` branch. This seems to be caused by a bug in the logic with which we visit record fields - either because of a bug in jextract, or because of an issue in libclang itself. Essentially, it is possible for jextract to see struct fields in the "wrong order", which then completely messes up our offset computation logic. More investigation is required to fix that case (esp. to understand if that's a latent issue in jextract code). This issue typically manifests with an exception when a struct layout is created (as the created struct layout has a size that does not conform to the size reported by libclang). src/main/java/org/openjdk/jextract/impl/TypeImpl.java line 376: > 374: try { > 375: return Optional.of(getLayoutInternal(t)); > 376: } catch (UnsupportedOperationException ex) { These changes are not strictly necessary, but I don't like that we swallow important exceptions and return empty optionals instead. ------------- PR Comment: https://git.openjdk.org/jextract/pull/121#issuecomment-1568300381 PR Review Comment: https://git.openjdk.org/jextract/pull/121#discussion_r1210157420 From jvernee at openjdk.org Tue May 30 12:04:17 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 30 May 2023 12:04:17 GMT Subject: RFR: 7903481: Jextract doesn't enforce group layout alignment correctly in some cases In-Reply-To: References: Message-ID: On Tue, 30 May 2023 11:49:05 GMT, Maurizio Cimadamore wrote: > This patch overhauls the treatment of pragma pack directives. > > The current logic tries to detect fields occurring at misaligned offsets, and relaxes alignment constraints for these fields. > However, in cases like this: > > > #pragma pack(push, 1) > struct A { > long long a; > int b; > } > > > Each field is correctly aligned. But the struct size (12) is not a multiple of its natural alignment (8). 
As a result, we run into issues when building a sequence layout out of this struct, because of the eager checks added to the layout API. > > This patch fixes support for packed structs "the right way", that is, by asking clang for the struct/union alignment, and then making sure that any field is aligned accordingly before the group layout is created. Marked as reviewed by jvernee (Committer). ------------- PR Review: https://git.openjdk.org/jextract/pull/121#pullrequestreview-1450774885 From mcimadamore at openjdk.org Tue May 30 14:41:27 2023 From: mcimadamore at openjdk.org (Maurizio Cimadamore) Date: Tue, 30 May 2023 14:41:27 GMT Subject: Integrated: 7903481: Jextract doesn't enforce group layout alignment correctly in some cases In-Reply-To: References: Message-ID: On Tue, 30 May 2023 11:49:05 GMT, Maurizio Cimadamore wrote: > This patch overhauls the treatment of pragma pack directives. > > The current logic tries to detect fields occurring at misaligned offsets, and relaxes alignment constraints for these fields. > However, in cases like this: > > > #pragma pack(push, 1) > struct A { > long long a; > int b; > } > > > Each field is correctly aligned. But the struct size (12) is not a multiple of its natural alignment (8). As a result, we run into issues when building a sequence layout out of this struct, because of the eager checks added to the layout API. > > This patch fixes support for packed structs "the right way", that is, by asking clang for the struct/union alignment, and then making sure that any field is aligned accordingly before the group layout is created. This pull request has now been integrated. Changeset: c3deba2d Author: Maurizio Cimadamore URL: https://git.openjdk.org/jextract/commit/c3deba2d52f5d57e3c616fedc324d12cf545a77e Stats: 105 lines in 9 files changed: 79 ins; 6 del; 20 mod 7903481: Jextract doesn't enforce group layout alignment correctly in some cases Reviewed-by: jvernee ------------- PR: https://git.openjdk.org/jextract/pull/121