jextract C++ support
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Aug 17 08:46:59 UTC 2023
Hi,
it seems like the binding generator is emitting bindings for private
string fields?
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/basic_string.h#L211
PhantomData, on the other hand, seems to be a Rust-specific marker type:
https://doc.rust-lang.org/stable/std/marker/struct.PhantomData.html
It is probably used to deal with the lifetime of the string's char array
(but this is a guess).
Maurizio
On 17/08/2023 05:14, Rel wrote:
> Hi,
>
> I started looking at rust-bindgen and did a few experiments here:
>
> https://github.com/enatai/panamaexperiments/blob/main/rust-bindgen/README.md
>
> My question is about what rust-bindgen generated for std::string.
>
> Most of those are ordinary data fields, for which jextract can probably
> generate a layout, but I don't really understand the purpose of:
>
> pub _phantom_0:
> ::std::marker::PhantomData<::std::cell::UnsafeCell<_CharT>>,
>
> and what the jextract alternative for it could be. Is it some special
> Rust feature?
> ------- Original Message -------
> On Monday, May 29th, 2023 at 9:00 AM, Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
>>
>> On 29/05/2023 00:20, Rel wrote:
>>> dynamic dispatch
>>>
>>> I tried following example
>>> [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp]
>>> and it works fine as long as we generate proper Java bindings.
>>>
>>> See test for it
>>> [https://github.com/enatai/panamaexperiments/blob/main/cppexperiments/src/test/java/cppexperiments/HappyTests.java#L36]
>>>
>>> Please let me know which dynamic dispatch use cases you are
>>> concerned about, because this one seems to work fine.
>>
>> Sorry, I can see that working fine because you declare a "static"
>> function which accepts a point, so the vtable indirection is
>> generated by the C++ compiler inside that function.
>>
>> What I'm worried about is calling virtual methods on classes. E.g.
>> calling your "distance" function directly. Jextract gives you two
>> possibilities: Point2d::distance and Point3d::distance. If you pass a
>> Point3d object to Point2d::distance you will "only" get
>> Point2d::distance to be called (as if there was no dynamic dispatch).
>>
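The difference described above can be reproduced in plain C++. The sketch below uses hypothetical stand-ins for Point2d and Point3d (the real classes live in happy.hpp and may differ); `viaVtable` and `viaSymbol` are names made up for this illustration. A qualified call such as `p.Point2d::distance()` skips the vtable, which is roughly what happens when a binding invokes the Point2d::distance symbol directly on a Point3d object.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins for the classes discussed in the thread.
struct Point2d {
    double x, y;
    Point2d(double x, double y) : x(x), y(y) {}
    virtual ~Point2d() = default;
    virtual double distance() const { return std::sqrt(x * x + y * y); }
};

struct Point3d : Point2d {
    double z;
    Point3d(double x, double y, double z) : Point2d(x, y), z(z) {}
    double distance() const override {
        return std::sqrt(x * x + y * y + z * z);
    }
};

// Ordinary virtual call: the compiler emits the vtable lookup here,
// just as it does inside a helper function that accepts a point.
double viaVtable(const Point2d& p) { return p.distance(); }

// Qualified call: no vtable lookup, Point2d's implementation always runs.
// This mimics calling the Point2d::distance symbol directly.
double viaSymbol(const Point2d& p) { return p.Point2d::distance(); }

// Small usage example: the same Point3d object gives two different answers.
void demoDispatch() {
    Point3d p(3.0, 4.0, 12.0);
    assert(viaVtable(p) == 13.0);  // Point3d::distance was selected
    assert(viaSymbol(p) == 5.0);   // z is ignored, Point2d::distance ran
}
```

A binding that wants real dynamic dispatch would therefore have to read the vtable itself or go through a compiled helper that performs the virtual call.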
>>
>>>
>>> std::string
>>> I totally forgot that such a basic type as std::string is a
>>> template in C++.
>>> But it seems possible to call functions which operate on string
>>> objects, because symbols for them are present:
>>>
>>> 00000000000013ad T _ZN7unhappy10helloWorldB5cxx11Ev
>>>
>>> std::string helloWorld();
>>>
>>> I guess it is possible to create/extract a layout for std::string
>>> using FFM, but:
>>> - how do we initialize this layout from Java? We cannot just call
>>> the std::string constructor for it, right?
>>> - this layout may differ between C++ runtimes (libstdc++
>>> etc.). MS C++ may not have the same std::string layout as GCC.
>>
>> On the latter, e.g. layout difference, this is no different than
>> anything else with jextract. E.g. each jextract run is
>> platform-dependent, as it pulls in header files that are heavily
>> influenced by the platform and OS you run on.
>>
>> If I understand correctly, "string" is the "instantiation" of a
>> template in C++ (e.g. some basic_string<char>). That instantiation
>> is fully defined (i.e. not partial), and I believe it should be
>> possible, with libclang, to obtain more information about it, such
>> as the layout etc. (For partial template instantiations, my
>> understanding, from reading about what Rust bindgen does, is that it
>> is not possible to handle them with libclang.)
>>
>> So, ideally, we should be able to construct a layout for
>> basic_string<char>, and then pass that to the constructor, yes.
>>
>> Maurizio
>>
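The idea of building a layout for basic_string<char> and handing it to the constructor has a direct analogue in plain C++: reserve raw storage with the right size and alignment, then run the constructor on it. The sketch below is only an illustration of that idea, not what jextract does; `kStringSize`, `kStringAlign` and `constructInRawStorage` are names invented here. The size and alignment values depend on the toolchain and standard library, which is the platform dependence mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Size and alignment of std::string for this compiler and standard library.
// These values are what a memory layout on the Java side would have to match.
constexpr std::size_t kStringSize = sizeof(std::string);
constexpr std::size_t kStringAlign = alignof(std::string);

// Constructs a std::string inside caller-style raw storage (placement new),
// reads its length, then runs the destructor explicitly.
std::size_t constructInRawStorage(const char* text) {
    alignas(std::string) unsigned char storage[sizeof(std::string)];
    std::string* s = new (storage) std::string(text);
    std::size_t length = s->size();
    s->~basic_string();  // raw storage is not destroyed automatically
    return length;
}

// Small usage example.
void demoRawStorage() {
    assert(constructInRawStorage("hello") == 5);
    assert(kStringSize >= sizeof(char*));  // must hold at least one pointer
    assert(kStringAlign >= 1);
}
```

From Java, the equivalent steps would be allocating a segment of that size and alignment and invoking the constructor and destructor symbols on it, assuming those symbols are exported by the library.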
>>
>>>
>>> > Their binding generator adopts the same simple approach as the one I showed in the patch.
>>>
>>> I will take a look
>>>
>>>
>>> ------- Original Message -------
>>> On Tuesday, May 23rd, 2023 at 8:58 AM, Maurizio Cimadamore
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>>>
>>>> On 23/05/2023 05:11, Rel wrote:
>>>>> > What I meant for "robust analysis" was to try and establish how many _real-world_
>>>>> > C++ libraries can really be tackled with such a direct approach.
>>>>>
>>>>> Ohh I see now, I am afraid we know the answer to this :)
>>>>>
>>>>> Let's imagine that the number of C++ libraries which can be covered
>>>>> end-to-end with the "simple" approach is 0. Does that mean that we
>>>>> should discard it and focus only on a shim for binding all kinds of
>>>>> APIs? What about those cases which can be easily extracted using
>>>>> the "simple" approach, like Point2d?
>>>> Perhaps we should reach out to the Rust community? Their binding
>>>> generator adopts the same simple approach as the one I showed in
>>>> the patch. Given how hard it is to support C++ (because the
>>>> underlying libclang C API is not very solid in that respect), I'd
>>>> be surprised if they maintained all the necessary code just for
>>>> stuff like Point2d?
>>>>>
>>>>> I ask because I thought that we would like to do an "analysis" of
>>>>> which C++ use cases can/cannot be covered with the "simple"
>>>>> approach. For example, from your previous message I see that we
>>>>> are not completely sure about exceptions:
>>>>>
>>>>> > * (probably way more stuff, like exceptions, etc.)
>>>>>
>>>>> Similarly, I would like to see for myself what the problems
>>>>> with "dynamic dispatch" are. I added it to "unhappy" for now, because
>>>>> we expect it not to work, but I plan to test it to see what
>>>>> the issues are and share them here. That way anyone would
>>>>> be able to reproduce it and see the same results.
>>>>>
>>>>> I guess my question now is: do we think it may be useful to know
>>>>> exactly how many C++ use cases can be covered with the "simple" FFM
>>>>> approach? If the answer is yes, then we can use panamaexperiments
>>>>> as a playground where we can have tests for what is covered. This
>>>>> (possibly?) can give us more confidence about the limitations of
>>>>> the "simple" approach and how far we can go with it (and this can
>>>>> be easily demonstrated to everyone just by running those tests).
>>>>
>>>> I think it would be useful to know some answer to that question,
>>>> yes. My intuition tells me that there are probably two kinds of C++
>>>> libraries: those that were born that way, and those that moved over
>>>> from being simpler C libraries. One example in the latter
>>>> category is OpenCV [1]. While its "core" header [2] declares an
>>>> exception, as well as a bunch of classes, eyeballing it, it doesn't
>>>> seem "too" problematic? Perhaps that would be a good place to
>>>> start, and, if a library such as that can be used with some degree
>>>> of success, perhaps we can expand the search to other similar
>>>> libraries.
>>>>
>>>> [1] - https://opencv.org/
>>>> [2] -
>>>> https://github.com/opencv/opencv/blob/4.x/modules/core/include/opencv2/core.hpp
>>>>
>>>>>
>>>>> Ideas?
>>>>>
>>>>> ------- Original Message -------
>>>>> On Monday, May 22nd, 2023 at 9:14 AM, Maurizio Cimadamore
>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>>
>>>>>>
>>>>>> On 22/05/2023 04:12, Rel wrote:
>>>>>>>> But I believe some more robust
>>>>>>>> analysis should be made to understand exactly how many APIs can be
>>>>>>>> supported in this "simple" fashion.
>>>>>>> Yes, I started to gather such analysis here: https://github.com/enatai/panamaexperiments
>>>>>>> Currently there is only one happy case [https://github.com/enatai/panamaexperiments/blob/main/libcppexperiments/src/main/public/happy.hpp], which is the Point2d class from your foo.hpp file.
>>>>>>
>>>>>> This is not too surprising - after all the hacky changes I shared
>>>>>> were built around that example.
>>>>>>
>>>>>> What I meant by "robust analysis" was to try and establish how
>>>>>> many _real-world_ C++ libraries can really be tackled with such a
>>>>>> direct approach. My feeling is "not many" - but I don't have any
>>>>>> hard data to back up this claim.
>>>>>>
>>>>>> Maurizio
>>>>>>
>>>>>
>>>
>