Namespace for inner classes in jextract

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Mar 11 15:05:52 UTC 2021


Interesting - I was looking at the documentation of Rust's bindgen - 
which provides an option to "blocklist" a symbol:

https://rust-lang.github.io/rust-bindgen/blocklisting.html

Turns out, there is a real use case for generating everything but leaving 
out some bits (even if those bits are referred to by other symbols) - and 
that is to allow clients to define "their own version" of a given binding.

This is an interesting observation, which I think moves the design space 
towards a "warn but generate" solution (rather than to avoid generating 
symbols that depend on the symbol that is not imported).
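
To make the two alternatives concrete, here is a rough sketch (not 
jextract code). It assumes a made-up model where each declaration name 
maps to the names it refers to directly; ExclusionSketch, affected and 
generate are hypothetical names. It computes everything that depends, 
directly or indirectly, on an excluded symbol, and then either keeps 
those declarations with a warning or drops them, as in the pruning pass 
from my earlier message quoted below:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ExclusionSketch {

    // Hypothetical model: declaration name -> names it refers to directly.
    // Returns the non-excluded declarations that reach an excluded symbol.
    static Set<String> affected(Map<String, Set<String>> deps, Set<String> excluded) {
        Set<String> result = new TreeSet<>();
        boolean changed = true;
        while (changed) { // iterate until no new dependent is found
            changed = false;
            for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
                String name = e.getKey();
                if (excluded.contains(name) || result.contains(name)) {
                    continue;
                }
                for (String d : e.getValue()) {
                    if (excluded.contains(d) || result.contains(d)) {
                        result.add(name);
                        changed = true;
                        break;
                    }
                }
            }
        }
        return result;
    }

    // prune == false: "warn but generate"; prune == true: drop dependents too.
    static List<String> generate(Map<String, Set<String>> deps, Set<String> excluded,
                                 boolean prune, List<String> warnings) {
        Set<String> affected = affected(deps, excluded);
        List<String> out = new ArrayList<>();
        for (String name : new TreeSet<>(deps.keySet())) {
            if (excluded.contains(name)) {
                continue; // never generate the excluded symbol itself
            }
            if (affected.contains(name)) {
                warnings.add("warning: " + name + " depends on an excluded symbol");
                if (prune) {
                    continue;
                }
            }
            out.add(name);
        }
        return out;
    }

    public static void main(String[] args) {
        // typedef struct Foo Bar; a function baz using Bar; an unrelated qux.
        Map<String, Set<String>> deps = Map.of(
                "Foo", Set.<String>of(),
                "Bar", Set.of("Foo"),
                "baz", Set.of("Bar"),
                "qux", Set.<String>of());
        List<String> warnings = new ArrayList<>();
        System.out.println(generate(deps, Set.of("Foo"), false, warnings));
        System.out.println(generate(deps, Set.of("Foo"), true, new ArrayList<>()));
        warnings.forEach(System.out::println);
    }
}
```

In this toy model, "warn but generate" still emits Bar and baz, so a 
client can supply its own Foo, whereas pruning leaves only qux.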

As a validation, I also looked at other bindgen options, such as 
allowlisting:

https://rust-lang.github.io/rust-bindgen/allowlisting.html

Which works similarly to what the proposed --sym does - except that it 
also pulls in all the dependencies. This seems a bit redundant - that 
is, if a client of jextract wants to be in 100% control of what gets 
generated and opts for --sym, then, instead of having an option for 
excluding a symbol, I think it's better to simply omit the --sym. Of 
course there might be cases where the users want to extract things as 
normal (e.g. no --sym), _except for a specific type_ - but I think 
saying that, if you have this requirement, you fall in the "advanced 
filtering bucket" is likely to be a good 80/20 approximation.

As expected and noted in one of my previous emails, allowlisting 
allows for different kinds of symbols to be included 
(functions, types and vars). I think the reality of the messiness of C's 
namespace makes this probably a must (we have already seen enough cases 
of "struct has same name as function" etc. - but I might be suffering 
from name clash PTSD :-) ).
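
As an illustration of why the kind matters, here is a small sketch of a 
filter keyed by both kind and name. The "kind:name" spelling and the 
KindFilterSketch class are made up for this example; this is not an 
existing or proposed jextract syntax:

```java
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KindFilterSketch {

    enum Kind { FUNCTION, TYPE, VAR }

    private final Map<Kind, Set<String>> allowed = new EnumMap<>(Kind.class);

    // Accepts entries in a made-up "kind:name" form, e.g. "function:stat".
    void allow(String spec) {
        int colon = spec.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("expected kind:name, got " + spec);
        }
        Kind kind = Kind.valueOf(spec.substring(0, colon).toUpperCase(Locale.ROOT));
        allowed.computeIfAbsent(kind, k -> new HashSet<>()).add(spec.substring(colon + 1));
    }

    // A symbol is included only if that name was allowed for that kind.
    boolean includes(Kind kind, String name) {
        return allowed.getOrDefault(kind, Set.of()).contains(name);
    }

    public static void main(String[] args) {
        KindFilterSketch filter = new KindFilterSketch();
        filter.allow("function:stat");
        // The function stat is selected, the struct of the same name is not.
        System.out.println(filter.includes(Kind.FUNCTION, "stat"));
        System.out.println(filter.includes(Kind.TYPE, "stat"));
    }
}
```

With a name-only filter, the function and the struct in this example 
could not be told apart.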

Maurizio



On 11/03/2021 14:26, Maurizio Cimadamore wrote:
> And, (as I'm recalling all the things we went through with previous 
> incarnation of similar ideas) - dependencies.
>
> you could have:
>
> typedef struct Foo Bar;
>
> What happens if you --sym Bar but not Foo?
>
> Previously, we had some dependency tracking system which tried to pull 
> in as many symbols as required (e.g. if you wanted function "baz" it 
> would pull in all stuff required by "baz" so that the binding would be 
> correct). This is theoretically possible, but adds a lot of complexity 
> to how jextract works.
>
> At the very least we would need to allow parsing the header into the 
> intermediate IR, with some "dummy" nodes corresponding to the symbols 
> that are not wanted - and then later on, do a pass to prune the IR so 
> that any node that depends, directly or indirectly on a pruned symbol 
> is removed too (and a warning printed about what's happening).
>
> Maurizio
>
> On 11/03/2021 13:55, Duncan Gittins wrote:
>> On 11/03/2021 12:19, Maurizio Cimadamore wrote:
>>>
>>> On 11/03/2021 10:01, Duncan Gittins wrote:
>>>> In my view all that is missing is command line options for multiple 
>>>> "--sym symbol_to_keep" to pick a name to keep in the generation 
>>>> process, and "--writeconfig myconfig.new" to write current run 
>>>> options to file (for editing / split / re-submit) => a neat 
>>>> formatted list of all params used by jextract PLUS list of all 
>>>> symbols in the run. 
>>>
>>> This is a simple and nice idea - but it doesn't deal with your 
>>> desire to give custom names to extracted symbols (which you brought 
>>> up earlier).
>>
>> The filter improvement / --sym equivalent is a "must" because the 
>> effect of not doing it is immediately noticeable by a larger 
>> proportion of those developers that try out jextract on Windows headers.
>>
>> Whereas the need for custom names of duplicate case-switched types 
>> affects a small proportion of cases, so I'd rate it as "should/could" 
>> (not essential in first release). There is a workaround for such 
>> rarer cases by inlining the struct/layout into the application code 
>> as raw foreign API calls - should the developer want to.
>>
>>> There's some validation to do when it comes to duplicate names - for 
>>> instance, struct names can clash with typedef names, etc. which 
>>> probably suggests that the syntax of the option will have to be more 
>>> convoluted.
>>
>> Yes and no. It doesn't change the situation that is already present 
>> with --filter for 2 headers that pull out types with a clash, --sym 
>> is just a finer degree of control of the in or out decision.
>>
>>>
>>> Maurizio
>>>
>>>
>>>
>>>
>>


More information about the panama-dev mailing list