Namespace for inner classes in jextract

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Mar 11 15:39:59 UTC 2021


I've captured this discussion, along with a concrete strawman proposal here:

https://bugs.openjdk.java.net/browse/JDK-8260976

Thanks
Maurizio

On 11/03/2021 15:05, Maurizio Cimadamore wrote:
> Interesting - I was looking at the documentation of Rust's bindgen - 
> which provides an option to "blocklist" a symbol:
>
> https://rust-lang.github.io/rust-bindgen/blocklisting.html
>
> Turns out, there is a real use case for generating everything but 
> leaving out some bits (even if those bits are referred to by other 
> symbols) - and that is to allow clients to define "their own version" 
> of a given binding.
>
> This is an interesting observation, which I think moves the design 
> space towards a "warn but generate" solution (rather than to avoid 
> generating symbols that depend on the symbol that is not imported).
>
> As a validation, I also looked at other bindgen options, such as 
> allowlisting:
>
> https://rust-lang.github.io/rust-bindgen/allowlisting.html
>
> Which works similarly to what the proposed --sym does - except that it 
> also pulls in all the dependencies. This seems a bit redundant - that 
> is, if a client of jextract wants to be in 100% control of what gets 
> generated and opts for --sym, then, instead of having an option for 
> excluding a symbol, I think it's better to simply omit the --sym. Of 
> course there might be cases where the users want to extract things as 
> normal (e.g. no --sym), _except for a specific type_ - but I think 
> saying that, if you have this requirement, you fall in the "advanced 
> filtering bucket" is likely to be a good 80/20 approximation.
>
> As expected and noted in one of my previous emails, allowlisting 
> allows different kinds of symbols to be included 
> (functions, types and vars). I think the reality of the messiness of 
> C's namespace makes this probably a must (we have already seen enough 
> cases of "struct has same name as function" etc. - but I might be 
> suffering from name clash PTSD :-) ).
>
> Maurizio
>
>
>
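
To illustrate why the kind of symbol probably has to be part of the filter, here is a minimal sketch in Java. It is an assumption-laden illustration, not jextract code: the class KindFilterSketch, the Kind enum and the allow/included methods are all hypothetical names. The only fact it relies on is that C keeps struct tags and ordinary identifiers in separate name spaces, so "struct stat" and the function "stat" can coexist in one header.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from jextract or bindgen.
class KindFilterSketch {

    // The three kinds of symbols mentioned above.
    enum Kind { FUNCTION, TYPE, VAR }

    // One allowlist per kind, so equal names of different kinds never collide.
    private final Map<Kind, Set<String>> allowed = new EnumMap<>(Kind.class);

    // Registers the names to keep for one kind (replaces any earlier list for that kind).
    KindFilterSketch allow(Kind kind, String... names) {
        allowed.put(kind, Set.of(names));
        return this;
    }

    // A symbol is generated only if its (kind, name) pair was allowlisted.
    boolean included(Kind kind, String name) {
        return allowed.getOrDefault(kind, Set.of()).contains(name);
    }

    public static void main(String[] args) {
        // Keep the function stat() but not struct stat.
        KindFilterSketch filter = new KindFilterSketch().allow(Kind.FUNCTION, "stat");
        System.out.println(filter.included(Kind.FUNCTION, "stat"));
        System.out.println(filter.included(Kind.TYPE, "stat"));
    }
}
```

A filter keyed on the name alone could not express "the function but not the struct", which is the kind of clash mentioned above.
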
> On 11/03/2021 14:26, Maurizio Cimadamore wrote:
>> And (as I'm recalling all the things we went through with previous 
>> incarnations of similar ideas) - dependencies.
>>
>> You could have:
>>
>> typedef struct Foo Bar;
>>
>> What happens if you --sym Bar but not Foo?
>>
>> Previously, we had a dependency tracking system which tried to 
>> pull in as many symbols as required (e.g. if you wanted function 
>> "baz" it would pull in everything required by "baz" so that the 
>> binding would be correct). This is theoretically possible, but adds 
>> a lot of complexity to how jextract works.
>>
>> At the very least we would need to allow parsing the header into the 
>> intermediate IR, with some "dummy" nodes corresponding to the symbols 
>> that are not wanted - and then later on, do a pass to prune the IR so 
>> that any node that depends, directly or indirectly, on a pruned symbol 
>> is removed too (and a warning printed about what's happening).
>>
>> Maurizio
>>
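
To make the prune pass described above a bit more concrete, here is a minimal sketch in Java. It is not how jextract is implemented: the class PruneSketch, the method prune, the function "qux" and the idea of modelling the IR as a plain map from a declaration name to the names it refers to are all assumptions made for illustration. It only shows the shape of the algorithm: start from the unwanted "dummy" declarations, then repeatedly drop anything that depends on something already dropped, warning as it goes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from jextract itself.
class PruneSketch {

    // deps maps each declaration name to the names it refers to directly.
    // excluded holds the "dummy" declarations the user did not ask for.
    // Returns every declaration that must be dropped, directly or indirectly.
    static Set<String> prune(Map<String, List<String>> deps, Set<String> excluded) {
        Set<String> removed = new LinkedHashSet<>(excluded);
        Deque<String> work = new ArrayDeque<>(excluded);
        while (!work.isEmpty()) {
            String gone = work.pop();
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                // Drop a declaration as soon as one of its dependencies is dropped.
                if (e.getValue().contains(gone) && removed.add(e.getKey())) {
                    System.err.println("warning: dropping " + e.getKey()
                            + " because it depends on " + gone);
                    work.push(e.getKey());
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Models: typedef struct Foo Bar;  void baz(Bar *b);  int qux(void);
        Map<String, List<String>> deps = Map.of(
                "Foo", List.of(),
                "Bar", List.of("Foo"),
                "baz", List.of("Bar"),
                "qux", List.of());
        System.out.println(prune(deps, Set.of("Foo")));
    }
}
```

With the typedef example, excluding Foo takes Bar and baz with it and leaves qux alone. The inner scan over all declarations is quadratic; a real implementation would more likely keep a reverse-dependency index.
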
>> On 11/03/2021 13:55, Duncan Gittins wrote:
>>> On 11/03/2021 12:19, Maurizio Cimadamore wrote:
>>>>
>>>> On 11/03/2021 10:01, Duncan Gittins wrote:
>>>>> In my view all that is missing is command line options for 
>>>>> multiple "--sym symbol_to_keep" to pick a name to keep in the 
>>>>> generation process, and "--writeconfig myconfig.new" to write 
>>>>> current run options to file (for editing / split / re-submit) => a 
>>>>> neat formatted list of all params used by jextract PLUS list of 
>>>>> all symbols in the run. 
>>>>
>>>> This is a simple and nice idea - but it doesn't deal with your 
>>>> desire to give custom names to extracted symbols (which you brought 
>>>> up earlier).
>>>
>>> The filter improvement / --sym equivalent is a "must" because the 
>>> effect of not doing it is immediately noticeable by a larger 
>>> proportion of those developers that try out jextract on Windows 
>>> headers.
>>>
>>> Whereas the need for custom names of duplicate case-switched types 
>>> affects a small proportion of cases, so I'd rate it as 
>>> "should/could" (not essential in first release). There is a 
>>> workaround for such rarer cases by inlining the struct/layout into 
>>> the application code as raw foreign API calls - should the developer 
>>> want to.
>>>
>>>> There's some validation to do when it comes to duplicate names - 
>>>> for instance, struct names can clash with typedef names, etc. which 
>>>> probably suggests that the syntax of the option will have to be 
>>>> more convoluted.
>>>
>>> Yes and no. It doesn't change the situation that is already present 
>>> with --filter for 2 headers that pull out types with a clash; --sym 
>>> is just a finer degree of control of the in or out decision.
>>>
>>>>
>>>> Maurizio
>>>>
>>>>
>>>>
>>>>
>>>


More information about the panama-dev mailing list