[foreign] RFC: Additional jextract filtering
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Mar 12 17:34:20 UTC 2019
<snip>
> I think it's also important to keep the heuristic simple. Which
> hopefully also makes it easy and straightforward to manipulate. What I
> like about the current approach is that it's so straightforward. You
> give jextract a header file, and get back a complete binding. If you
> want to make it smaller you could add filters.
>
> I think some sort of automatic filtering (guess) would also need the
> ability to be turned off.
Yes on both points. On simplicity, I think it's important not just in
terms of implementation, but also in pedagogical terms (how hard is it
to explain what jextract does?)
> <snip>
> Well, you need more than just functions and global variables to use a
> library. In practice the header file gives what you need. We just need
> to find a good heuristic for filtering out the noise that comes with
> it. (I guess that problem also exists in the C/C++ world, maybe it's
> interesting to look at some solutions there?)
>
> Starting from the list of library symbols seems like an interesting
> idea to minimize the output, but imho the header file is the more
> trustworthy source to draw information from.
The problem with headers is that they include other headers, so all
header-based approaches will, at some point, have to ask: when do I stop
following dependencies?
>
> But, I think what we can definitely agree on is that a jextract
> transitively including a bunch of system headers in the output is
> undesirable.
Right - and again, while it might be simple to explain _why_ a certain
header has been pulled in, it could be totally surprising for a user to
see so many symbols being pulled in for even relatively simple libraries.
<snip>
> I don't feel so strongly about the dependency analysis. I think it
> falls short in too many cases, especially when we already have a
> hand-crafted set of dependencies, i.e. header files. I think the
> dependency analysis should really only be used to emit warnings or
> errors when things that are needed to generate a well-formed artifact
> are missing, and let the user decide how to deal with the problem.
> Though, before we have a good mechanism for including dependencies for
> jextract runs, it seems fine to automatically include the dependencies
> of the root set as well.
>
> The path-based filtering to determine a root set seems interesting to
> explore. It should be an overridable default imho. I'll continue
> exploring that.
Maybe we're using different terms - by dependency analysis I mean
finding some root set of symbols to extract, and then using some analysis
to pull in the symbols that will be required at runtime.
Here's a possible sketch:
1) collect all function symbols in a given shared library
2) collect the set of headers H defining the functions in (1)
3) for each header file h in H, repeat these steps until the set H is stable:
   b) for all function symbols, scan the signature of the function and
      pull any extra headers into H
   c) for all struct symbols, scan the struct field signatures and pull
      any extra headers into H
4) add all symbols in H to the result set (including macros, enums,
typedefs, ...)
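
The steps above can be sketched in Java as a small worklist computation. This is only an illustration under stated assumptions, not how jextract is implemented: the class HeaderClosure, its two methods, and the three input maps are hypothetical, and the scanning of function signatures and struct fields is collapsed into a single precomputed "referenced types" map.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model for illustration only; this is not jextract's API.
final class HeaderClosure {

    // Steps 1-3: seed H with the headers declaring the library's functions,
    // then keep pulling in headers of referenced types until H is stable.
    // declaringHeader: declared name -> header that declares it (assumed input)
    // referencedTypes: declared name -> type names used in its signature or
    //                  fields (assumed input)
    static Set<String> headerClosure(Set<String> librarySymbols,
                                     Map<String, String> declaringHeader,
                                     Map<String, List<String>> referencedTypes) {
        // Invert name -> header into header -> names declared there.
        Map<String, List<String>> namesByHeader = new HashMap<>();
        declaringHeader.forEach((name, header) ->
                namesByHeader.computeIfAbsent(header, k -> new ArrayList<>()).add(name));

        Set<String> headers = new TreeSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        for (String symbol : librarySymbols) {
            String header = declaringHeader.get(symbol);
            if (header != null && headers.add(header)) {
                worklist.add(header);
            }
        }
        // Each header is processed once; the loop ends when nothing new is added.
        while (!worklist.isEmpty()) {
            String header = worklist.remove();
            for (String name : namesByHeader.getOrDefault(header, List.of())) {
                for (String type : referencedTypes.getOrDefault(name, List.of())) {
                    String dependency = declaringHeader.get(type);
                    if (dependency != null && headers.add(dependency)) {
                        worklist.add(dependency);
                    }
                }
            }
        }
        return headers;
    }

    // Step 4: every name declared in a header of H goes into the result set.
    static Set<String> symbolsIn(Set<String> headers,
                                 Map<String, String> declaringHeader) {
        Set<String> result = new TreeSet<>();
        declaringHeader.forEach((name, header) -> {
            if (headers.contains(header)) {
                result.add(name);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        // Made-up example data: one library function and a few declarations.
        Map<String, String> declaringHeader = Map.of(
                "foo_open", "foo.h",
                "FOO_MAX", "foo.h",
                "foo_ctx", "foo_types.h",
                "size_t", "stddef.h",
                "printf", "stdio.h");
        Map<String, List<String>> referencedTypes = Map.of(
                "foo_open", List.of("foo_ctx"),
                "foo_ctx", List.of("size_t"));

        Set<String> headers =
                headerClosure(Set.of("foo_open"), declaringHeader, referencedTypes);
        System.out.println(headers);
        System.out.println(symbolsIn(headers, declaringHeader));
    }
}
```

In the made-up data, foo.h is the only seed header, foo_types.h and stddef.h are pulled in through the signature of foo_open and the fields of foo_ctx, and stdio.h stays out because nothing reachable refers to it. Note that FOO_MAX still ends up in the result, since step 4 takes everything from a header once that header is in H.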
What do you think?
Maurizio
> Jorn
>
>>
>> Maurizio
>>
>>
>>>
>>> Jorn
>>>
>>>> Maurizio
>>>>
>>>>>
>>>>> On the other hand, not everything makes sense to use from a Panama
>>>>> perspective, so we still need some escape hatch to filter out some
>>>>> stuff we can't use, or that breaks the binder. But we'd like to go
>>>>> about that in a disciplined way, and make sure we don't filter out
>>>>> things that are required by other things, so we use a dependency set.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Jorn
>>>>>
>>>>> Maurizio Cimadamore wrote on 2019-03-11 16:12:
>>>>>> On 11/03/2019 13:45, Jorn Vernee wrote:
>>>>>>> I can separate the parts of the patch a little bit into: Filter
>>>>>>> refactor + root set compute, and then leave the option changes
>>>>>>> out of it. But those 2 alone do not affect the filtering, since
>>>>>>> the root set is only used when filtering non-symbol/macro elements.
>>>>>>
>>>>>> I guess then what I'm suggesting is to automatically filter out
>>>>>> elements not in the root set, and see how that works out.
>>>>>>
>>>>>> Maurizio