[foreign] RFR 8220544: Jextract; library dependent header filtering

Sundararajan Athijegannathan sundararajan.athijegannathan at oracle.com
Tue Mar 19 02:41:45 UTC 2019


I tested this patch on Mac. All tests and samples are fine on Mac.

PS. I'm yet to review the code changes.

-Sundar

On 18/03/19, 4:28 PM, Jorn Vernee wrote:
> Hi,
>
> I have implemented this: 
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.06/
>
> This adds 2 options --include-headers and --exclude-headers that take 
> regular expression patterns as an argument. The patterns are matched 
> against a cursor's full header file path to determine if it appears in 
> the header root set.
>
> For example, I can use the following to include headers in the Python 
> include directory:
>
>     --include-headers "J:\\Python27\\include\\.*"
>
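A minimal sketch of how such include/exclude patterns could be applied to a header's full path. The class and method names (HeaderFilter, isRootHeader) are hypothetical and not taken from the webrev, and whole-path matching with Matcher.matches() is an assumption. An empty include list is treated as "all headers are in the root set", as proposed further down the thread.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not jextract's actual implementation.
final class HeaderFilter {
    private final List<Pattern> includes;
    private final List<Pattern> excludes;

    HeaderFilter(List<String> includePatterns, List<String> excludePatterns) {
        this.includes = compileAll(includePatterns);
        this.excludes = compileAll(excludePatterns);
    }

    private static List<Pattern> compileAll(List<String> regexes) {
        return regexes.stream().map(Pattern::compile).collect(Collectors.toList());
    }

    // Assumption: a pattern must match the whole header path.
    private static boolean anyMatch(List<Pattern> patterns, String path) {
        return patterns.stream().anyMatch(p -> p.matcher(path).matches());
    }

    // A header is in the root set if it is included (or no include
    // pattern was given at all) and it is not excluded.
    boolean isRootHeader(String fullHeaderPath) {
        boolean included = includes.isEmpty() || anyMatch(includes, fullHeaderPath);
        return included && !anyMatch(excludes, fullHeaderPath);
    }
}
```

Note that in Java source the pattern from the example above has to be written as "J:\\\\Python27\\\\include\\\\.*", because each backslash of the regex is escaped once more inside the string literal.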
> ---
>
> To recap, we now:
>   1.) Filter functions, vars, and macros by whether they appear in a 
> root header.
>   2.) Filter functions, vars, and macros by the 
> --include-symbols/--exclude-symbols patterns.
>   3.) Filter functions and vars by whether they appear in any shared 
> library.
>   4.) For the remaining trees in the root headers do dependency 
> analysis, and find the required structs, enums, and typedefs.
>   5.) For structs, enums, and typedefs include them if they appear in 
> a root header OR are required as dependency.
>
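The five steps recapped above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the webrev: Decl, Kind and select are hypothetical names, declarations are reduced to a name, a kind, a header path and the names of the types they reference, and the dependency analysis is taken to start from everything that survives in a root header.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of the filtering pipeline; not jextract's real classes.
final class FilterPipeline {
    enum Kind { FUNCTION, VAR, MACRO, STRUCT, ENUM, TYPEDEF }

    static final class Decl {
        final String name;
        final Kind kind;
        final String header;
        final Set<String> deps; // names of types this declaration refers to

        Decl(String name, Kind kind, String header, String... deps) {
            this.name = name;
            this.kind = kind;
            this.header = header;
            this.deps = new LinkedHashSet<>(Arrays.asList(deps));
        }

        boolean isType() {
            return kind == Kind.STRUCT || kind == Kind.ENUM || kind == Kind.TYPEDEF;
        }
    }

    static Set<String> select(List<Decl> decls,
                              Predicate<String> isRootHeader,
                              Predicate<String> symbolFilter,
                              Set<String> librarySymbols) {
        Map<String, Decl> byName = new HashMap<>();
        for (Decl d : decls) byName.put(d.name, d);

        Set<String> kept = new LinkedHashSet<>();
        Deque<Decl> work = new ArrayDeque<>();
        for (Decl d : decls) {
            if (!isRootHeader.test(d.header)) continue;            // step 1 (and root part of step 5)
            if (!d.isType()) {
                if (!symbolFilter.test(d.name)) continue;          // step 2
                if (d.kind != Kind.MACRO
                        && !librarySymbols.contains(d.name)) continue; // step 3: functions and vars only
            }
            kept.add(d.name);
            work.push(d);
        }
        // Step 4: follow dependencies transitively to find required types.
        while (!work.isEmpty()) {
            Decl d = work.pop();
            for (String dep : d.deps) {
                Decl t = byName.get(dep);
                // Step 5: a type is kept if it is required, even outside the root set.
                if (t != null && t.isType() && kept.add(t.name)) work.push(t);
            }
        }
        return kept;
    }
}
```

With this shape, a struct from a non-root header survives only when something that was kept refers to it, which matches the "root header OR required as dependency" rule in step 5.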
> Cheers,
> Jorn
>
> Maurizio Cimadamore wrote on 2019-03-15 19:29:
>> On 15/03/2019 18:21, Jorn Vernee wrote:
>>> Heh - was writing an email pretty much with this suggestion :P
>>>
>>> I think this is a good idea! If you don't think this adds too much 
>>> complexity to the options?
>>>
>>> So we'd basically have:
>>>
>>>   Without --include/exclude-header -> same behavior we have right now.
>>>   With --include/exclude-header -> Add included headers to root set 
>>> + find any dependencies, and put both in the generated artifact.
>>>
>>> I like that it does no filtering by default, so ad-hoc users don't 
>>> have to figure out which header files define which functions.
>>
>> Yep - I like this too. It goes towards the 'no magic' goal, and it
>> adds the filter option that makes most sense when you want to chop
>> down an API for good. This is a filtering mechanism after all, the
>> fact that it's coming back as a filtering option is, IMHO, a plus, not a
>> minus.
>>
>> Btw, slight revision to what I said previously - let's make the
>> argument of include/exclude regexes too (as for others). This way it
>> can be even easier to use, most headers do have some commonalities in
>> them - and I like that, with that, we can basically do the path-based
>> heuristics w/o hardwiring that into jextract.
>>
>> Maurizio
>>
>>>
>>> Jorn
>>>
>>> Maurizio Cimadamore wrote on 2019-03-15 19:07:
>>>> On 15/03/2019 18:01, Maurizio Cimadamore wrote:
>>>>>
>>>>> On 15/03/2019 17:48, Jorn Vernee wrote:
>>>>>> I've implemented this, and now doing a clean build based on the 
>>>>>> latest jdk/jdk merge before submitting the next webrev.
>>>>>>
>>>>>> It's working nicely, with one caveat; Some headers rely on 
>>>>>> pre-processor code defined in a parent header. For instance, 
>>>>>> Python.h defines some pre-processor code used in pythonrun.h . 
>>>>>> The example we have only uses functions defined in pythonrun.h, 
>>>>>> but if we just pass that header to jextract Clang throws an error 
>>>>>> because of the missing pre-processor code from Python.h
>>>>>>
>>>>>> In this case we have to pass Python.h first, and then pythonrun.h 
>>>>>> to get everything to work. This also relies on the existence of 
>>>>>> header include guards in the pythonrun.h header (since we're 
>>>>>> basically including it twice). A similar caveat exists with the 
>>>>>> Windows API.
>>>>>
>>>>> Doh! Maybe I've missed the simplest option after all.
>>>>>
>>>>> What if the root set was a 'first class' concept in the extraction 
>>>>> run, rather than something we infer from this or that command line 
>>>>> option?
>>>>>
>>>>> That way we could point jextract at python.h, but then say "hey, 
>>>>> I'm only really interested in stuff that comes from pythonrun.h".
>>>>
>>>> In other words, right now we have
>>>>
>>>> --include-symbol
>>>>
>>>> and
>>>>
>>>> --exclude-symbol
>>>>
>>>> If not set, everything is included.
>>>>
>>>> Maybe all we need is:
>>>>
>>>> --include-header <header>
>>>>
>>>> --exclude-header <header>
>>>>
>>>> And, again, if none is specified, all headers are part of the 'root 
>>>> set'.
>>>>
>>>> And, to address a concern you had - yes, I'd consider using the
>>>> 'include path' for header names in --include/exclude-header
>>>>
>>>> Maurizio
>>>>
>>>>>
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>> So, the guide-line is: Pass the main header first, then internal 
>>>>>> headers. e.g. If we have a main header A.h which includes 
>>>>>> a_impl.h, and another main header B.h which includes b_impl.h, 
>>>>>> headers should be passed to jextract in the order: A.h a_impl.h 
>>>>>> B.h b_impl.h
>>>>>>
>>>>>> I think the behavior being dependent on the ordering of the 
>>>>>> headers could be fixed by sorting the headers in topological 
>>>>>> order. But, there's still the requirement to pass all of them, or 
>>>>>> things break.
>>>>>>
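A minimal sketch of the topological ordering idea mentioned above, assuming the direct #include relation between the passed headers is already known. HeaderOrder and sort are hypothetical names, and the include map would have to come from the parser in practice. The sort uses Kahn's algorithm and places every header before the headers it includes, so a main header always precedes its internal headers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the include graph is assumed to be given.
final class HeaderOrder {
    // headers:  the headers passed on the command line, in any order
    // includes: header -> headers it directly includes
    static List<String> sort(List<String> headers, Map<String, List<String>> includes) {
        // Count, for each passed header, how many other passed headers include it.
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String h : headers) inDegree.put(h, 0);
        for (String h : inDegree.keySet()) {
            for (String inc : includes.getOrDefault(h, List.of())) {
                if (inDegree.containsKey(inc)) inDegree.merge(inc, 1, Integer::sum);
            }
        }

        // Headers that nobody includes are ready to be emitted first.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }

        List<String> ordered = new ArrayList<>();
        while (!ready.isEmpty()) {
            String h = ready.poll();
            ordered.add(h);
            for (String inc : includes.getOrDefault(h, List.of())) {
                if (inDegree.containsKey(inc)
                        && inDegree.merge(inc, -1, Integer::sum) == 0) {
                    ready.add(inc);
                }
            }
        }
        if (ordered.size() != inDegree.size()) {
            throw new IllegalArgumentException("include cycle among the given headers");
        }
        return ordered;
    }
}
```

This only removes the dependence on argument order. As noted above, all the relevant headers still have to be passed.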
>>>>>> Any thoughts about that?
>>>>>>
>>>>>> Thanks,
>>>>>> Jorn
>>>>>>
>>>>>>
>>>>>> Jorn Vernee wrote on 2019-03-15 14:40:
>>>>>>> I've already been using shell scripts mostly when running jextract
>>>>>>> (except for simple examples). I find it very useful to be able to
>>>>>>> split the command over multiple lines, especially long file paths
>>>>>>> become much more readable.
>>>>>>>
>>>>>>> I'll start working on this then.
>>>>>>>
>>>>>>> Jorn
>>>>>>>
>>>>>>> Maurizio Cimadamore wrote on 2019-03-15 14:21:
>>>>>>>> On 15/03/2019 13:15, Jorn Vernee wrote:
>>>>>>>>> I still like this approach, and I think adding support for 
>>>>>>>>> wildcard patterns and/or header filters would make it better.
>>>>>>>>>
>>>>>>>>> Like you said: It's dead simple. What you pass to jextract is 
>>>>>>>>> what you get. Though, we could still apply dependency 
>>>>>>>>> analysis, which would make sure nothing that's needed gets 
>>>>>>>>> dropped.
>>>>>>>>
>>>>>>>> Why don't we try this then?
>>>>>>>>
>>>>>>>> Note that in javac we often use this trick:
>>>>>>>>
>>>>>>>> javac `find <path> -name "*.java"`
>>>>>>>>
>>>>>>>> which works
>>>>>>>>
>>>>>>>> also, note that jextract also accepts a file with the @ syntax, 
>>>>>>>> etc.
>>>>>>>>
>>>>>>>> jextract @args.txt
>>>>>>>>
>>>>>>>> where args.txt is the command line (which could list all the 
>>>>>>>> headers you want!).
>>>>>>>>
>>>>>>>> Maybe this is indeed the simpler approach.
>>>>>>>>
>>>>>>>> Maurizio

