[foreign] RFR 8220544: Jextract; library dependent header filtering

Jorn Vernee jbvernee at xs4all.nl
Thu Mar 14 15:39:43 UTC 2019


Thanks for running the tests,

I ran into a similar problem with enums with the OpenGL sample. Also, 
one of the samples was still using the `-m` flag. I've put those on my 
TODO list. I thought we'd probably get a chance to check the samples 
over before submitting the next snapshot.

Thanks,
Jorn

Sundararajan Athijegannathan schreef op 2019-03-14 15:34:
> I ran tests and samples on Mac OS. All fine -- except that there was
> unrelated sample breaking (preexisting independent of current patch).
> 
> Fixed panama_foreign.md file and .html file ->
> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004896.html
> With the fix, all samples are fine.
> 
> PS. Yet to review the current patch though - just tested on Mac.
> 
> -Sundar
> 
> On 14/03/19, 7:57 PM, Maurizio Cimadamore wrote:
>> Hi Jorn,
>> I gave this a try with our OpenGL and Python examples. The extraction 
>> process works fine, and the generated jars are a bit smaller - but I 
>> still see a lot of system-related headers. Python has still lots of 
>> them, OpenGL less so, but even in the GL case, I think it's 
>> interesting, because I can't see references to types mentioned in 
>> these headers (nor in the resulting classfiles). Do you know, on top 
>> of your head, as to why this could be happening?
>> 
>> Maurizio
>> 
>> 
>> On 14/03/2019 12:45, Jorn Vernee wrote:
>>> Updated webrev: 
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>>> 
>>> Changed Treemaker implementation to recurse to the 'root' type when 
>>> creating TypdefTree::def, and using that in DependencyFinder.
>>> 
>>> Jorn
>>> 
>>> Jorn Vernee schreef op 2019-03-14 12:49:
>>>> Oh ok, sorry. I'll try that out.
>>>> 
>>>> Jorn
>>>> 
>>>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>>>> I see what you mean.
>>>>>> 
>>>>>> Well, `struct BAR` has a definition cursor, but `struct BAR*` does 
>>>>>> not have a definition/declaration, it's a type that's 
>>>>>> automatically derived from `struct BAR`. So for `struct BAR*` 
>>>>>> there is no good value for TypedefTree::def.
>>>>> 
>>>>> I know that there's no good value for struct BAR* - I was wondering 
>>>>> if
>>>>> the parser should detect the pointer, and recurse down, then store
>>>>> 'struct BAR' into 'def'.
>>>>> 
>>>>> Maurizio
>>>>> 
>>>>>> 
>>>>>> TypedefTree::def seems to be a helper method mostly targeted at 
>>>>>> inline type definitions (looking at TypedefHandler [1]).
>>>>>> 
>>>>>> I think for my use case using TypedefTree.type().canonicalType() 
>>>>>> is good enough, but I guess we could query that once in the parser 
>>>>>> and then store it in a field in TypedefTree?
>>>>>> 
>>>>>> Jorn
>>>>>> 
>>>>>> [1] : 
>>>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>>> 
>>>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>>> 
>>>>>>>>      struct BAR { int x; }
>>>>>>>> 
>>>>>>>>      typedef struct BAR* FOO;
>>>>>>>> 
>>>>>>>> The TreeMaker code looks at the canonical type of the typedef 
>>>>>>>> `struct BAR*` which is neither a type definition or declaration, 
>>>>>>>> so TypedefTree::def is empty. But, in the case of dependecy 
>>>>>>>> analysis, we still want to inspect that type, get the pointee 
>>>>>>>> type and add it to the root set.
>>>>>>> 
>>>>>>> Right - and I think what I'm claiming is that this is probably 
>>>>>>> some
>>>>>>> kind of a parser bug? That is, if Typedef::def (we should 
>>>>>>> probably
>>>>>>> find another name, 'def' is not good enough IMHO) is set 
>>>>>>> correctly for
>>>>>>> 
>>>>>>> typedef struct BAR FOO;
>>>>>>> 
>>>>>>> it should also be set for
>>>>>>> 
>>>>>>> typedef struct BAR* FOO;
>>>>>>> 
>>>>>>> That is, I claim that from the typedef tree you should always see 
>>>>>>> a
>>>>>>> referenced type declaration w/o using the clang API.
>>>>>>> 
>>>>>>> Maurizio
>>>>>>> 
>>>>>>>> 
>>>>>>>> Jorn
>>>>>>>> 
>>>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 01:04:
>>>>>>>>>> So, if I understand correctly, the issue is that if you have:
>>>>>>>>>> 
>>>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>>> 
>>>>>>>>>> you want the struct to be added to the root set. Which make 
>>>>>>>>>> sense.
>>>>>>>>>> 
>>>>>>>>>> Now, with respect to getUnderlyingType - I'm a bit at odds; it 
>>>>>>>>>> seems
>>>>>>>>>> to me that the jextract tree should have the right info, under
>>>>>>>>>> TypedefTree::def, at least that is set by the parser for
>>>>>>>>>> declarations/definitions.
>>>>>>>>> 
>>>>>>>>> That exact case is covered by TypedefTree::def, but it doesn't 
>>>>>>>>> cover a
>>>>>>>>> case like this:
>>>>>>>>> 
>>>>>>>>>     struct BAR { int x; }
>>>>>>>>> 
>>>>>>>>>     typedef struct BAR FOO;
>>>>>>>>> 
>>>>>>>>> i.e. TypedefTree::def is only set when the typedef contains an 
>>>>>>>>> inline
>>>>>>>>> definition or a prototype. But, if the type is just referenced,
>>>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>>> 
>>>>>>>>>> Wouldn't visiting TypedefTree::def (if present) be enough (and 
>>>>>>>>>> more in
>>>>>>>>>> sync with the rest of jextract) ? After all, that's the kind 
>>>>>>>>>> of thing
>>>>>>>>>> you wanna be sure not to miss by the dependency analysis -
>>>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>>> 
>>>>>>>>> This is what I had before, but it fell short in some cases.
>>>>>>>>> 
>>>>>>>>>> In other words, let's say that there's a typedef in an header 
>>>>>>>>>> of a
>>>>>>>>>> library you want to jextract which is related, via a chain of 
>>>>>>>>>> other 42
>>>>>>>>>> typedefs to something in some obscure system library. Are you 
>>>>>>>>>> sure you
>>>>>>>>>> want the resulting artifact to have all the 42 intermediate 
>>>>>>>>>> typedefs?
>>>>>>>>>> After all, the library you wanna extract has just defined 'its
>>>>>>>>>> version' of the name for that particular construct... so it's 
>>>>>>>>>> very
>>>>>>>>>> likely that users of that library will prefer that name to the 
>>>>>>>>>> other
>>>>>>>>>> 42 ones?
>>>>>>>>> 
>>>>>>>>> 42 intermediate typedefs seems a bit extreme...
>>>>>>>>> 
>>>>>>>>> But, I think you have a point. Putting the typedef and the 
>>>>>>>>> canonical
>>>>>>>>> type into the root set is probably enough.
>>>>>>>>> 
>>>>>>>>> Jorn
>>>>>>>>> 
>>>>>>>>>> Maurizio
>>>>>>>>>> 
>>>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>>>> Update webrev: 
>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/ 
>>>>>>>>>>> This fixes an issue where typedef underlying types were not 
>>>>>>>>>>> being added to the root set.
>>>>>>>>>>> 
>>>>>>>>>>> It turns out I was using the wrong API to get the underlying 
>>>>>>>>>>> type, and the needed API didn't have a binding yet :)
>>>>>>>>>>> 
>>>>>>>>>>> The changes with respect to the previous webrev;
>>>>>>>>>>>   * added Cursor.getTypedefDeclUnderlyingType() binding
>>>>>>>>>>>   * added TypdefTree.underlyingType() method
>>>>>>>>>>>   * changed DependencyFinder to use this method instead of 
>>>>>>>>>>> typeDefinition()
>>>>>>>>>>>   * added test case for the failure case (also simplified 
>>>>>>>>>>> test class to only run jextract once and reuse the result)
>>>>>>>>>>> 
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Jorn
>>>>>>>>>>> 
>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:43:
>>>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>>>> Good questions. When no filter patterns are specified that 
>>>>>>>>>>>>>> is the case
>>>>>>>>>>>>>> everything is outputted, but when filter patterns are 
>>>>>>>>>>>>>> specified
>>>>>>>>>>>>>> headers are only added to the root set if they contain a 
>>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> FWIW, basically headers with symbols that pass both types 
>>>>>>>>>>>>> of filter are added to the root set. There's 1.) Filtereing 
>>>>>>>>>>>>> based on whether a symbol is found in a library, and 2.) 
>>>>>>>>>>>>> based on the used filter patterns (--include-symbols & 
>>>>>>>>>>>>> --exclude-symbols). I thought it would be better to exclude 
>>>>>>>>>>>>> a header from the root set if all the symbols in it were 
>>>>>>>>>>>>> filtered out by the filter patterns.
>>>>>>>>>>>> 
>>>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>>> 
>>>>>>>>>>>> Maurizio
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:08:
>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>> this looks very good indeed. I'd like to play with it a 
>>>>>>>>>>>>>>> bit before we
>>>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Please do!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Also, one question: if the user does not specify "-l" 
>>>>>>>>>>>>>>> what happens? I
>>>>>>>>>>>>>>> guess the root set will then contain all the headers and 
>>>>>>>>>>>>>>> we just get
>>>>>>>>>>>>>>> back the full output (which would be ok) ?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Good questions. When no filter patterns are specified that 
>>>>>>>>>>>>>> is the case
>>>>>>>>>>>>>> everything is outputted, but when filter patterns are 
>>>>>>>>>>>>>> specified
>>>>>>>>>>>>>> headers are only added to the root set if they contain a 
>>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Maybe we should just fall back to 'include-everything' 
>>>>>>>>>>>>>> when no "-l"
>>>>>>>>>>>>>> options are specified, regardless of filter?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/ 
>>>>>>>>>>>>>>>> Fixed the bug, also added tests & ran all the examples 
>>>>>>>>>>>>>>>> again to try and find other bugs (nothing found).
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:50:
>>>>>>>>>>>>>>>>> Discovered a bug in this after posting :(
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> DependencyFinder also needs to look for pointee and 
>>>>>>>>>>>>>>>>> array element type
>>>>>>>>>>>>>>>>> cursors (which it currently doesn't). I've tested a 
>>>>>>>>>>>>>>>>> working fix, but
>>>>>>>>>>>>>>>>> will also add some tests for the particular cases. 
>>>>>>>>>>>>>>>>> Updated webrev
>>>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:21:
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html 
>>>>>>>>>>>>>>>>>> I've implemented the discussed approach to header & 
>>>>>>>>>>>>>>>>>> declaration filtering.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8220544
>>>>>>>>>>>>>>>>>> Webrev: 
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/ 
>>>>>>>>>>>>>>>>>> There are a few new implementation classes:
>>>>>>>>>>>>>>>>>>   * PattertnFilter; a helper class that encapsulates 
>>>>>>>>>>>>>>>>>> the Pattern based
>>>>>>>>>>>>>>>>>> filtering code previously found in SymbolFilter. This 
>>>>>>>>>>>>>>>>>> code had to be
>>>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>>>   * Filters; a Context sub-component holding 
>>>>>>>>>>>>>>>>>> PatternFilters for
>>>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>>>   * DependencyFinder; finds dependencies of the root 
>>>>>>>>>>>>>>>>>> set of
>>>>>>>>>>>>>>>>>> declarations found in the root headers.
>>>>>>>>>>>>>>>>>>   * ElementFilter; filters out non-library symbol 
>>>>>>>>>>>>>>>>>> elements based on
>>>>>>>>>>>>>>>>>> whether they appear in a root header or are required 
>>>>>>>>>>>>>>>>>> by something
>>>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>>>   * RootSet; a helper class for managing root headers 
>>>>>>>>>>>>>>>>>> + required element Trees.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I also encountered some crashes when debug printing 
>>>>>>>>>>>>>>>>>> Trees, which I've
>>>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> FWIW, I tried this out with the Windows registry API 
>>>>>>>>>>>>>>>>>> and it's working
>>>>>>>>>>>>>>>>>> very nicely! Going from well over 100 headers to under 
>>>>>>>>>>>>>>>>>> 10. Some room
>>>>>>>>>>>>>>>>>> for improvement still exists; some empty headers and 
>>>>>>>>>>>>>>>>>> static forwarders
>>>>>>>>>>>>>>>>>> could still be omitted, but imho that should be 
>>>>>>>>>>>>>>>>>> handled by a separate
>>>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Jorn


More information about the panama-dev mailing list