[foreign] RFR 8220544: Jextract; library dependent header filtering

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Mar 14 13:24:26 UTC 2019


I think we need Sundar's input on this; I'm afraid that this change can 
upset TypedefHandler... e.g. given this:

typedef struct BAR* FOO;

I think it will attempt to redefine the struct tree for BAR with the 
name FOO, which is not what we want...

Probably this argues for a separate field... (or for typeDefinition() to 
accept a 'recurse' flag of some kind).

Maurizio

On 14/03/2019 12:45, Jorn Vernee wrote:
> Updated webrev: 
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>
> Changed Treemaker implementation to recurse to the 'root' type when 
> creating TypdefTree::def, and using that in DependencyFinder.
>
> Jorn
>
> Jorn Vernee schreef op 2019-03-14 12:49:
>> Oh ok, sorry. I'll try that out.
>>
>> Jorn
>>
>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>> I see what you mean.
>>>>
>>>> Well, `struct BAR` has a definition cursor, but `struct BAR*` does 
>>>> not have a definition/declaration, it's a type that's automatically 
>>>> derived from `struct BAR`. So for `struct BAR*` there is no good 
>>>> value for TypedefTree::def.
>>>
>>> I know that there's no good value for struct BAR* - I was wondering if
>>> the parser should detect the pointer, and recurse down, then store
>>> 'struct BAR' into 'def'.
>>>
>>> Maurizio
>>>
>>>>
>>>> TypedefTree::def seems to be a helper method mostly targeted at 
>>>> inline type definitions (looking at TypedefHandler [1]).
>>>>
>>>> I think for my use case using TypedefTree.type().canonicalType() is 
>>>> good enough, but I guess we could query that once in the parser and 
>>>> then store it in a field in TypedefTree?
>>>>
>>>> Jorn
>>>>
>>>> [1] : 
>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>
>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>
>>>>>>      struct BAR { int x; }
>>>>>>
>>>>>>      typedef struct BAR* FOO;
>>>>>>
>>>>>> The TreeMaker code looks at the canonical type of the typedef 
>>>>>> `struct BAR*` which is neither a type definition or declaration, 
>>>>>> so TypedefTree::def is empty. But, in the case of dependecy 
>>>>>> analysis, we still want to inspect that type, get the pointee 
>>>>>> type and add it to the root set.
>>>>>
>>>>> Right - and I think what I'm claiming is that this is probably some
>>>>> kind of a parser bug? That is, if Typedef::def (we should probably
>>>>> find another name, 'def' is not good enough IMHO) is set correctly 
>>>>> for
>>>>>
>>>>> typedef struct BAR FOO;
>>>>>
>>>>> it should also be set for
>>>>>
>>>>> typedef struct BAR* FOO;
>>>>>
>>>>> That is, I claim that from the typedef tree you should always see a
>>>>> referenced type declaration w/o using the clang API.
>>>>>
>>>>> Maurizio
>>>>>
>>>>>>
>>>>>> Jorn
>>>>>>
>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 01:04:
>>>>>>>> So, if I understand correctly, the issue is that if you have:
>>>>>>>>
>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>
>>>>>>>> you want the struct to be added to the root set. Which make sense.
>>>>>>>>
>>>>>>>> Now, with respect to getUnderlyingType - I'm a bit at odds; it 
>>>>>>>> seems
>>>>>>>> to me that the jextract tree should have the right info, under
>>>>>>>> TypedefTree::def, at least that is set by the parser for
>>>>>>>> declarations/definitions.
>>>>>>>
>>>>>>> That exact case is covered by TypedefTree::def, but it doesn't 
>>>>>>> cover a
>>>>>>> case like this:
>>>>>>>
>>>>>>>     struct BAR { int x; }
>>>>>>>
>>>>>>>     typedef struct BAR FOO;
>>>>>>>
>>>>>>> i.e. TypedefTree::def is only set when the typedef contains an 
>>>>>>> inline
>>>>>>> definition or a prototype. But, if the type is just referenced,
>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>
>>>>>>>> Wouldn't visiting TypedefTree::def (if present) be enough (and 
>>>>>>>> more in
>>>>>>>> sync with the rest of jextract) ? After all, that's the kind of 
>>>>>>>> thing
>>>>>>>> you wanna be sure not to miss by the dependency analysis -
>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>
>>>>>>> This is what I had before, but it fell short in some cases.
>>>>>>>
>>>>>>>> In other words, let's say that there's a typedef in an header of a
>>>>>>>> library you want to jextract which is related, via a chain of 
>>>>>>>> other 42
>>>>>>>> typedefs to something in some obscure system library. Are you 
>>>>>>>> sure you
>>>>>>>> want the resulting artifact to have all the 42 intermediate 
>>>>>>>> typedefs?
>>>>>>>> After all, the library you wanna extract has just defined 'its
>>>>>>>> version' of the name for that particular construct... so it's very
>>>>>>>> likely that users of that library will prefer that name to the 
>>>>>>>> other
>>>>>>>> 42 ones?
>>>>>>>
>>>>>>> 42 intermediate typedefs seems a bit extreme...
>>>>>>>
>>>>>>> But, I think you have a point. Putting the typedef and the 
>>>>>>> canonical
>>>>>>> type into the root set is probably enough.
>>>>>>>
>>>>>>> Jorn
>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>> Update webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/ 
>>>>>>>>> This fixes an issue where typedef underlying types were not 
>>>>>>>>> being added to the root set.
>>>>>>>>>
>>>>>>>>> It turns out I was using the wrong API to get the underlying 
>>>>>>>>> type, and the needed API didn't have a binding yet :)
>>>>>>>>>
>>>>>>>>> The changes with respect to the previous webrev;
>>>>>>>>>   * added Cursor.getTypedefDeclUnderlyingType() binding
>>>>>>>>>   * added TypdefTree.underlyingType() method
>>>>>>>>>   * changed DependencyFinder to use this method instead of 
>>>>>>>>> typeDefinition()
>>>>>>>>>   * added test case for the failure case (also simplified test 
>>>>>>>>> class to only run jextract once and reuse the result)
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Jorn
>>>>>>>>>
>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:43:
>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>> Good questions. When no filter patterns are specified that 
>>>>>>>>>>>> is the case
>>>>>>>>>>>> everything is outputted, but when filter patterns are 
>>>>>>>>>>>> specified
>>>>>>>>>>>> headers are only added to the root set if they contain a 
>>>>>>>>>>>> symbol that
>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>
>>>>>>>>>>> FWIW, basically headers with symbols that pass both types of 
>>>>>>>>>>> filter are added to the root set. There's 1.) Filtereing 
>>>>>>>>>>> based on whether a symbol is found in a library, and 2.) 
>>>>>>>>>>> based on the used filter patterns (--include-symbols & 
>>>>>>>>>>> --exclude-symbols). I thought it would be better to exclude 
>>>>>>>>>>> a header from the root set if all the symbols in it were 
>>>>>>>>>>> filtered out by the filter patterns.
>>>>>>>>>>
>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>
>>>>>>>>>> Maurizio
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Jorn
>>>>>>>>>>>
>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:08:
>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>> this looks very good indeed. I'd like to play with it a 
>>>>>>>>>>>>> bit before we
>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>
>>>>>>>>>>>> Please do!
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, one question: if the user does not specify "-l" what 
>>>>>>>>>>>>> happens? I
>>>>>>>>>>>>> guess the root set will then contain all the headers and 
>>>>>>>>>>>>> we just get
>>>>>>>>>>>>> back the full output (which would be ok) ?
>>>>>>>>>>>>
>>>>>>>>>>>> Good questions. When no filter patterns are specified that 
>>>>>>>>>>>> is the case
>>>>>>>>>>>> everything is outputted, but when filter patterns are 
>>>>>>>>>>>> specified
>>>>>>>>>>>> headers are only added to the root set if they contain a 
>>>>>>>>>>>> symbol that
>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe we should just fall back to 'include-everything' when 
>>>>>>>>>>>> no "-l"
>>>>>>>>>>>> options are specified, regardless of filter?
>>>>>>>>>>>>
>>>>>>>>>>>> Jorn
>>>>>>>>>>>>
>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/ 
>>>>>>>>>>>>>> Fixed the bug, also added tests & ran all the examples 
>>>>>>>>>>>>>> again to try and find other bugs (nothing found).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:50:
>>>>>>>>>>>>>>> Discovered a bug in this after posting :(
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> DependencyFinder also needs to look for pointee and 
>>>>>>>>>>>>>>> array element type
>>>>>>>>>>>>>>> cursors (which it currently doesn't). I've tested a 
>>>>>>>>>>>>>>> working fix, but
>>>>>>>>>>>>>>> will also add some tests for the particular cases. 
>>>>>>>>>>>>>>> Updated webrev
>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:21:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html 
>>>>>>>>>>>>>>>> I've implemented the discussed approach to header & 
>>>>>>>>>>>>>>>> declaration filtering.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8220544
>>>>>>>>>>>>>>>> Webrev: 
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/ 
>>>>>>>>>>>>>>>> There are a few new implementation classes:
>>>>>>>>>>>>>>>>   * PattertnFilter; a helper class that encapsulates 
>>>>>>>>>>>>>>>> the Pattern based
>>>>>>>>>>>>>>>> filtering code previously found in SymbolFilter. This 
>>>>>>>>>>>>>>>> code had to be
>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>   * Filters; a Context sub-component holding 
>>>>>>>>>>>>>>>> PatternFilters for
>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>   * DependencyFinder; finds dependencies of the root 
>>>>>>>>>>>>>>>> set of
>>>>>>>>>>>>>>>> declarations found in the root headers.
>>>>>>>>>>>>>>>>   * ElementFilter; filters out non-library symbol 
>>>>>>>>>>>>>>>> elements based on
>>>>>>>>>>>>>>>> whether they appear in a root header or are required by 
>>>>>>>>>>>>>>>> something
>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>   * RootSet; a helper class for managing root headers + 
>>>>>>>>>>>>>>>> required element Trees.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also encountered some crashes when debug printing 
>>>>>>>>>>>>>>>> Trees, which I've
>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FWIW, I tried this out with the Windows registry API 
>>>>>>>>>>>>>>>> and it's working
>>>>>>>>>>>>>>>> very nicely! Going from well over 100 headers to under 
>>>>>>>>>>>>>>>> 10. Some room
>>>>>>>>>>>>>>>> for improvement still exists; some empty headers and 
>>>>>>>>>>>>>>>> static forwarders
>>>>>>>>>>>>>>>> could still be omitted, but imho that should be handled 
>>>>>>>>>>>>>>>> by a separate
>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Jorn


More information about the panama-dev mailing list