[foreign] RFR 8220544: Jextract; library dependent header filtering
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Mar 14 14:55:29 UTC 2019
I think I got at the bottom of this. I tried debugging an extraction of
opengl, and it eventually ends up adding inttypes.h to the rootset
because of this symbol:
imaxabs
The library filter doesn't work here, because this is a function defined
in the set of default libraries that comes with the VM, so jextract
'thinks' this symbol is genuinely part of one of the libraries specified
in the command line - except is not.
I tried an hacky fix which checks for absence of symbol in
Libraries.getDefaultLibrary - and that seems to give the expected
results - but I leave this to your consideration.
Maurizio
On 14/03/2019 14:27, Maurizio Cimadamore wrote:
> Hi Jorn,
> I gave this a try with our OpenGL and Python examples. The extraction
> process works fine, and the generated jars are a bit smaller - but I
> still see a lot of system-related headers. Python has still lots of
> them, OpenGL less so, but even in the GL case, I think it's
> interesting, because I can't see references to types mentioned in
> these headers (nor in the resulting classfiles). Do you know, on top
> of your head, as to why this could be happening?
>
> Maurizio
>
>
> On 14/03/2019 12:45, Jorn Vernee wrote:
>> Updated webrev:
>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>>
>> Changed Treemaker implementation to recurse to the 'root' type when
>> creating TypdefTree::def, and using that in DependencyFinder.
>>
>> Jorn
>>
>> Jorn Vernee schreef op 2019-03-14 12:49:
>>> Oh ok, sorry. I'll try that out.
>>>
>>> Jorn
>>>
>>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>>> I see what you mean.
>>>>>
>>>>> Well, `struct BAR` has a definition cursor, but `struct BAR*` does
>>>>> not have a definition/declaration, it's a type that's
>>>>> automatically derived from `struct BAR`. So for `struct BAR*`
>>>>> there is no good value for TypedefTree::def.
>>>>
>>>> I know that there's no good value for struct BAR* - I was wondering if
>>>> the parser should detect the pointer, and recurse down, then store
>>>> 'struct BAR' into 'def'.
>>>>
>>>> Maurizio
>>>>
>>>>>
>>>>> TypedefTree::def seems to be a helper method mostly targeted at
>>>>> inline type definitions (looking at TypedefHandler [1]).
>>>>>
>>>>> I think for my use case using TypedefTree.type().canonicalType()
>>>>> is good enough, but I guess we could query that once in the parser
>>>>> and then store it in a field in TypedefTree?
>>>>>
>>>>> Jorn
>>>>>
>>>>> [1] :
>>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>>
>>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>>
>>>>>>> struct BAR { int x; }
>>>>>>>
>>>>>>> typedef struct BAR* FOO;
>>>>>>>
>>>>>>> The TreeMaker code looks at the canonical type of the typedef
>>>>>>> `struct BAR*` which is neither a type definition or declaration,
>>>>>>> so TypedefTree::def is empty. But, in the case of dependecy
>>>>>>> analysis, we still want to inspect that type, get the pointee
>>>>>>> type and add it to the root set.
>>>>>>
>>>>>> Right - and I think what I'm claiming is that this is probably some
>>>>>> kind of a parser bug? That is, if Typedef::def (we should probably
>>>>>> find another name, 'def' is not good enough IMHO) is set
>>>>>> correctly for
>>>>>>
>>>>>> typedef struct BAR FOO;
>>>>>>
>>>>>> it should also be set for
>>>>>>
>>>>>> typedef struct BAR* FOO;
>>>>>>
>>>>>> That is, I claim that from the typedef tree you should always see a
>>>>>> referenced type declaration w/o using the clang API.
>>>>>>
>>>>>> Maurizio
>>>>>>
>>>>>>>
>>>>>>> Jorn
>>>>>>>
>>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 01:04:
>>>>>>>>> So, if I understand correctly, the issue is that if you have:
>>>>>>>>>
>>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>>
>>>>>>>>> you want the struct to be added to the root set. Which make
>>>>>>>>> sense.
>>>>>>>>>
>>>>>>>>> Now, with respect to getUnderlyingType - I'm a bit at odds; it
>>>>>>>>> seems
>>>>>>>>> to me that the jextract tree should have the right info, under
>>>>>>>>> TypedefTree::def, at least that is set by the parser for
>>>>>>>>> declarations/definitions.
>>>>>>>>
>>>>>>>> That exact case is covered by TypedefTree::def, but it doesn't
>>>>>>>> cover a
>>>>>>>> case like this:
>>>>>>>>
>>>>>>>> struct BAR { int x; }
>>>>>>>>
>>>>>>>> typedef struct BAR FOO;
>>>>>>>>
>>>>>>>> i.e. TypedefTree::def is only set when the typedef contains an
>>>>>>>> inline
>>>>>>>> definition or a prototype. But, if the type is just referenced,
>>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>>
>>>>>>>>> Wouldn't visiting TypedefTree::def (if present) be enough (and
>>>>>>>>> more in
>>>>>>>>> sync with the rest of jextract) ? After all, that's the kind
>>>>>>>>> of thing
>>>>>>>>> you wanna be sure not to miss by the dependency analysis -
>>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>>
>>>>>>>> This is what I had before, but it fell short in some cases.
>>>>>>>>
>>>>>>>>> In other words, let's say that there's a typedef in an header
>>>>>>>>> of a
>>>>>>>>> library you want to jextract which is related, via a chain of
>>>>>>>>> other 42
>>>>>>>>> typedefs to something in some obscure system library. Are you
>>>>>>>>> sure you
>>>>>>>>> want the resulting artifact to have all the 42 intermediate
>>>>>>>>> typedefs?
>>>>>>>>> After all, the library you wanna extract has just defined 'its
>>>>>>>>> version' of the name for that particular construct... so it's
>>>>>>>>> very
>>>>>>>>> likely that users of that library will prefer that name to the
>>>>>>>>> other
>>>>>>>>> 42 ones?
>>>>>>>>
>>>>>>>> 42 intermediate typedefs seems a bit extreme...
>>>>>>>>
>>>>>>>> But, I think you have a point. Putting the typedef and the
>>>>>>>> canonical
>>>>>>>> type into the root set is probably enough.
>>>>>>>>
>>>>>>>> Jorn
>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>>> Update webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/
>>>>>>>>>> This fixes an issue where typedef underlying types were not
>>>>>>>>>> being added to the root set.
>>>>>>>>>>
>>>>>>>>>> It turns out I was using the wrong API to get the underlying
>>>>>>>>>> type, and the needed API didn't have a binding yet :)
>>>>>>>>>>
>>>>>>>>>> The changes with respect to the previous webrev;
>>>>>>>>>> * added Cursor.getTypedefDeclUnderlyingType() binding
>>>>>>>>>> * added TypdefTree.underlyingType() method
>>>>>>>>>> * changed DependencyFinder to use this method instead of
>>>>>>>>>> typeDefinition()
>>>>>>>>>> * added test case for the failure case (also simplified
>>>>>>>>>> test class to only run jextract once and reuse the result)
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Jorn
>>>>>>>>>>
>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:43:
>>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>>> Good questions. When no filter patterns are specified that
>>>>>>>>>>>>> is the case
>>>>>>>>>>>>> everything is outputted, but when filter patterns are
>>>>>>>>>>>>> specified
>>>>>>>>>>>>> headers are only added to the root set if they contain a
>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>
>>>>>>>>>>>> FWIW, basically headers with symbols that pass both types
>>>>>>>>>>>> of filter are added to the root set. There's 1.) Filtereing
>>>>>>>>>>>> based on whether a symbol is found in a library, and 2.)
>>>>>>>>>>>> based on the used filter patterns (--include-symbols &
>>>>>>>>>>>> --exclude-symbols). I thought it would be better to exclude
>>>>>>>>>>>> a header from the root set if all the symbols in it were
>>>>>>>>>>>> filtered out by the filter patterns.
>>>>>>>>>>>
>>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>>
>>>>>>>>>>> Maurizio
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Jorn
>>>>>>>>>>>>
>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:08:
>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>> this looks very good indeed. I'd like to play with it a
>>>>>>>>>>>>>> bit before we
>>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please do!
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, one question: if the user does not specify "-l"
>>>>>>>>>>>>>> what happens? I
>>>>>>>>>>>>>> guess the root set will then contain all the headers and
>>>>>>>>>>>>>> we just get
>>>>>>>>>>>>>> back the full output (which would be ok) ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Good questions. When no filter patterns are specified that
>>>>>>>>>>>>> is the case
>>>>>>>>>>>>> everything is outputted, but when filter patterns are
>>>>>>>>>>>>> specified
>>>>>>>>>>>>> headers are only added to the root set if they contain a
>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe we should just fall back to 'include-everything'
>>>>>>>>>>>>> when no "-l"
>>>>>>>>>>>>> options are specified, regardless of filter?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/
>>>>>>>>>>>>>>> Fixed the bug, also added tests & ran all the examples
>>>>>>>>>>>>>>> again to try and find other bugs (nothing found).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:50:
>>>>>>>>>>>>>>>> Discovered a bug in this after posting :(
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> DependencyFinder also needs to look for pointee and
>>>>>>>>>>>>>>>> array element type
>>>>>>>>>>>>>>>> cursors (which it currently doesn't). I've tested a
>>>>>>>>>>>>>>>> working fix, but
>>>>>>>>>>>>>>>> will also add some tests for the particular cases.
>>>>>>>>>>>>>>>> Updated webrev
>>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:21:
>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html
>>>>>>>>>>>>>>>>> I've implemented the discussed approach to header &
>>>>>>>>>>>>>>>>> declaration filtering.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8220544
>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/
>>>>>>>>>>>>>>>>> There are a few new implementation classes:
>>>>>>>>>>>>>>>>> * PattertnFilter; a helper class that encapsulates
>>>>>>>>>>>>>>>>> the Pattern based
>>>>>>>>>>>>>>>>> filtering code previously found in SymbolFilter. This
>>>>>>>>>>>>>>>>> code had to be
>>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>> * Filters; a Context sub-component holding
>>>>>>>>>>>>>>>>> PatternFilters for
>>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>> * DependencyFinder; finds dependencies of the root
>>>>>>>>>>>>>>>>> set of
>>>>>>>>>>>>>>>>> declarations found in the root headers.
>>>>>>>>>>>>>>>>> * ElementFilter; filters out non-library symbol
>>>>>>>>>>>>>>>>> elements based on
>>>>>>>>>>>>>>>>> whether they appear in a root header or are required
>>>>>>>>>>>>>>>>> by something
>>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>> * RootSet; a helper class for managing root headers
>>>>>>>>>>>>>>>>> + required element Trees.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I also encountered some crashes when debug printing
>>>>>>>>>>>>>>>>> Trees, which I've
>>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> FWIW, I tried this out with the Windows registry API
>>>>>>>>>>>>>>>>> and it's working
>>>>>>>>>>>>>>>>> very nicely! Going from well over 100 headers to under
>>>>>>>>>>>>>>>>> 10. Some room
>>>>>>>>>>>>>>>>> for improvement still exists; some empty headers and
>>>>>>>>>>>>>>>>> static forwarders
>>>>>>>>>>>>>>>>> could still be omitted, but imho that should be
>>>>>>>>>>>>>>>>> handled by a separate
>>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Jorn
More information about the panama-dev
mailing list