[foreign] RFR 8220544: Jextract; library dependent header filtering
Jorn Vernee
jbvernee at xs4all.nl
Thu Mar 14 16:08:59 UTC 2019
This produces test.class and test_h.class as expected, but nothing
else...
Used command is: `jextract -L . -l test test.h`
You would get stuff from string.h when the -l is omitted.
Jorn
Maurizio Cimadamore schreef op 2019-03-14 17:00:
> I think you can reproduce simply by having an header like this:
>
> #include <string.h>
>
> void foo(int i);
>
>
> if you extract that you would expect one symbol, but in reality you
> get also all the stuff from string.h.
>
> Maurizio
>
> On 14/03/2019 15:49, Jorn Vernee wrote:
>> The Windows jextract artifact for the OpenGL sample is a different
>> beast, because it relies on the standard Windows OpenGL libraries. So,
>> seeing some system headers is to be expected :) In the case of python
>> I'm also seeing some system headers, because python depends on a hand
>> full of corecrt types, so that's also correct. But, neither one seems
>> to be referencing imaxabs, so I'm not sure what's going on...
>>
>> I'll try to see if I can find a repro example on Windows to see about
>> a fix. I'm not sure I see how the SymbolFilter could get to the VM
>> libraries? Though I remember seeing a similar case in the past...
>>
>> Jorn
>>
>> Maurizio Cimadamore schreef op 2019-03-14 15:55:
>>> I think I got at the bottom of this. I tried debugging an extraction
>>> of opengl, and it eventually ends up adding inttypes.h to the rootset
>>> because of this symbol:
>>>
>>> imaxabs
>>>
>>> The library filter doesn't work here, because this is a function
>>> defined in the set of default libraries that comes with the VM, so
>>> jextract 'thinks' this symbol is genuinely part of one of the
>>> libraries specified in the command line - except is not.
>>>
>>> I tried an hacky fix which checks for absence of symbol in
>>> Libraries.getDefaultLibrary - and that seems to give the expected
>>> results - but I leave this to your consideration.
>>>
>>> Maurizio
>>>
>>> On 14/03/2019 14:27, Maurizio Cimadamore wrote:
>>>> Hi Jorn,
>>>> I gave this a try with our OpenGL and Python examples. The
>>>> extraction process works fine, and the generated jars are a bit
>>>> smaller - but I still see a lot of system-related headers. Python
>>>> has still lots of them, OpenGL less so, but even in the GL case, I
>>>> think it's interesting, because I can't see references to types
>>>> mentioned in these headers (nor in the resulting classfiles). Do you
>>>> know, on top of your head, as to why this could be happening?
>>>>
>>>> Maurizio
>>>>
>>>>
>>>> On 14/03/2019 12:45, Jorn Vernee wrote:
>>>>> Updated webrev:
>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>>>>>
>>>>> Changed Treemaker implementation to recurse to the 'root' type when
>>>>> creating TypdefTree::def, and using that in DependencyFinder.
>>>>>
>>>>> Jorn
>>>>>
>>>>> Jorn Vernee schreef op 2019-03-14 12:49:
>>>>>> Oh ok, sorry. I'll try that out.
>>>>>>
>>>>>> Jorn
>>>>>>
>>>>>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>>>>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>>>>>> I see what you mean.
>>>>>>>>
>>>>>>>> Well, `struct BAR` has a definition cursor, but `struct BAR*`
>>>>>>>> does not have a definition/declaration, it's a type that's
>>>>>>>> automatically derived from `struct BAR`. So for `struct BAR*`
>>>>>>>> there is no good value for TypedefTree::def.
>>>>>>>
>>>>>>> I know that there's no good value for struct BAR* - I was
>>>>>>> wondering if
>>>>>>> the parser should detect the pointer, and recurse down, then
>>>>>>> store
>>>>>>> 'struct BAR' into 'def'.
>>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>>>
>>>>>>>> TypedefTree::def seems to be a helper method mostly targeted at
>>>>>>>> inline type definitions (looking at TypedefHandler [1]).
>>>>>>>>
>>>>>>>> I think for my use case using TypedefTree.type().canonicalType()
>>>>>>>> is good enough, but I guess we could query that once in the
>>>>>>>> parser and then store it in a field in TypedefTree?
>>>>>>>>
>>>>>>>> Jorn
>>>>>>>>
>>>>>>>> [1] :
>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>>>>>
>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>>>>>
>>>>>>>>>> struct BAR { int x; }
>>>>>>>>>>
>>>>>>>>>> typedef struct BAR* FOO;
>>>>>>>>>>
>>>>>>>>>> The TreeMaker code looks at the canonical type of the typedef
>>>>>>>>>> `struct BAR*` which is neither a type definition or
>>>>>>>>>> declaration, so TypedefTree::def is empty. But, in the case of
>>>>>>>>>> dependecy analysis, we still want to inspect that type, get
>>>>>>>>>> the pointee type and add it to the root set.
>>>>>>>>>
>>>>>>>>> Right - and I think what I'm claiming is that this is probably
>>>>>>>>> some
>>>>>>>>> kind of a parser bug? That is, if Typedef::def (we should
>>>>>>>>> probably
>>>>>>>>> find another name, 'def' is not good enough IMHO) is set
>>>>>>>>> correctly for
>>>>>>>>>
>>>>>>>>> typedef struct BAR FOO;
>>>>>>>>>
>>>>>>>>> it should also be set for
>>>>>>>>>
>>>>>>>>> typedef struct BAR* FOO;
>>>>>>>>>
>>>>>>>>> That is, I claim that from the typedef tree you should always
>>>>>>>>> see a
>>>>>>>>> referenced type declaration w/o using the clang API.
>>>>>>>>>
>>>>>>>>> Maurizio
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jorn
>>>>>>>>>>
>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 01:04:
>>>>>>>>>>>> So, if I understand correctly, the issue is that if you
>>>>>>>>>>>> have:
>>>>>>>>>>>>
>>>>>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>>>>>
>>>>>>>>>>>> you want the struct to be added to the root set. Which make
>>>>>>>>>>>> sense.
>>>>>>>>>>>>
>>>>>>>>>>>> Now, with respect to getUnderlyingType - I'm a bit at odds;
>>>>>>>>>>>> it seems
>>>>>>>>>>>> to me that the jextract tree should have the right info,
>>>>>>>>>>>> under
>>>>>>>>>>>> TypedefTree::def, at least that is set by the parser for
>>>>>>>>>>>> declarations/definitions.
>>>>>>>>>>>
>>>>>>>>>>> That exact case is covered by TypedefTree::def, but it
>>>>>>>>>>> doesn't cover a
>>>>>>>>>>> case like this:
>>>>>>>>>>>
>>>>>>>>>>> struct BAR { int x; }
>>>>>>>>>>>
>>>>>>>>>>> typedef struct BAR FOO;
>>>>>>>>>>>
>>>>>>>>>>> i.e. TypedefTree::def is only set when the typedef contains
>>>>>>>>>>> an inline
>>>>>>>>>>> definition or a prototype. But, if the type is just
>>>>>>>>>>> referenced,
>>>>>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>>>>>
>>>>>>>>>>>> Wouldn't visiting TypedefTree::def (if present) be enough
>>>>>>>>>>>> (and more in
>>>>>>>>>>>> sync with the rest of jextract) ? After all, that's the kind
>>>>>>>>>>>> of thing
>>>>>>>>>>>> you wanna be sure not to miss by the dependency analysis -
>>>>>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>>>>>
>>>>>>>>>>> This is what I had before, but it fell short in some cases.
>>>>>>>>>>>
>>>>>>>>>>>> In other words, let's say that there's a typedef in an
>>>>>>>>>>>> header of a
>>>>>>>>>>>> library you want to jextract which is related, via a chain
>>>>>>>>>>>> of other 42
>>>>>>>>>>>> typedefs to something in some obscure system library. Are
>>>>>>>>>>>> you sure you
>>>>>>>>>>>> want the resulting artifact to have all the 42 intermediate
>>>>>>>>>>>> typedefs?
>>>>>>>>>>>> After all, the library you wanna extract has just defined
>>>>>>>>>>>> 'its
>>>>>>>>>>>> version' of the name for that particular construct... so
>>>>>>>>>>>> it's very
>>>>>>>>>>>> likely that users of that library will prefer that name to
>>>>>>>>>>>> the other
>>>>>>>>>>>> 42 ones?
>>>>>>>>>>>
>>>>>>>>>>> 42 intermediate typedefs seems a bit extreme...
>>>>>>>>>>>
>>>>>>>>>>> But, I think you have a point. Putting the typedef and the
>>>>>>>>>>> canonical
>>>>>>>>>>> type into the root set is probably enough.
>>>>>>>>>>>
>>>>>>>>>>> Jorn
>>>>>>>>>>>
>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>
>>>>>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>>>>>> Update webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/
>>>>>>>>>>>>> This fixes an issue where typedef underlying types were not
>>>>>>>>>>>>> being added to the root set.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It turns out I was using the wrong API to get the
>>>>>>>>>>>>> underlying type, and the needed API didn't have a binding
>>>>>>>>>>>>> yet :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> The changes with respect to the previous webrev;
>>>>>>>>>>>>> * added Cursor.getTypedefDeclUnderlyingType() binding
>>>>>>>>>>>>> * added TypdefTree.underlyingType() method
>>>>>>>>>>>>> * changed DependencyFinder to use this method instead of
>>>>>>>>>>>>> typeDefinition()
>>>>>>>>>>>>> * added test case for the failure case (also simplified
>>>>>>>>>>>>> test class to only run jextract once and reuse the result)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:43:
>>>>>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>>>>>> Good questions. When no filter patterns are specified
>>>>>>>>>>>>>>>> that is the case
>>>>>>>>>>>>>>>> everything is outputted, but when filter patterns are
>>>>>>>>>>>>>>>> specified
>>>>>>>>>>>>>>>> headers are only added to the root set if they contain a
>>>>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FWIW, basically headers with symbols that pass both types
>>>>>>>>>>>>>>> of filter are added to the root set. There's 1.)
>>>>>>>>>>>>>>> Filtereing based on whether a symbol is found in a
>>>>>>>>>>>>>>> library, and 2.) based on the used filter patterns
>>>>>>>>>>>>>>> (--include-symbols & --exclude-symbols). I thought it
>>>>>>>>>>>>>>> would be better to exclude a header from the root set if
>>>>>>>>>>>>>>> all the symbols in it were filtered out by the filter
>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:08:
>>>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>>>> this looks very good indeed. I'd like to play with it a
>>>>>>>>>>>>>>>>> bit before we
>>>>>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please do!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also, one question: if the user does not specify "-l"
>>>>>>>>>>>>>>>>> what happens? I
>>>>>>>>>>>>>>>>> guess the root set will then contain all the headers
>>>>>>>>>>>>>>>>> and we just get
>>>>>>>>>>>>>>>>> back the full output (which would be ok) ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Good questions. When no filter patterns are specified
>>>>>>>>>>>>>>>> that is the case
>>>>>>>>>>>>>>>> everything is outputted, but when filter patterns are
>>>>>>>>>>>>>>>> specified
>>>>>>>>>>>>>>>> headers are only added to the root set if they contain a
>>>>>>>>>>>>>>>> symbol that
>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe we should just fall back to 'include-everything'
>>>>>>>>>>>>>>>> when no "-l"
>>>>>>>>>>>>>>>> options are specified, regardless of filter?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>> Updated webrev:
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/
>>>>>>>>>>>>>>>>>> Fixed the bug, also added tests & ran all the examples
>>>>>>>>>>>>>>>>>> again to try and find other bugs (nothing found).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:50:
>>>>>>>>>>>>>>>>>>> Discovered a bug in this after posting :(
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> DependencyFinder also needs to look for pointee and
>>>>>>>>>>>>>>>>>>> array element type
>>>>>>>>>>>>>>>>>>> cursors (which it currently doesn't). I've tested a
>>>>>>>>>>>>>>>>>>> working fix, but
>>>>>>>>>>>>>>>>>>> will also add some tests for the particular cases.
>>>>>>>>>>>>>>>>>>> Updated webrev
>>>>>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:21:
>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html
>>>>>>>>>>>>>>>>>>>> I've implemented the discussed approach to header &
>>>>>>>>>>>>>>>>>>>> declaration filtering.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Bug:
>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220544
>>>>>>>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/
>>>>>>>>>>>>>>>>>>>> There are a few new implementation classes:
>>>>>>>>>>>>>>>>>>>> * PattertnFilter; a helper class that encapsulates
>>>>>>>>>>>>>>>>>>>> the Pattern based
>>>>>>>>>>>>>>>>>>>> filtering code previously found in SymbolFilter.
>>>>>>>>>>>>>>>>>>>> This code had to be
>>>>>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>>>>> * Filters; a Context sub-component holding
>>>>>>>>>>>>>>>>>>>> PatternFilters for
>>>>>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>>>>> * DependencyFinder; finds dependencies of the root
>>>>>>>>>>>>>>>>>>>> set of
>>>>>>>>>>>>>>>>>>>> declarations found in the root headers.
>>>>>>>>>>>>>>>>>>>> * ElementFilter; filters out non-library symbol
>>>>>>>>>>>>>>>>>>>> elements based on
>>>>>>>>>>>>>>>>>>>> whether they appear in a root header or are required
>>>>>>>>>>>>>>>>>>>> by something
>>>>>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>>>>> * RootSet; a helper class for managing root
>>>>>>>>>>>>>>>>>>>> headers + required element Trees.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I also encountered some crashes when debug printing
>>>>>>>>>>>>>>>>>>>> Trees, which I've
>>>>>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> FWIW, I tried this out with the Windows registry API
>>>>>>>>>>>>>>>>>>>> and it's working
>>>>>>>>>>>>>>>>>>>> very nicely! Going from well over 100 headers to
>>>>>>>>>>>>>>>>>>>> under 10. Some room
>>>>>>>>>>>>>>>>>>>> for improvement still exists; some empty headers and
>>>>>>>>>>>>>>>>>>>> static forwarders
>>>>>>>>>>>>>>>>>>>> could still be omitted, but imho that should be
>>>>>>>>>>>>>>>>>>>> handled by a separate
>>>>>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> Jorn
More information about the panama-dev
mailing list