[foreign] RFR 8220544: Jextract; library dependent header filtering

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Mar 14 23:27:24 UTC 2019


Not super sure about this approach - findEntry was there before, aren't 
we changing the semantics of that function with your changes?

This is in jdk/jdk (not Panama)

http://hg.openjdk.java.net/jdk/jdk/file/8a66c92526cb/src/java.base/share/native/libjava/ClassLoader.c#l448

That will most surely cause JNI link differences here

http://hg.openjdk.java.net/jdk/jdk/file/8a66c92526cb/src/java.base/share/classes/java/lang/ClassLoader.java#l2690

That was the reason why I put the changes inside the lookup() method.

We could make a new lookup method for it, but I'd like first to 
understand why you think this filtering cannot be done in Java (since 
the functionality I'm using is supported on all platforms?)

Maurizio

On 14/03/2019 23:00, Jorn Vernee wrote:
> FWIW, I've implemented the changes to findEntry based on your patch: 
> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>
> Jorn
>
> Jorn Vernee schreef op 2019-03-14 20:53:
>> Sounds good!
>>
>> As for the implementation; since the filtering requirement is platform
>> specific, it seems better to implement findEntry as a platform
>> specific function in java.base/<platform>/native/libjava/jni_util_md.c
>> (e.g. [1]) ?
>>
>> We already have a function there for findEntryInProcess. We could add
>> one for findEntry as well, and then filter on unix.
>>
>> Jorn
>>
>> [1] :
>> http://hg.openjdk.java.net/panama/dev/file/b3831a541894/src/java.base/unix/native/libjava/jni_util_md.c 
>>
>>
>> Maurizio Cimadamore schreef op 2019-03-14 20:37:
>>> This is my take on the issue (see attached patch).
>>>
>>> Basically check path of library symbol against library path itself,
>>> and if they don't match do not return the symbol via Library::lookup.
>>> This should provide stable semantics across platforms.
>>>
>>> Maurizio
>>>
>>> On 14/03/2019 19:10, Maurizio Cimadamore wrote:
>>>> We do have soe support in the VM os class:
>>>>
>>>> // Locate DLL/DSO. On success, full path of the library is copied to
>>>>   // buf, and offset is optionally set to be the distance between addr
>>>>   // and the library's base address. On failure, buf[0] is set to '\0'
>>>>   // and offset is set to -1 (if offset is non-NULL).
>>>>   static bool dll_address_to_library_name(address addr, char* buf,
>>>>                                           int buflen, int* offset);
>>>>
>>>>
>>>> trying to play with it and see what happens. Seems implemented for 
>>>> all OSs.
>>>>
>>>> Maurizio
>>>>
>>>> On 14/03/2019 18:58, Jorn Vernee wrote:
>>>>> Here's the Windows prototype I had for iterating the symbols in a 
>>>>> library: 
>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/itersyms/webrev.00
>>>>>
>>>>> Jorn
>>>>>
>>>>> Jorn Vernee schreef op 2019-03-14 19:50:
>>>>>> Maurizio Cimadamore schreef op 2019-03-14 19:32:
>>>>>>> So, I think that if we don't address this in some way, this work is
>>>>>>> not very effective on Linux/Mac, do yo agree?
>>>>>>
>>>>>> Yes, this is unfortunate...
>>>>>>
>>>>>>> As to how to address it, since there doesn't seem to be a way to
>>>>>>> perform a 'local' library search, what if we augmented the Symbol
>>>>>>> interface to also return a Path (of the shared library the symbol
>>>>>>> belongs to) ?
>>>>>>>
>>>>>>> With that info we should be able to do some more filtering?
>>>>>>>
>>>>>>> To get the path, on *NIX we can use dladdr. Not sure if Win has
>>>>>>> something similar.
>>>>>>
>>>>>> I ran into dladdr as well, but that seems to be a glibc 
>>>>>> extension? On
>>>>>> Windows we have GetModuleFileName [1], so at least that's covered.
>>>>>>
>>>>>> But, it might be preferable to try and implement the symbol table
>>>>>> crawl on the different operating systems, since that's more straight
>>>>>> towards what we want to achieve. I mean, if we're going the
>>>>>> non-portable route any ways... At least on Linux this seems to be
>>>>>> possible as well [2] I want to say that this should be possible 
>>>>>> to do
>>>>>> on most operating systems, as long as we can load the library file
>>>>>> directly into memory, and know the file format.
>>>>>>
>>>>>> If we go that route I would need some help with the other 
>>>>>> platforms of course :)
>>>>>>
>>>>>> Jorn
>>>>>>
>>>>>> [1] :
>>>>>> https://docs.microsoft.com/en-us/windows/desktop/api/libloaderapi/nf-libloaderapi-getmodulefilenamea 
>>>>>> [2] :
>>>>>> https://stackoverflow.com/questions/26960377/how-to-programmatically-list-elf-shared-library-symbols 
>>>>>>
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 14/03/2019 17:41, Maurizio Cimadamore wrote:
>>>>>>>>
>>>>>>>> On 14/03/2019 17:36, Jorn Vernee wrote:
>>>>>>>>> I think you got it
>>>>>>>>>
>>>>>>>>> With dlsym it seems to be recursive: [1] "The search performed 
>>>>>>>>> by dlsym() is breadth first through the dependency tree of 
>>>>>>>>> these libraries."
>>>>>>>>>
>>>>>>>>> But, GetProcAdress on Windows [2] looks at the export table of 
>>>>>>>>> the library you give it, so AFAICT this is non-recursive.
>>>>>>>>
>>>>>>>> Yep - found the same, you beat me to it :-)
>>>>>>>>
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Jorn
>>>>>>>>>
>>>>>>>>> [1] : https://linux.die.net/man/3/dlsym
>>>>>>>>> [2] : 
>>>>>>>>> https://docs.microsoft.com/en-us/windows/desktop/api/libloaderapi/nf-libloaderapi-getprocaddress
>>>>>>>>>
>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 18:25:
>>>>>>>>>> Nope - with this:
>>>>>>>>>>
>>>>>>>>>> $ cat foo.h
>>>>>>>>>> #include <string.h>
>>>>>>>>>>
>>>>>>>>>> void foo(int f);
>>>>>>>>>>
>>>>>>>>>> $ cat foo.c
>>>>>>>>>> #include "foo.h"
>>>>>>>>>>
>>>>>>>>>> void foo(int f) { }
>>>>>>>>>>
>>>>>>>>>> then,
>>>>>>>>>>
>>>>>>>>>> $  gcc -shared -o libfoo.so foo.c
>>>>>>>>>>
>>>>>>>>>> $ jextract -L . -l foo -d classes foo.h
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> And then
>>>>>>>>>>
>>>>>>>>>> classes
>>>>>>>>>> ├── clang_support
>>>>>>>>>> │   ├── stddef.class
>>>>>>>>>> │   ├── stddef_h.class
>>>>>>>>>> │   └── stddef$size_t.class
>>>>>>>>>> ├── foo.class
>>>>>>>>>> ├── foo_h.class
>>>>>>>>>> ├── META-INF
>>>>>>>>>> │   └── jextract.properties
>>>>>>>>>> └── usr
>>>>>>>>>>     └── include
>>>>>>>>>>         ├── string.class
>>>>>>>>>>         ├── string_h.class
>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>         ├── xlocale_h.class
>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>         └── xlocale.class
>>>>>>>>>>
>>>>>>>>>> 4 directories, 13 files
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I wonder if some of the library lookup functions we have on 
>>>>>>>>>> linux
>>>>>>>>>> (dlopen and co.) are doing a 'recursive' search into 
>>>>>>>>>> dependent libs as
>>>>>>>>>> well, which might be disabled on Win.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Maurizio
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 14/03/2019 16:38, Maurizio Cimadamore wrote:
>>>>>>>>>>> Ah ok
>>>>>>>>>>>
>>>>>>>>>>> will give that another try
>>>>>>>>>>>
>>>>>>>>>>> Maurizio
>>>>>>>>>>>
>>>>>>>>>>> On 14/03/2019 16:16, Jorn Vernee wrote:
>>>>>>>>>>>> Sorry, that's with the v3 patch 
>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.03/ 
>>>>>>>>>>>> I reverted back after the concerns raised about changing 
>>>>>>>>>>>> the implementation of TypedefTree::def You're right in that 
>>>>>>>>>>>> this incorrectly includes some trees.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry,
>>>>>>>>>>>> Jorn
>>>>>>>>>>>>
>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 17:08:
>>>>>>>>>>>>> This produces test.class and test_h.class as expected, but 
>>>>>>>>>>>>> nothing else...
>>>>>>>>>>>>>
>>>>>>>>>>>>> Used command is: `jextract -L . -l test test.h`
>>>>>>>>>>>>>
>>>>>>>>>>>>> You would get stuff from string.h when the -l is omitted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 17:00:
>>>>>>>>>>>>>> I think you can reproduce simply by having an header like 
>>>>>>>>>>>>>> this:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #include <string.h>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> void foo(int i);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> if you extract that you would expect one symbol, but in 
>>>>>>>>>>>>>> reality you
>>>>>>>>>>>>>> get also all the stuff from string.h.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 14/03/2019 15:49, Jorn Vernee wrote:
>>>>>>>>>>>>>>> The Windows jextract artifact for the OpenGL sample is a 
>>>>>>>>>>>>>>> different beast, because it relies on the standard 
>>>>>>>>>>>>>>> Windows OpenGL libraries. So, seeing some system headers 
>>>>>>>>>>>>>>> is to be expected :) In the case of python I'm also 
>>>>>>>>>>>>>>> seeing some system headers, because python depends on a 
>>>>>>>>>>>>>>> hand full of corecrt types, so that's also correct. But, 
>>>>>>>>>>>>>>> neither one seems to be referencing imaxabs, so I'm not 
>>>>>>>>>>>>>>> sure what's going on...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'll try to see if I can find a repro example on Windows 
>>>>>>>>>>>>>>> to see about a fix. I'm not sure I see how the 
>>>>>>>>>>>>>>> SymbolFilter could get to the VM libraries? Though I 
>>>>>>>>>>>>>>> remember seeing a similar case in the past...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 15:55:
>>>>>>>>>>>>>>>> I think I got at the bottom of this. I tried debugging 
>>>>>>>>>>>>>>>> an extraction
>>>>>>>>>>>>>>>> of opengl, and it eventually ends up adding inttypes.h 
>>>>>>>>>>>>>>>> to the rootset
>>>>>>>>>>>>>>>> because of this symbol:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> imaxabs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The library filter doesn't work here, because this is a 
>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>> defined in the set of default libraries that comes with 
>>>>>>>>>>>>>>>> the VM, so
>>>>>>>>>>>>>>>> jextract 'thinks' this symbol is genuinely part of one 
>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>> libraries specified in the command line - except is not.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I tried an hacky fix which checks for absence of symbol in
>>>>>>>>>>>>>>>> Libraries.getDefaultLibrary - and that seems to give 
>>>>>>>>>>>>>>>> the expected
>>>>>>>>>>>>>>>> results - but I leave this to your consideration.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 14/03/2019 14:27, Maurizio Cimadamore wrote:
>>>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>>>> I gave this a try with our OpenGL and Python examples. 
>>>>>>>>>>>>>>>>> The extraction process works fine, and the generated 
>>>>>>>>>>>>>>>>> jars are a bit smaller - but I still see a lot of 
>>>>>>>>>>>>>>>>> system-related headers. Python has still lots of them, 
>>>>>>>>>>>>>>>>> OpenGL less so, but even in the GL case, I think it's 
>>>>>>>>>>>>>>>>> interesting, because I can't see references to types 
>>>>>>>>>>>>>>>>> mentioned in these headers (nor in the resulting 
>>>>>>>>>>>>>>>>> classfiles). Do you know, on top of your head, as to 
>>>>>>>>>>>>>>>>> why this could be happening?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 14/03/2019 12:45, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/ 
>>>>>>>>>>>>>>>>>> Changed Treemaker implementation to recurse to the 
>>>>>>>>>>>>>>>>>> 'root' type when creating TypdefTree::def, and using 
>>>>>>>>>>>>>>>>>> that in DependencyFinder.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 12:49:
>>>>>>>>>>>>>>>>>>> Oh ok, sorry. I'll try that out.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>>>>>>>>>>>>>>>>>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>> I see what you mean.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Well, `struct BAR` has a definition cursor, but 
>>>>>>>>>>>>>>>>>>>>> `struct BAR*` does not have a 
>>>>>>>>>>>>>>>>>>>>> definition/declaration, it's a type that's 
>>>>>>>>>>>>>>>>>>>>> automatically derived from `struct BAR`. So for 
>>>>>>>>>>>>>>>>>>>>> `struct BAR*` there is no good value for 
>>>>>>>>>>>>>>>>>>>>> TypedefTree::def.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I know that there's no good value for struct BAR* - 
>>>>>>>>>>>>>>>>>>>> I was wondering if
>>>>>>>>>>>>>>>>>>>> the parser should detect the pointer, and recurse 
>>>>>>>>>>>>>>>>>>>> down, then store
>>>>>>>>>>>>>>>>>>>> 'struct BAR' into 'def'.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> TypedefTree::def seems to be a helper method 
>>>>>>>>>>>>>>>>>>>>> mostly targeted at inline type definitions 
>>>>>>>>>>>>>>>>>>>>> (looking at TypedefHandler [1]).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think for my use case using 
>>>>>>>>>>>>>>>>>>>>> TypedefTree.type().canonicalType() is good enough, 
>>>>>>>>>>>>>>>>>>>>> but I guess we could query that once in the parser 
>>>>>>>>>>>>>>>>>>>>> and then store it in a field in TypedefTree?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1] : 
>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>>>>>>>>>>>>>>>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>      struct BAR { int x; }
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>      typedef struct BAR* FOO;
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The TreeMaker code looks at the canonical type 
>>>>>>>>>>>>>>>>>>>>>>> of the typedef `struct BAR*` which is neither a 
>>>>>>>>>>>>>>>>>>>>>>> type definition or declaration, so 
>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def is empty. But, in the case of 
>>>>>>>>>>>>>>>>>>>>>>> dependecy analysis, we still want to inspect 
>>>>>>>>>>>>>>>>>>>>>>> that type, get the pointee type and add it to 
>>>>>>>>>>>>>>>>>>>>>>> the root set.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Right - and I think what I'm claiming is that 
>>>>>>>>>>>>>>>>>>>>>> this is probably some
>>>>>>>>>>>>>>>>>>>>>> kind of a parser bug? That is, if Typedef::def 
>>>>>>>>>>>>>>>>>>>>>> (we should probably
>>>>>>>>>>>>>>>>>>>>>> find another name, 'def' is not good enough IMHO) 
>>>>>>>>>>>>>>>>>>>>>> is set correctly for
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> typedef struct BAR FOO;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> it should also be set for
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> typedef struct BAR* FOO;
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> That is, I claim that from the typedef tree you 
>>>>>>>>>>>>>>>>>>>>>> should always see a
>>>>>>>>>>>>>>>>>>>>>> referenced type declaration w/o using the clang API.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 01:04:
>>>>>>>>>>>>>>>>>>>>>>>>> So, if I understand correctly, the issue is 
>>>>>>>>>>>>>>>>>>>>>>>>> that if you have:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> you want the struct to be added to the root 
>>>>>>>>>>>>>>>>>>>>>>>>> set. Which make sense.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Now, with respect to getUnderlyingType - I'm a 
>>>>>>>>>>>>>>>>>>>>>>>>> bit at odds; it seems
>>>>>>>>>>>>>>>>>>>>>>>>> to me that the jextract tree should have the 
>>>>>>>>>>>>>>>>>>>>>>>>> right info, under
>>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def, at least that is set by the 
>>>>>>>>>>>>>>>>>>>>>>>>> parser for
>>>>>>>>>>>>>>>>>>>>>>>>> declarations/definitions.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> That exact case is covered by TypedefTree::def, 
>>>>>>>>>>>>>>>>>>>>>>>> but it doesn't cover a
>>>>>>>>>>>>>>>>>>>>>>>> case like this:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>     struct BAR { int x; }
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>     typedef struct BAR FOO;
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> i.e. TypedefTree::def is only set when the 
>>>>>>>>>>>>>>>>>>>>>>>> typedef contains an inline
>>>>>>>>>>>>>>>>>>>>>>>> definition or a prototype. But, if the type is 
>>>>>>>>>>>>>>>>>>>>>>>> just referenced,
>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Wouldn't visiting TypedefTree::def (if 
>>>>>>>>>>>>>>>>>>>>>>>>> present) be enough (and more in
>>>>>>>>>>>>>>>>>>>>>>>>> sync with the rest of jextract) ? After all, 
>>>>>>>>>>>>>>>>>>>>>>>>> that's the kind of thing
>>>>>>>>>>>>>>>>>>>>>>>>> you wanna be sure not to miss by the 
>>>>>>>>>>>>>>>>>>>>>>>>> dependency analysis -
>>>>>>>>>>>>>>>>>>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> This is what I had before, but it fell short in 
>>>>>>>>>>>>>>>>>>>>>>>> some cases.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> In other words, let's say that there's a 
>>>>>>>>>>>>>>>>>>>>>>>>> typedef in an header of a
>>>>>>>>>>>>>>>>>>>>>>>>> library you want to jextract which is related, 
>>>>>>>>>>>>>>>>>>>>>>>>> via a chain of other 42
>>>>>>>>>>>>>>>>>>>>>>>>> typedefs to something in some obscure system 
>>>>>>>>>>>>>>>>>>>>>>>>> library. Are you sure you
>>>>>>>>>>>>>>>>>>>>>>>>> want the resulting artifact to have all the 42 
>>>>>>>>>>>>>>>>>>>>>>>>> intermediate typedefs?
>>>>>>>>>>>>>>>>>>>>>>>>> After all, the library you wanna extract has 
>>>>>>>>>>>>>>>>>>>>>>>>> just defined 'its
>>>>>>>>>>>>>>>>>>>>>>>>> version' of the name for that particular 
>>>>>>>>>>>>>>>>>>>>>>>>> construct... so it's very
>>>>>>>>>>>>>>>>>>>>>>>>> likely that users of that library will prefer 
>>>>>>>>>>>>>>>>>>>>>>>>> that name to the other
>>>>>>>>>>>>>>>>>>>>>>>>> 42 ones?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> 42 intermediate typedefs seems a bit extreme...
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> But, I think you have a point. Putting the 
>>>>>>>>>>>>>>>>>>>>>>>> typedef and the canonical
>>>>>>>>>>>>>>>>>>>>>>>> type into the root set is probably enough.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> Update webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/ 
>>>>>>>>>>>>>>>>>>>>>>>>>> This fixes an issue where typedef underlying 
>>>>>>>>>>>>>>>>>>>>>>>>>> types were not being added to the root set.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> It turns out I was using the wrong API to get 
>>>>>>>>>>>>>>>>>>>>>>>>>> the underlying type, and the needed API 
>>>>>>>>>>>>>>>>>>>>>>>>>> didn't have a binding yet :)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> The changes with respect to the previous webrev;
>>>>>>>>>>>>>>>>>>>>>>>>>>   * added 
>>>>>>>>>>>>>>>>>>>>>>>>>> Cursor.getTypedefDeclUnderlyingType() binding
>>>>>>>>>>>>>>>>>>>>>>>>>>   * added TypdefTree.underlyingType() method
>>>>>>>>>>>>>>>>>>>>>>>>>>   * changed DependencyFinder to use this 
>>>>>>>>>>>>>>>>>>>>>>>>>> method instead of typeDefinition()
>>>>>>>>>>>>>>>>>>>>>>>>>>   * added test case for the failure case 
>>>>>>>>>>>>>>>>>>>>>>>>>> (also simplified test class to only run 
>>>>>>>>>>>>>>>>>>>>>>>>>> jextract once and reuse the result)
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 22:43:
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Good questions. When no filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are specified that is the case
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything is outputted, but when filter 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patterns are specified
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers are only added to the root set if 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they contain a symbol that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> FWIW, basically headers with symbols that 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> pass both types of filter are added to the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> root set. There's 1.) Filtereing based on 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> whether a symbol is found in a library, and 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2.) based on the used filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (--include-symbols & --exclude-symbols). I 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> thought it would be better to exclude a 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> header from the root set if all the symbols 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> in it were filtered out by the filter 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> patterns.
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 22:08:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this looks very good indeed. I'd like to 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> play with it a bit before we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please do!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, one question: if the user does not 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specify "-l" what happens? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> guess the root set will then contain all 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the headers and we just get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> back the full output (which would be ok) ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Good questions. When no filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are specified that is the case
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything is outputted, but when filter 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patterns are specified
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers are only added to the root set if 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they contain a symbol that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maybe we should just fall back to 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'include-everything' when no "-l"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> options are specified, regardless of filter?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fixed the bug, also added tests & ran 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> all the examples again to try and find 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other bugs (nothing found).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:50:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Discovered a bug in this after posting :(
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DependencyFinder also needs to look for 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pointee and array element type
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cursors (which it currently doesn't). 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've tested a working fix, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will also add some tests for the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> particular cases. Updated webrev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 17:21:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've implemented the discussed 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach to header & declaration 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filtering.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bug: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220544 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There are a few new implementation 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> classes:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * PattertnFilter; a helper class 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that encapsulates the Pattern based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filtering code previously found in 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SymbolFilter. This code had to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * Filters; a Context sub-component 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> holding PatternFilters for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * DependencyFinder; finds 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dependencies of the root set of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> declarations found in the root headers.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * ElementFilter; filters out 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-library symbol elements based on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> whether they appear in a root header 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> or are required by something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * RootSet; a helper class for 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> managing root headers + required 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> element Trees.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I also encountered some crashes when 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> debug printing Trees, which I've
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FWIW, I tried this out with the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Windows registry API and it's working
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> very nicely! Going from well over 100 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers to under 10. Some room
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for improvement still exists; some 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> empty headers and static forwarders
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still be omitted, but imho that 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be handled by a separate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn


More information about the panama-dev mailing list