[foreign] RFR 8220544: Jextract; library dependent header filtering

Jorn Vernee jbvernee at xs4all.nl
Fri Mar 15 01:26:18 UTC 2019


Here's the updated webrev: 
http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.05/

This includes your patch as well.

Thanks,
Jorn

Jorn Vernee schreef op 2019-03-15 01:03:
> Oh... I thought findEntry was added as part of Panama. I guess the
> current difference in behavior between platforms exists in jdk/jdk as
> well then.
> 
> Your patch works as well, I just thought that the VM code should try
> to offer up uniform behavior, i.e. findEntry should always do a
> non-recursive lookup on every platform. But, If the function is
> pre-existing we should probably avoid changing the behavior.
> 
> I've tried with your patch as well, but 5 tests are failing with
> symbol lookup errors. It seems to be a problem with the default
> library... i.e. we should not do filtering when using the default
> library because the library name will always be "<default>". I'm
> testing a fix.
> 
> Jorn
> 
> Maurizio Cimadamore schreef op 2019-03-15 00:27:
>> Not super sure about this approach - findEntry was there before,
>> aren't we changing the semantics of that function with your changes?
>> 
>> This is in jdk/jdk (not Panama)
>> 
>> http://hg.openjdk.java.net/jdk/jdk/file/8a66c92526cb/src/java.base/share/native/libjava/ClassLoader.c#l448
>> 
>> That will most surely cause JNI link differences here
>> 
>> http://hg.openjdk.java.net/jdk/jdk/file/8a66c92526cb/src/java.base/share/classes/java/lang/ClassLoader.java#l2690
>> 
>> That was the reason why I put the changes inside the lookup() method.
>> 
>> We could make a new lookup method for it, but I'd like first to
>> understand why you think this filtering cannot be done in Java (since
>> the functionality I'm using is supported on all platforms?)
>> 
>> Maurizio
>> 
>> On 14/03/2019 23:00, Jorn Vernee wrote:
>>> FWIW, I've implemented the changes to findEntry based on your patch: 
>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/
>>> 
>>> Jorn
>>> 
>>> Jorn Vernee schreef op 2019-03-14 20:53:
>>>> Sounds good!
>>>> 
>>>> As for the implementation; since the filtering requirement is 
>>>> platform
>>>> specific, it seems better to implement findEntry as a platform
>>>> specific function in 
>>>> java.base/<platform>/native/libjava/jni_util_md.c
>>>> (e.g. [1]) ?
>>>> 
>>>> We already have a function there for findEntryInProcess. We could 
>>>> add
>>>> one for findEntry as well, and then filter on unix.
>>>> 
>>>> Jorn
>>>> 
>>>> [1] :
>>>> http://hg.openjdk.java.net/panama/dev/file/b3831a541894/src/java.base/unix/native/libjava/jni_util_md.c 
>>>> Maurizio Cimadamore schreef op 2019-03-14 20:37:
>>>>> This is my take on the issue (see attached patch).
>>>>> 
>>>>> Basically check path of library symbol against library path itself,
>>>>> and if they don't match do not return the symbol via 
>>>>> Library::lookup.
>>>>> This should provide stable semantics across platforms.
>>>>> 
>>>>> Maurizio
>>>>> 
>>>>> On 14/03/2019 19:10, Maurizio Cimadamore wrote:
>>>>>> We do have soe support in the VM os class:
>>>>>> 
>>>>>> // Locate DLL/DSO. On success, full path of the library is copied 
>>>>>> to
>>>>>>   // buf, and offset is optionally set to be the distance between 
>>>>>> addr
>>>>>>   // and the library's base address. On failure, buf[0] is set to 
>>>>>> '\0'
>>>>>>   // and offset is set to -1 (if offset is non-NULL).
>>>>>>   static bool dll_address_to_library_name(address addr, char* buf,
>>>>>>                                           int buflen, int* 
>>>>>> offset);
>>>>>> 
>>>>>> 
>>>>>> trying to play with it and see what happens. Seems implemented for 
>>>>>> all OSs.
>>>>>> 
>>>>>> Maurizio
>>>>>> 
>>>>>> On 14/03/2019 18:58, Jorn Vernee wrote:
>>>>>>> Here's the Windows prototype I had for iterating the symbols in a 
>>>>>>> library: 
>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/itersyms/webrev.00
>>>>>>> 
>>>>>>> Jorn
>>>>>>> 
>>>>>>> Jorn Vernee schreef op 2019-03-14 19:50:
>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 19:32:
>>>>>>>>> So, I think that if we don't address this in some way, this 
>>>>>>>>> work is
>>>>>>>>> not very effective on Linux/Mac, do yo agree?
>>>>>>>> 
>>>>>>>> Yes, this is unfortunate...
>>>>>>>> 
>>>>>>>>> As to how to address it, since there doesn't seem to be a way 
>>>>>>>>> to
>>>>>>>>> perform a 'local' library search, what if we augmented the 
>>>>>>>>> Symbol
>>>>>>>>> interface to also return a Path (of the shared library the 
>>>>>>>>> symbol
>>>>>>>>> belongs to) ?
>>>>>>>>> 
>>>>>>>>> With that info we should be able to do some more filtering?
>>>>>>>>> 
>>>>>>>>> To get the path, on *NIX we can use dladdr. Not sure if Win has
>>>>>>>>> something similar.
>>>>>>>> 
>>>>>>>> I ran into dladdr as well, but that seems to be a glibc 
>>>>>>>> extension? On
>>>>>>>> Windows we have GetModuleFileName [1], so at least that's 
>>>>>>>> covered.
>>>>>>>> 
>>>>>>>> But, it might be preferable to try and implement the symbol 
>>>>>>>> table
>>>>>>>> crawl on the different operating systems, since that's more 
>>>>>>>> straight
>>>>>>>> towards what we want to achieve. I mean, if we're going the
>>>>>>>> non-portable route any ways... At least on Linux this seems to 
>>>>>>>> be
>>>>>>>> possible as well [2] I want to say that this should be possible 
>>>>>>>> to do
>>>>>>>> on most operating systems, as long as we can load the library 
>>>>>>>> file
>>>>>>>> directly into memory, and know the file format.
>>>>>>>> 
>>>>>>>> If we go that route I would need some help with the other 
>>>>>>>> platforms of course :)
>>>>>>>> 
>>>>>>>> Jorn
>>>>>>>> 
>>>>>>>> [1] :
>>>>>>>> https://docs.microsoft.com/en-us/windows/desktop/api/libloaderapi/nf-libloaderapi-getmodulefilenamea 
>>>>>>>> [2] :
>>>>>>>> https://stackoverflow.com/questions/26960377/how-to-programmatically-list-elf-shared-library-symbols
>>>>>>>>> Maurizio
>>>>>>>>> 
>>>>>>>>> On 14/03/2019 17:41, Maurizio Cimadamore wrote:
>>>>>>>>>> 
>>>>>>>>>> On 14/03/2019 17:36, Jorn Vernee wrote:
>>>>>>>>>>> I think you got it
>>>>>>>>>>> 
>>>>>>>>>>> With dlsym it seems to be recursive: [1] "The search 
>>>>>>>>>>> performed by dlsym() is breadth first through the dependency 
>>>>>>>>>>> tree of these libraries."
>>>>>>>>>>> 
>>>>>>>>>>> But, GetProcAdress on Windows [2] looks at the export table 
>>>>>>>>>>> of the library you give it, so AFAICT this is non-recursive.
>>>>>>>>>> 
>>>>>>>>>> Yep - found the same, you beat me to it :-)
>>>>>>>>>> 
>>>>>>>>>> Maurizio
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Jorn
>>>>>>>>>>> 
>>>>>>>>>>> [1] : https://linux.die.net/man/3/dlsym
>>>>>>>>>>> [2] : 
>>>>>>>>>>> https://docs.microsoft.com/en-us/windows/desktop/api/libloaderapi/nf-libloaderapi-getprocaddress
>>>>>>>>>>> 
>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 18:25:
>>>>>>>>>>>> Nope - with this:
>>>>>>>>>>>> 
>>>>>>>>>>>> $ cat foo.h
>>>>>>>>>>>> #include <string.h>
>>>>>>>>>>>> 
>>>>>>>>>>>> void foo(int f);
>>>>>>>>>>>> 
>>>>>>>>>>>> $ cat foo.c
>>>>>>>>>>>> #include "foo.h"
>>>>>>>>>>>> 
>>>>>>>>>>>> void foo(int f) { }
>>>>>>>>>>>> 
>>>>>>>>>>>> then,
>>>>>>>>>>>> 
>>>>>>>>>>>> $  gcc -shared -o libfoo.so foo.c
>>>>>>>>>>>> 
>>>>>>>>>>>> $ jextract -L . -l foo -d classes foo.h
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> And then
>>>>>>>>>>>> 
>>>>>>>>>>>> classes
>>>>>>>>>>>> ├── clang_support
>>>>>>>>>>>> │   ├── stddef.class
>>>>>>>>>>>> │   ├── stddef_h.class
>>>>>>>>>>>> │   └── stddef$size_t.class
>>>>>>>>>>>> ├── foo.class
>>>>>>>>>>>> ├── foo_h.class
>>>>>>>>>>>> ├── META-INF
>>>>>>>>>>>> │   └── jextract.properties
>>>>>>>>>>>> └── usr
>>>>>>>>>>>>     └── include
>>>>>>>>>>>>         ├── string.class
>>>>>>>>>>>>         ├── string_h.class
>>>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>>>         ├── xlocale_h.class
>>>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>>>         ├── xlocale.class
>>>>>>>>>>>>         └── xlocale.class
>>>>>>>>>>>> 
>>>>>>>>>>>> 4 directories, 13 files
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> I wonder if some of the library lookup functions we have on 
>>>>>>>>>>>> linux
>>>>>>>>>>>> (dlopen and co.) are doing a 'recursive' search into 
>>>>>>>>>>>> dependent libs as
>>>>>>>>>>>> well, which might be disabled on Win.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Maurizio
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On 14/03/2019 16:38, Maurizio Cimadamore wrote:
>>>>>>>>>>>>> Ah ok
>>>>>>>>>>>>> 
>>>>>>>>>>>>> will give that another try
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 14/03/2019 16:16, Jorn Vernee wrote:
>>>>>>>>>>>>>> Sorry, that's with the v3 patch 
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.03/ 
>>>>>>>>>>>>>> I reverted back after the concerns raised about changing 
>>>>>>>>>>>>>> the implementation of TypedefTree::def You're right in 
>>>>>>>>>>>>>> that this incorrectly includes some trees.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Sorry,
>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 17:08:
>>>>>>>>>>>>>>> This produces test.class and test_h.class as expected, 
>>>>>>>>>>>>>>> but nothing else...
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Used command is: `jextract -L . -l test test.h`
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> You would get stuff from string.h when the -l is omitted.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 17:00:
>>>>>>>>>>>>>>>> I think you can reproduce simply by having an header 
>>>>>>>>>>>>>>>> like this:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> #include <string.h>
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> void foo(int i);
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> if you extract that you would expect one symbol, but in 
>>>>>>>>>>>>>>>> reality you
>>>>>>>>>>>>>>>> get also all the stuff from string.h.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On 14/03/2019 15:49, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>> The Windows jextract artifact for the OpenGL sample is 
>>>>>>>>>>>>>>>>> a different beast, because it relies on the standard 
>>>>>>>>>>>>>>>>> Windows OpenGL libraries. So, seeing some system 
>>>>>>>>>>>>>>>>> headers is to be expected :) In the case of python I'm 
>>>>>>>>>>>>>>>>> also seeing some system headers, because python depends 
>>>>>>>>>>>>>>>>> on a hand full of corecrt types, so that's also 
>>>>>>>>>>>>>>>>> correct. But, neither one seems to be referencing 
>>>>>>>>>>>>>>>>> imaxabs, so I'm not sure what's going on...
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> I'll try to see if I can find a repro example on 
>>>>>>>>>>>>>>>>> Windows to see about a fix. I'm not sure I see how the 
>>>>>>>>>>>>>>>>> SymbolFilter could get to the VM libraries? Though I 
>>>>>>>>>>>>>>>>> remember seeing a similar case in the past...
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 15:55:
>>>>>>>>>>>>>>>>>> I think I got at the bottom of this. I tried debugging 
>>>>>>>>>>>>>>>>>> an extraction
>>>>>>>>>>>>>>>>>> of opengl, and it eventually ends up adding inttypes.h 
>>>>>>>>>>>>>>>>>> to the rootset
>>>>>>>>>>>>>>>>>> because of this symbol:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> imaxabs
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The library filter doesn't work here, because this is 
>>>>>>>>>>>>>>>>>> a function
>>>>>>>>>>>>>>>>>> defined in the set of default libraries that comes 
>>>>>>>>>>>>>>>>>> with the VM, so
>>>>>>>>>>>>>>>>>> jextract 'thinks' this symbol is genuinely part of one 
>>>>>>>>>>>>>>>>>> of the
>>>>>>>>>>>>>>>>>> libraries specified in the command line - except is 
>>>>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I tried an hacky fix which checks for absence of 
>>>>>>>>>>>>>>>>>> symbol in
>>>>>>>>>>>>>>>>>> Libraries.getDefaultLibrary - and that seems to give 
>>>>>>>>>>>>>>>>>> the expected
>>>>>>>>>>>>>>>>>> results - but I leave this to your consideration.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On 14/03/2019 14:27, Maurizio Cimadamore wrote:
>>>>>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>>>>>> I gave this a try with our OpenGL and Python 
>>>>>>>>>>>>>>>>>>> examples. The extraction process works fine, and the 
>>>>>>>>>>>>>>>>>>> generated jars are a bit smaller - but I still see a 
>>>>>>>>>>>>>>>>>>> lot of system-related headers. Python has still lots 
>>>>>>>>>>>>>>>>>>> of them, OpenGL less so, but even in the GL case, I 
>>>>>>>>>>>>>>>>>>> think it's interesting, because I can't see 
>>>>>>>>>>>>>>>>>>> references to types mentioned in these headers (nor 
>>>>>>>>>>>>>>>>>>> in the resulting classfiles). Do you know, on top of 
>>>>>>>>>>>>>>>>>>> your head, as to why this could be happening?
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On 14/03/2019 12:45, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.04/ 
>>>>>>>>>>>>>>>>>>>> Changed Treemaker implementation to recurse to the 
>>>>>>>>>>>>>>>>>>>> 'root' type when creating TypdefTree::def, and using 
>>>>>>>>>>>>>>>>>>>> that in DependencyFinder.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 12:49:
>>>>>>>>>>>>>>>>>>>>> Oh ok, sorry. I'll try that out.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 12:43:
>>>>>>>>>>>>>>>>>>>>>> On 14/03/2019 11:40, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>> I see what you mean.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Well, `struct BAR` has a definition cursor, but 
>>>>>>>>>>>>>>>>>>>>>>> `struct BAR*` does not have a 
>>>>>>>>>>>>>>>>>>>>>>> definition/declaration, it's a type that's 
>>>>>>>>>>>>>>>>>>>>>>> automatically derived from `struct BAR`. So for 
>>>>>>>>>>>>>>>>>>>>>>> `struct BAR*` there is no good value for 
>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> I know that there's no good value for struct BAR* 
>>>>>>>>>>>>>>>>>>>>>> - I was wondering if
>>>>>>>>>>>>>>>>>>>>>> the parser should detect the pointer, and recurse 
>>>>>>>>>>>>>>>>>>>>>> down, then store
>>>>>>>>>>>>>>>>>>>>>> 'struct BAR' into 'def'.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def seems to be a helper method 
>>>>>>>>>>>>>>>>>>>>>>> mostly targeted at inline type definitions 
>>>>>>>>>>>>>>>>>>>>>>> (looking at TypedefHandler [1]).
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I think for my use case using 
>>>>>>>>>>>>>>>>>>>>>>> TypedefTree.type().canonicalType() is good 
>>>>>>>>>>>>>>>>>>>>>>> enough, but I guess we could query that once in 
>>>>>>>>>>>>>>>>>>>>>>> the parser and then store it in a field in 
>>>>>>>>>>>>>>>>>>>>>>> TypedefTree?
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> [1] : 
>>>>>>>>>>>>>>>>>>>>>>> http://hg.openjdk.java.net/panama/dev/file/43ed53f40957/src/jdk.jextract/share/classes/com/sun/tools/jextract/TypedefHandler.java#l107
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 11:57:
>>>>>>>>>>>>>>>>>>>>>>>> On 14/03/2019 00:23, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> Sorry, bad example. The actual failure case is:
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>      struct BAR { int x; }
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>      typedef struct BAR* FOO;
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> The TreeMaker code looks at the canonical type 
>>>>>>>>>>>>>>>>>>>>>>>>> of the typedef `struct BAR*` which is neither a 
>>>>>>>>>>>>>>>>>>>>>>>>> type definition or declaration, so 
>>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def is empty. But, in the case of 
>>>>>>>>>>>>>>>>>>>>>>>>> dependecy analysis, we still want to inspect 
>>>>>>>>>>>>>>>>>>>>>>>>> that type, get the pointee type and add it to 
>>>>>>>>>>>>>>>>>>>>>>>>> the root set.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Right - and I think what I'm claiming is that 
>>>>>>>>>>>>>>>>>>>>>>>> this is probably some
>>>>>>>>>>>>>>>>>>>>>>>> kind of a parser bug? That is, if Typedef::def 
>>>>>>>>>>>>>>>>>>>>>>>> (we should probably
>>>>>>>>>>>>>>>>>>>>>>>> find another name, 'def' is not good enough 
>>>>>>>>>>>>>>>>>>>>>>>> IMHO) is set correctly for
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> typedef struct BAR FOO;
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> it should also be set for
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> typedef struct BAR* FOO;
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> That is, I claim that from the typedef tree you 
>>>>>>>>>>>>>>>>>>>>>>>> should always see a
>>>>>>>>>>>>>>>>>>>>>>>> referenced type declaration w/o using the clang 
>>>>>>>>>>>>>>>>>>>>>>>> API.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-14 01:14:
>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-14 
>>>>>>>>>>>>>>>>>>>>>>>>>> 01:04:
>>>>>>>>>>>>>>>>>>>>>>>>>>> So, if I understand correctly, the issue is 
>>>>>>>>>>>>>>>>>>>>>>>>>>> that if you have:
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> typedef struct { int x; } FOO;
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> you want the struct to be added to the root 
>>>>>>>>>>>>>>>>>>>>>>>>>>> set. Which make sense.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Now, with respect to getUnderlyingType - I'm 
>>>>>>>>>>>>>>>>>>>>>>>>>>> a bit at odds; it seems
>>>>>>>>>>>>>>>>>>>>>>>>>>> to me that the jextract tree should have the 
>>>>>>>>>>>>>>>>>>>>>>>>>>> right info, under
>>>>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def, at least that is set by the 
>>>>>>>>>>>>>>>>>>>>>>>>>>> parser for
>>>>>>>>>>>>>>>>>>>>>>>>>>> declarations/definitions.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> That exact case is covered by 
>>>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def, but it doesn't cover a
>>>>>>>>>>>>>>>>>>>>>>>>>> case like this:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>     struct BAR { int x; }
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>     typedef struct BAR FOO;
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> i.e. TypedefTree::def is only set when the 
>>>>>>>>>>>>>>>>>>>>>>>>>> typedef contains an inline
>>>>>>>>>>>>>>>>>>>>>>>>>> definition or a prototype. But, if the type is 
>>>>>>>>>>>>>>>>>>>>>>>>>> just referenced,
>>>>>>>>>>>>>>>>>>>>>>>>>> TypedefTree::def is an empty Optional.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Wouldn't visiting TypedefTree::def (if 
>>>>>>>>>>>>>>>>>>>>>>>>>>> present) be enough (and more in
>>>>>>>>>>>>>>>>>>>>>>>>>>> sync with the rest of jextract) ? After all, 
>>>>>>>>>>>>>>>>>>>>>>>>>>> that's the kind of thing
>>>>>>>>>>>>>>>>>>>>>>>>>>> you wanna be sure not to miss by the 
>>>>>>>>>>>>>>>>>>>>>>>>>>> dependency analysis -
>>>>>>>>>>>>>>>>>>>>>>>>>>> getUnderlyingType seems a tad too noisy?
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> This is what I had before, but it fell short 
>>>>>>>>>>>>>>>>>>>>>>>>>> in some cases.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> In other words, let's say that there's a 
>>>>>>>>>>>>>>>>>>>>>>>>>>> typedef in an header of a
>>>>>>>>>>>>>>>>>>>>>>>>>>> library you want to jextract which is 
>>>>>>>>>>>>>>>>>>>>>>>>>>> related, via a chain of other 42
>>>>>>>>>>>>>>>>>>>>>>>>>>> typedefs to something in some obscure system 
>>>>>>>>>>>>>>>>>>>>>>>>>>> library. Are you sure you
>>>>>>>>>>>>>>>>>>>>>>>>>>> want the resulting artifact to have all the 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 42 intermediate typedefs?
>>>>>>>>>>>>>>>>>>>>>>>>>>> After all, the library you wanna extract has 
>>>>>>>>>>>>>>>>>>>>>>>>>>> just defined 'its
>>>>>>>>>>>>>>>>>>>>>>>>>>> version' of the name for that particular 
>>>>>>>>>>>>>>>>>>>>>>>>>>> construct... so it's very
>>>>>>>>>>>>>>>>>>>>>>>>>>> likely that users of that library will prefer 
>>>>>>>>>>>>>>>>>>>>>>>>>>> that name to the other
>>>>>>>>>>>>>>>>>>>>>>>>>>> 42 ones?
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 42 intermediate typedefs seems a bit 
>>>>>>>>>>>>>>>>>>>>>>>>>> extreme...
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> But, I think you have a point. Putting the 
>>>>>>>>>>>>>>>>>>>>>>>>>> typedef and the canonical
>>>>>>>>>>>>>>>>>>>>>>>>>> type into the root set is probably enough.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 23:31, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Update webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.02/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> This fixes an issue where typedef underlying 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> types were not being added to the root set.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> It turns out I was using the wrong API to 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> get the underlying type, and the needed API 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> didn't have a binding yet :)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> The changes with respect to the previous 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> webrev;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * added 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cursor.getTypedefDeclUnderlyingType() 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> binding
>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * added TypdefTree.underlyingType() method
>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * changed DependencyFinder to use this 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> method instead of typeDefinition()
>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * added test case for the failure case 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> (also simplified test class to only run 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> jextract once and reuse the result)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 22:43:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 21:36, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Good questions. When no filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are specified that is the case
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything is outputted, but when filter 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patterns are specified
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers are only added to the root set if 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they contain a symbol that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FWIW, basically headers with symbols that 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pass both types of filter are added to the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> root set. There's 1.) Filtereing based on 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> whether a symbol is found in a library, 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and 2.) based on the used filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> (--include-symbols & --exclude-symbols). I 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thought it would be better to exclude a 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> header from the root set if all the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> symbols in it were filtered out by the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter patterns.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Yep - I think that sounds sensible
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 22:30:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio Cimadamore schreef op 2019-03-13 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 22:08:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jorn,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this looks very good indeed. I'd like to 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> play with it a bit before we
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> go ahead with this, if that's ok.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Please do!
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, one question: if the user does not 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specify "-l" what happens? I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> guess the root set will then contain all 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the headers and we just get
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> back the full output (which would be ok) 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Good questions. When no filter patterns 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are specified that is the case
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> everything is outputted, but when filter 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patterns are specified
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers are only added to the root set if 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> they contain a symbol that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passes the filter.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maybe we should just fall back to 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 'include-everything' when no "-l"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> options are specified, regardless of 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Maurizio
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 13/03/2019 18:06, Jorn Vernee wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Updated webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.01/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fixed the bug, also added tests & ran 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> all the examples again to try and find 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other bugs (nothing found).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 17:50:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Discovered a bug in this after posting 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> :(
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DependencyFinder also needs to look 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for pointee and array element type
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cursors (which it currently doesn't). 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've tested a working fix, but
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will also add some tests for the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> particular cases. Updated webrev
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> coming soon™
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn Vernee schreef op 2019-03-13 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 17:21:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> After some discussion:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://mail.openjdk.java.net/pipermail/panama-dev/2019-March/004828.html 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I've implemented the discussed 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> approach to header & declaration 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filtering.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bug: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8220544 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Webrev: 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~jvernee/panama/webrevs/8220544/webrev.00/ 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There are a few new implementation 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> classes:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * PattertnFilter; a helper class 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that encapsulates the Pattern based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filtering code previously found in 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SymbolFilter. This code had to be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> reused.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * Filters; a Context sub-component 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> holding PatternFilters for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different types of declarations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * DependencyFinder; finds 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dependencies of the root set of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> declarations found in the root 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * ElementFilter; filters out 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> non-library symbol elements based on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> whether they appear in a root header 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> or are required by something
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> therein.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>   * RootSet; a helper class for 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> managing root headers + required 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> element Trees.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I also encountered some crashes when 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> debug printing Trees, which I've
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fixed/worked around accordingly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FWIW, I tried this out with the 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Windows registry API and it's working
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> very nicely! Going from well over 100 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> headers to under 10. Some room
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for improvement still exists; some 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> empty headers and static forwarders
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could still be omitted, but imho that 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> should be handled by a separate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> patch.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jorn


More information about the panama-dev mailing list