[foreign] RFR 8215502: Jextract output should be self-contained

Tue Dec 18 17:59:27 UTC 2018

Ok, I found the problem.

Before running MissingSymbolTest jextract is ran to generate the missing 
class;

      * @run driver JtregJextract -l Missing -L $(test.nativepath) -t 
test.jextract.missing -- missing.h

When running the command manually I get 2 warnings:

     WARNING: symbol square is not found in any library
     WARNING: symbol cube is not found in any library

Since I'm on Windows, `square` is also missing from the DLL since I 
never added __declspec(dllexport) to it. This is only a problem now 
since jextract stopped generating classes if symbols are missing I 
think?

Adding __declspec(dllexport) fixes the test failure. So it was a problem 
with platform after all.

Cheers,
Jorn

Jorn Vernee schreef op 2018-12-18 18:35:
> Never mind, in that case actual file paths seem to be required, so the
> usage of File.separatorChar is correct.
> 
> I'll try and find out what's happening that's making the test fail.
> 
> Jorn
> 
> Jorn Vernee schreef op 2018-12-18 18:24:
>> No, but running from a clean build yields the same result.
>> 
>> If this is only on my end I think Maurizio is right that this is a
>> problem with platform/cygwin.
>> 
>> I ran into a problem with the Runner test as well, since it was using
>> File.separatorChar to go from binary to internal class names and back.
>> It looks like the new Writer class is doing this as well, but this
>> gives incorrect behaviour on Windows, since the separator char there
>> is '\', but afaik binary class names should always use '/'.
>> 
>> I'll try and see if fixing this will get the test to pass.
>> 
>> Jorn
>> 
>> Sundararajan Athijegannathan schreef op 2018-12-18 17:48:
>>> Was that from a clean build?
>>> 
>>> -Sundar
>>> 
>>> On 18/12/18, 10:00 PM, Jorn Vernee wrote:
>>>> Looks like com/sun/tools/jextract/missing/MissingSymbolTest.java is 
>>>> failing after this patch?
>>>> 
>>>> It looks like it no longer generates the class that the test is 
>>>> testing for: 
>>>> http://cr.openjdk.java.net/~jvernee/panama/test-results/MissingSymbolTest.jtr
>>>> 
>>>> Jorn
>>>> 
>>>> Maurizio Cimadamore schreef op 2018-12-17 19:44:
>>>>> Hi,
>>>>> following external and internal feedback on jextract usability, we
>>>>> have been working hard on a partial jextract rewrite which
>>>>> significantly simplifies how jextract works and how the various 
>>>>> pieces
>>>>> are connected together. The idea is that each extraction run should
>>>>> generate a self-contained artifact; this is _not_ what is happening
>>>>> now, as jextract will only extract things that belong to the header
>>>>> being extracted. This means that if the header depends on symbols
>>>>> defined in some other included header, in a different folder, then 
>>>>> the
>>>>> resulting jar artifact will contain dandling refs and will fail to
>>>>> bind. To overcome this issues user have in the past been forced to
>>>>> 'chase headers' around, and add them to the jextract command line,
>>>>> which is a very frustrating (and system-dependent) process.
>>>>> 
>>>>> Also, few months ago we embarked on a project to parse info 
>>>>> obtained
>>>>> from clang API into an internal representation (see [1]). But we've
>>>>> only gone half-way with this - and the code is in a place where
>>>>> sometimes the tree is used, sometimes the clang types are used, 
>>>>> which
>>>>> can be very unpredictable and hard to follow. This patch fixes all
>>>>> that and makes trees the single source of truth; this means that 
>>>>> some
>>>>> issues in tree parsing had to be rectified, to avoid some common
>>>>> issues (see end of this email for a list).
>>>>> 
>>>>> Here's the webrev:
>>>>> 
>>>>> https://cr.openjdk.java.net/~mcimadamore/panama/8215502/
>>>>> 
>>>>> (Not sure if it's better to look at the diffs or to just look at 
>>>>> the new code)
>>>>> 
>>>>> As it can be seen, classes are much smaller now, each doing a 
>>>>> fairly
>>>>> simple, focussed task:
>>>>> 
>>>>> * Main: this is where we process command line args; we check them 
>>>>> and
>>>>> we set up a Context
>>>>> 
>>>>> * Context: this is where we hold the shared state - mostly holder 
>>>>> for
>>>>> command line args (after validation)
>>>>> 
>>>>> * JextractTool: this is the main jextract tool frontend - it has a
>>>>> method called 'processHeaders' which does all the job. This work is
>>>>> divided in the following steps:
>>>>> 
>>>>>    - parse all headers into trees
>>>>>    - run a bunch of visitors onto the trees (symbol filter, typedef
>>>>> handler, etc.)
>>>>>    - extract all decls from header trees
>>>>>    - get rid of duplicate decls
>>>>>    - group decls by header file
>>>>>    - for each headerfile
>>>>>           + enter types into dictionary (TypeEnter)
>>>>>           + generate bytecode using appropriate code factory
>>>>>    - return a Writer object
>>>>> 
>>>>> * Writer: helper object used to write classes/jars
>>>>> 
>>>>> * HeaderResolver: helper object which keeps track of path/package
>>>>> mappings, and which, given a path, returns the appropriate header 
>>>>> file
>>>>> object. Note: before this patch, all paths were flattened - e.g. if 
>>>>> a
>>>>> header was found under /usr/include/linux/foo.h, jextract would 
>>>>> emit a
>>>>> classfile called usr_include_foo.class. The new patch changes this
>>>>> behavior and generates proper nested package names, so that the 
>>>>> header
>>>>> class will have now name "foo" and package "usr.include.linux".
>>>>> 
>>>>> * TypeDictionary: helper object which, with the help of the
>>>>> HeaderResolver, allows to go from clang types to Java types
>>>>> 
>>>>> 
>>>>> Overall, this feels like a big improvement in terms of how easy it 
>>>>> is
>>>>> to tweak the behavior of jextract - each of the main abstraction is
>>>>> simple enough that it is relatively easy to tweak things to get a
>>>>> different result, which is a huge step forward compared to where 
>>>>> the
>>>>> code is ATM (at least in my opinion, which could be biased of 
>>>>> course
>>>>> :-)). Following this change, we will probably do two followup 
>>>>> changes:
>>>>> 
>>>>> I think the new logic is very deterministic, and easy to follow; 
>>>>> there
>>>>> are two items where this is still suboptimal:
>>>>> 
>>>>> 1) we allow multiple headerfiles as input, and this sometimes 
>>>>> doesn't
>>>>> play well with the parser; we have discussed many times how clang 
>>>>> API
>>>>> behaves better with single input which includes everything. It 
>>>>> would
>>>>> probably be better to restrict jextract to work on a single header
>>>>> input file only (which, thanks to the new resolution
>>>>> logic/self-contained model, now works all the times).
>>>>> 
>>>>> 2) related to the above, as we discussed, the -t option doesn't 
>>>>> make
>>>>> a  lot of sense when multiple input headers are involved. I think 
>>>>> it
>>>>> would be best again to restrict input to be just one headerfile (as
>>>>> per 1), and also transform -t into a proper 'package prefix' option
>>>>> instead, which adds given prefix to all package names generated by
>>>>> jextract. This is very useful to package up self-contained 
>>>>> artifacts
>>>>> in way that do no clash when loaded by same classloader.
>>>>> 
>>>>> As for the issues in tree parsing that were addressed as part of 
>>>>> this
>>>>> patch, here's a list:
>>>>> 
>>>>> * StructTree::nestedTypes is generating a transitive closure of all
>>>>> nested types; this doesn't play well with recursive visitors, as it
>>>>> leads trees to be visited multiple times; I changed the code to 
>>>>> expand
>>>>> only the innermost level of nested types (this affects the
>>>>> EmptyNameHandler visitor)
>>>>> 
>>>>> * Tree::equals/hashcode - this code is doing the comparison using
>>>>> Cursor::equals, which is good, but doesn't work at all across
>>>>> compilation units (e.g. when multiple headers are specified). For 
>>>>> this
>>>>> reason I changed the comparison code to rely on source location 
>>>>> info.
>>>>> 
>>>>> * TypedefHandler has some logic to exclude multiple declaration w/o
>>>>> definition, where the definition occurs elsewhere in the tree. But
>>>>> this logic doesn't visit structs recursively and, as a result, it
>>>>> fails to prune duplicates nested within structs.
>>>>> 
>>>>> P.S.
>>>>> A big thank to Sundar for all the help troubleshooting failing 
>>>>> tests
>>>>> on Mac, and for updating some of the samples instruction in our
>>>>> markdown file :-)
>>>>> 
>>>>> Cheers
>>>>> Maurizio
>>>>> 
>>>>> [1] - 
>>>>> http://mail.openjdk.java.net/pipermail/panama-dev/2018-August/002564.html