[foreign-jextract] RFR: 8261642: improve jextract source generation mode and remove class generation

Duncan Gittins duncan.gittins at gmail.com
Sun Feb 14 13:34:33 UTC 2021


The new generation does look cleaner and it's very helpful that the 2 
output types are consistent. I've a few small observations from my 
initial tests:

1) When compiling my --source output I noticed that my jars were still 
quite different sizes to jar of .class file output. This is simply 
because jextract uses these javac parameters: "-parameters" and 
"-g:lines". Could you tell me whether the extra meta-data added by 
"-parameters" is needed for improving the runtime speed of code, and 
therefore should be always be added when compiling --source output?

It might help others if jextract had some way to supply -g:none setting 
though it's not important because -g:none can be handled when javac the 
--source output.

2) generator adds C.class / C.java but I think @C annotations was 
removed in --source output by earlier fixes. My app ran fine without 
C.class, can it be removed?

I still think having an option to cut down the symbols will help when 
run against the monster Windows headers, but Eclipse is happy with what 
I get now even though the jars could be x20 smaller.

Kind regards

Duncan

On 12/02/2021 22:33, Maurizio Cimadamore wrote:
> This patch replaces the existing code generators used by jextract by a single, unified, source generation strategy.
>
> As discussed in [1], there are some trade-offs between source generation and classfile generation; for instance, when using classfile translation, we can take advantage of certain features of the JVM bytecode, such as dynamic constants, which makes it easy to generate big files full of constants which are only initialized on a *by-need* basis.
>
> Classfile generation is not without issue though; first, it is almost always easier to deal with sourcefiles rather than classes - both in terms of building environment and also in terms of sharing (e.g. uploading generated bindings on a GitHub project). This is for instance why we have always used source-based bindings for libclang to power jextract. We feel that users will probably be subject to similar limitations.
>
> Secondly, IDEs can sometimes be picky about what's in the classfile - and I have seen cases of IntelliJ failing to decompile classes containing dynamic constants.
>
> Last, but not least, classfiles cannot contain information which would be helpful for the user, such as javadoc. Adding javadoc comments, containing at the very least the original signatures of the C functions/structs would be very helpful when developing applications which use jextract generated code.
>
> For these reasons, I have decided to explore a source-based strategy for jextract. Note that this does not alter the set of options available in the command line, or the default behavior of jextract: that is, jextract will still emit classfiles if you do not specify --sources; but instead of having _two_ different backends (as we do now), this patch replaces both backends with a single source-based backend. We can discuss, as followup changes, as to whether we should just make source generation the default and drop classfile generation, but that's for another day.
>
> The basic idea behind the approach is to stash constants in a separate class; that is, when we need a native method handle, we can generate a class which contains all the constants relative to that native method handle. This allows us to only load the constants that are used for a given native invocation.
>
> One obvious problem of this approach is the sheer number of classes that it generates (although only a subset of them are truly loaded and initialized). To counteract that, this patch adopts several strategies:
>
> 1. first, it reuses a container class, if that's already available, for storing the constants. For instance, if we are generating a struct class, we can just emit the constants inside the struct class, since the constants are private to that struct anyway (we cannot do that trick for functional interfaces, unfortunately, as these are interfaces, and all fields in interfaces must be public - so there's no way to *hide* the constant fields in there - we could do that if Java supported `private static` fields in interfaces, which is probably not so unreasonable to consider).
>
> 2. The new code avoids some duplication which was present in the old approach when it comes to `#define` constants; these had getters both in the header Java API and in the internal constant source files, therefore almost doubling the size of code required to support a single constant. This is now gone in the new scheme; since constants are pretty uniquitous in header files, this move saves quite a bit of footprint.
>
> 3. To further reduce the number of constant holder classes, I added a very simple grouping logic, so that the same holder class can be reused across N different API points (where N can be tweaked using a JDK property). In other words, if you turn up N to a very high value (e.g. 1000) you end up with the same generation scheme we had before, where all constants end up in the same class (or very few different classes). The default is set to 5, which, in my experiments has proven to be a good balance between fast startup (negligible difference when compared to the dynamic constant classfile approach) and artifact footprint (what comes out of jextract has basically the same size as before).
>
> In terms of implementation, the main idea was to double down on the XYZBuilder interface and, at the same time, simplify it. We now have the following abstractions:
>
> * JavaSourceBuilder: basic abstraction which deals with creating java sources, maintains a backing string buffer, etc. It has a main API, which is used by `OutputFactory` to add variables, functions, constants, structs, functional interfaces, typedefs. This probably represents the biggest architectural shift in the code: previously, OutputFactory was responsible for low-level operation, such as starting and ending source generation - now all these details are encapsulated in the builders and OutputFactory deals with a much simplified API (with the result that OutputFactory became a lot simpler in places, such as `visitVariable` and `visitScoped`).
>
> * HeaderFileBuilder: as before, this is the builder that is responsible for holding the contents of an header file. It can be used to add vars (globals), functions, structs, typedefs and functional interfaces.
>
> * NestedClassBuilder: as before, this is a builder for all things that are nested inside the main header file.
>
> * ConstantBuilder: this is a new class which partly plays the role played by the former `ConstantHelper` - it is now a type of `NestedClassBuilder` which has logic to emit constants. Note that JavaSourceBuilder has a method, namely `emitWithConstantClass` which spins a new constant holder on the fly, and allows some code generation to depends on the constants being added to that holder. ConstantBuilder also has a main API for adding method/var handles, segment/address constants, etc. Each of these methods return a `Constant` instance, which has few methods that can be useful when generating code (e.g. to generate the access expression needed to get the constant, or to emit a getter for the constant inside another builder).
>
> * StructBuilder, FunctionalInterfaceBuilder, TypedefBuilder; as before - the only notable difference being that now StructBuilder extends ConstantBuilder, so it can be the holder of its own constants.
>
> I think this covers the main things that have been touched by this patch - feel free to ask questions, in case I missed something!
>
> Cheers
>
>
> [1] - https://mail.openjdk.java.net/pipermail/panama-dev/2021-January/011887.html
>
> -------------
>
> Commit messages:
>   - Further consolidate constant generation code
>   - Consolidate generation of constant code
>   - Merge branch 'foreign-jextract' into jextract-clean-source-gen
>   - Add JDK parameter to configure number of constants per synthetic class
>   - Encapsulate logic for handling nested anon structs
>   - Fix (pre-existing) issue with iteration order of prefixes in StructBuilder
>   - Add try finally before popping struct names
>   - Fix issues with cross-split typedefs
>   - Fully encapsulate header splitting
>   - Preserve public API by not using functional interfaces as constant holders (this "leaks" constant fields outside).
>   - ... and 15 more: https://git.openjdk.java.net/panama-foreign/compare/bbf0f35c...a902cfdc
>
> Changes: https://git.openjdk.java.net/panama-foreign/pull/450/files
>   Webrev: https://webrevs.openjdk.java.net/?repo=panama-foreign&pr=450&range=00
>    Issue: https://bugs.openjdk.java.net/browse/JDK-8261642
>    Stats: 2765 lines in 16 files changed: 774 ins; 1578 del; 413 mod
>    Patch: https://git.openjdk.java.net/panama-foreign/pull/450.diff
>    Fetch: git fetch https://git.openjdk.java.net/panama-foreign pull/450/head:pull/450
>
> PR: https://git.openjdk.java.net/panama-foreign/pull/450




More information about the panama-dev mailing list