[foreign-jextract] RFR: 8260976: investigate ways to filter jextract output
Maurizio Cimadamore
mcimadamore at openjdk.java.net
Fri Mar 12 19:01:20 UTC 2021
On Fri, 12 Mar 2021 15:24:01 GMT, Jorn Vernee <jvernee at openjdk.org> wrote:
>> This patch adds new filtering capabilities to jextract, along the lines of what was discussed in [1] (thanks Duncan for all the insightful comments on the topic!).
>>
>> The patch adds some new options:
>>
>> * `--include-{union, struct, var, function, typedef} = <symbol-name>`
>>
>> When one (or more) options like these are specified, jextract goes into a kind of advanced filtering mode, and will only emit bindings for the specified symbols. Note: when no `--include-xyz` option is used, jextract behaves exactly like before - i.e. everything is extracted.
>>
>> To help the user with the filtering process, another option has been added:
>>
>> * `--dump-includes <file-name>`
>>
>> When this option is used, jextract will not emit any bindings - instead, it will dump all the included symbols to the specified file. The file is a simple text file, organized as follows (the following has been obtained by extracting OpenGL):
>>
>> #### Extracted from: /usr/include/KHR/khrplatform.h
>>
>> --include-macro KHRONOS_BOOLEAN_ENUM_FORCE_SIZE # header: /usr/include/KHR/khrplatform.h
>> --include-macro KHRONOS_FALSE # header: /usr/include/KHR/khrplatform.h
>> --include-macro KHRONOS_MAX_ENUM # header: /usr/include/KHR/khrplatform.h
>> --include-macro KHRONOS_SUPPORT_FLOAT # header: /usr/include/KHR/khrplatform.h
>> --include-macro KHRONOS_SUPPORT_INT64 # header: /usr/include/KHR/khrplatform.h
>> --include-macro KHRONOS_TRUE # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_boolean_enum_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_float_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_int16_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_int32_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_int64_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_int8_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_intptr_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_ssize_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_stime_nanoseconds_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_uint16_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_uint32_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_uint64_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_uint8_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_uintptr_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_usize_t # header: /usr/include/KHR/khrplatform.h
>> --include-typedef khronos_utime_nanoseconds_t # header: /usr/include/KHR/khrplatform.h
>>
>> #### Extracted from: /usr/include/alloca.h
>>
>> --include-function alloca # header: /usr/include/alloca.h
>> --include-macro _ALLOCA_H # header: /usr/include/alloca.h
>>
>> #### Extracted from: /usr/include/endian.h
>>
>> --include-macro BIG_ENDIAN # header: /usr/include/endian.h
>> --include-macro BYTE_ORDER # header: /usr/include/endian.h
>> --include-macro LITTLE_ENDIAN # header: /usr/include/endian.h
>> --include-macro PDP_ENDIAN # header: /usr/include/endian.h
>> --include-macro _ENDIAN_H # header: /usr/include/endian.h
>>
>> In other words, the generated file contains all the symbols that have been included, in the form of "options". In fact, the generated file can be played back into jextract - assuming `foo.conf` is the name of our file, then we can do the following:
>>
>> `jextract @foo.conf Foo.h`
>>
>> Multiple `@file` can be used with jextract, and can also be mixed with explicit command-line options (all this support predates the work discussed in this patch, but makes it very worthwhile); this is also the same syntax supported by other launchers like `javac`.
>>
>> The output file should be stable - e.g. all entries are grouped by header file (and header files are sorted alphabetically); within each header, entries are sorted by category (e.g. variable vs. function) and _then_ also alphabetically within the same category. This should help users find things quickly.
>>
>> In addition, each line has a comment which states which header the symbol comes from. I found this super helpful for defining custom header-based filtering schemes using `grep` and regex (similar to what `--filter` used to do).
>>
>> The implementation is relatively straightforward - other than adding support for the new options, it was mostly a matter of making sure that OutputFactory does not generate the filtered out symbols.
>>
>> [1] - https://mail.openjdk.java.net/pipermail/panama-dev/2021-March/012429.html
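For readers who want to try the header-based filtering described above without `grep`, here is a minimal sketch in Java. It assumes the dump format shown in the quoted description (one `--include-...` option per line, followed by a `# header:` comment). The class `DumpFilter` and the method `keepHeader` are hypothetical names used for illustration only; they are not part of jextract.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of jextract: keeps only the option lines
// whose trailing "# header:" comment names a wanted header.
class DumpFilter {
    private static final String MARKER = "# header:";

    static List<String> keepHeader(List<String> lines, String headerSuffix) {
        return lines.stream()
                // skip blank lines and "#### Extracted from:" group titles
                .filter(line -> line.startsWith("--include-"))
                .filter(line -> {
                    int idx = line.indexOf(MARKER);
                    if (idx < 0) {
                        return false; // no header comment on this line
                    }
                    String header = line.substring(idx + MARKER.length()).trim();
                    return header.endsWith(headerSuffix);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dump = List.of(
                "#### Extracted from: /usr/include/alloca.h",
                "",
                "--include-function alloca # header: /usr/include/alloca.h",
                "--include-macro BIG_ENDIAN # header: /usr/include/endian.h");
        // Print only the entries that come from endian.h
        keepHeader(dump, "endian.h").forEach(System.out::println);
    }
}
```

The surviving lines could then be written to a new file and passed back with the `@file` syntax mentioned in the description.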
>
> src/jdk.incubator.jextract/share/classes/jdk/internal/jextract/impl/IncludeHelper.java line 152:
>
>> 150: maxLengthOptionCol += 1; // space
>> 151: int maxLengthHeaderCol = pathEntries.getKey().toString().length();
>> 152: maxLengthHeaderCol += "# header:".length();
>
> Isn't this always the same because we're only looking at 1 header in the iteration? i.e. there seems to be no need to specify the size in the format string, and just `%s` could be used?
true - doh!
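
To illustrate the point, here is a small sketch. It assumes the column width is computed from the very string being formatted, as the review comment suggests; the class and method names are made up for the example and do not come from `IncludeHelper`. When the width equals the string's own length, left-justified padding adds nothing, so the result is identical to plain `%s`.

```java
// Hypothetical illustration, not the IncludeHelper code.
class HeaderColumnWidth {
    // Left-justifies text in a column of the given width (width must be >= 1).
    static String fixedWidth(String text, int width) {
        return String.format("%-" + width + "s", text);
    }

    // Formats the text with no width at all.
    static String plain(String text) {
        return String.format("%s", text);
    }

    public static void main(String[] args) {
        String comment = "# header: /usr/include/endian.h";
        // The width is derived from the same string, so no padding is added.
        System.out.println(fixedWidth(comment, comment.length()).equals(plain(comment)));
    }
}
```

A width only matters when several values of different lengths share a column, which is presumably why the option column still needs one.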
-------------
PR: https://git.openjdk.java.net/panama-foreign/pull/469