[foreign] RFR 8219470: Use clang API to parse macros
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Feb 20 18:56:22 UTC 2019
Hi,
macro support in jextract was added some time ago, using javac's
constant folding support to evaluate expressions. While clever, that
approach has some limitations - namely it is not possible for it to
understand types that belong to the C language. This will eventually
make it impossible to support constants such as this:
#define PTR (void*)0
Clang offers an 'evaluation' API [1], but unfortunately this API
inexplicably doesn't work on macros. But it does work on regular
variable declarations. So here's an idea - given a macro of the kind:
#define NAME VALUE
let's generate a snippet like this:
__auto_type jextract$NAME = NAME;
and see what comes out of clang. The __auto_type extension is a GNU
extension which is also supported by clang [2]; this is rather handy
because it allows us to rely on clang to do type inference too!
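The snippet generation step can be sketched as follows. This is only an illustration of the string that gets handed to clang, not the code in the webrev; the class name MacroSnippets and the method snippetFor are hypothetical, and the identifier check is an added assumption.

```java
final class MacroSnippets {
    private MacroSnippets() {}

    // Hypothetical helper: builds the one-line C declaration for a macro name,
    // following the "__auto_type jextract$NAME = NAME;" shape described above.
    static String snippetFor(String macroName) {
        // Assumption: only object-like macros with plain C identifier names are handled.
        if (!macroName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("not a C identifier: " + macroName);
        }
        return "__auto_type jextract$" + macroName + " = " + macroName + ";";
    }

    public static void main(String[] args) {
        // prints: __auto_type jextract$PTR = PTR;
        System.out.println(snippetFor("PTR"));
    }
}
```

Clang then infers the type of the declared variable, so the same snippet shape works for integer, floating point, string and pointer constants alike.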
The challenge with this approach is, of course, making snippet
recompilation fast enough - to this end three measures were taken:
* instead of generating a snippet with an #include, we use the clang API
[3] to save the jextract translation unit to a precompiled header - these
headers won't change anyway
* we then parse the snippet with -include-pch <precompiled header>; this
allows us to skip all symbols that are defined outside the snippet (to do
this you have to create a 'local' Index - that's why I exposed that part
of the clang API)
* instead of writing to a file over and over, we make use of clang's
in-memory file support [4]. This allows us to create an empty file once,
and to keep passing snippets as strings in memory.
The result is quite pleasing - not only do we now parse macros 'the right
way', but performance also got a significant bump; on my machine
(before/after):
OpenGL 5s/3s
Python 6s/3.7s
Ncurses 3s/1.5s
Between roughly 1.6x and 2x faster - not bad. On top of that, by diffing
the --log FINE output it seems like the new implementation is able to pick
up a handful of constants that were left out in the previous implementation.
Note that this patch retains the previous optimization of special-casing
simple numeric #defines - where we just try to parse the number in
Java. For APIs such as OpenGL with loads of constants, this is an
essential optimization.
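A minimal sketch of such a fast path is shown below. It is not the code in the webrev: the class NumericMacros and the method tryParse are hypothetical, and the set of accepted forms (unsuffixed decimal, hex and octal integers) is an assumption about what "simple numeric" could mean. Anything that does not match would fall through to the clang-based evaluation.

```java
import java.util.OptionalLong;

final class NumericMacros {
    private NumericMacros() {}

    // Hypothetical fast path: returns the value if the macro body is a plain,
    // unsuffixed C integer literal, or empty if clang has to evaluate it.
    static OptionalLong tryParse(String value) {
        String v = value.trim();
        try {
            if (v.matches("0[xX][0-9a-fA-F]+")) {
                return OptionalLong.of(Long.parseLong(v.substring(2), 16)); // hex
            }
            if (v.matches("0[0-7]*")) {
                return OptionalLong.of(Long.parseLong(v, 8)); // octal, including plain 0
            }
            if (v.matches("[1-9][0-9]*")) {
                return OptionalLong.of(Long.parseLong(v, 10)); // decimal
            }
        } catch (NumberFormatException overflow) {
            // Literal does not fit in a long: leave it to the slow path.
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        System.out.println(tryParse("0x1F"));     // OptionalLong[31]
        System.out.println(tryParse("(void*)0")); // OptionalLong.empty
    }
}
```

Suffixed literals (such as 10UL) and floating point values are deliberately left out of this sketch; they would simply take the slower route through clang.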
Webrev:
http://cr.openjdk.java.net/~mcimadamore/panama/8219470/
Cheers
Maurizio
[1] - https://clang.llvm.org/doxygen/group__CINDEX__MISC.html#ga6be809ca82538f4a610d9a5b18a10ccb
[2] - https://reviews.llvm.org/D12686
[3] - https://clang.llvm.org/doxygen/group__CINDEX__TRANSLATION__UNIT.html#ga3abe9df81f9fef269d737d82720c1d33
[4] - https://clang.llvm.org/doxygen/structCXUnsavedFile.html