static machine code snippets, via jextract
John Rose
john.r.rose at oracle.com
Tue Aug 23 21:07:38 UTC 2016
Extraction will eventually have to produce a kind of machine code snippet for stuff that cannot be represented as Java metadata and annotations. (Or can we give annotations the ability to carry machine code??) The main use of this will be C++, but the first exercise of this should be function-like macros.
FTR here's a design sketch:
$ jextract sys/stat.h --with stat_config.h ...
$ cat stat_config.h
# pragma jextract_config(start)
# pragma jextract_macro_prefix(stat_macro_)
extern int stat_macro_S_ISBLK(mode_t m);
extern int stat_macro_S_ISCHR(mode_t m);
…
The file stat_config.h is hand-written specifically to adjust the extraction of stat.h. It contains declarations of ordinary functions, one for each macro to be extracted.
Output of jextract includes a DLL (or just C code) that defines each such function, and associates metadata saying which macro it represents. Metadata for the extracted API mentions these functions, but their source name is the macro, even though they point (somehow) at the function.
The concrete names of the functions are not important, and probably the user-given prefixed names should be ignored and replaced by something like jx00042. The DLL might have a single entry point for resolving symbols, or it might make all of those symbols public, and have a standard entry point (or data) for mapping the symbols to the intended symbolic references. (Or we might choose to mangle the names.)
Eventual uses of these snippets may include C macros (both object-like and function-like), C++ inline functions, C++ "universal subclasses", C++ template instantiations, intrinsic capture (e.g., for funneling AVX stuff from immintrin.h up to somebody who needs it), and arbitrary C-level asm statements, again for intrinsic capture. Some of these (C++ subclasses, etc.) are coordinated sets of snippets, of course.
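Two of the simpler cases in that list can be sketched in plain C. This is an illustration under assumptions: the jx_const_ and jx_inline_ names are hypothetical, clamp_to_byte is a made-up stand-in for an inline function found in some header, and a C inline function is used here in place of a C++ one.

```c
#include <stdio.h>

/* Object-like macro captured as data: EOF has no symbol of its own,
 * so the snippet materializes its value in a named constant. */
const int jx_const_EOF = EOF;

/* Made-up stand-in for an inline function defined in a header. */
static inline int clamp_to_byte(int x) {
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

/* Out-of-line wrapper: gives the inline function a linkable address. */
int jx_inline_clamp_to_byte(int x) { return clamp_to_byte(x); }
```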
— John
P.S. Historical note: I used these techniques in the '90s on the esh ("embeddable shell") project, a Scheme VM with a header file extractor. I didn't use pragmas, but instead used a backslash, which required a C parser patch (which we can't do):
$ cat stat_config.h
extern int \S_ISBLK(mode_t m);
extern int \S_ISCHR(mode_t m);
…
Each class with at least N virtuals (N>0) received an unnamed universal subclass, which contained N function pointers to "plug in" replacement behaviors for each virtual. It also contained a void* data pointer, and that was all. The Scheme code for making a subclass of a C++ class Foo with a virtual method bar was something like:
(define my-subc
  (let ()
    (define (my-bar x) (print "bar called!") (+ x 1))
    (define sc (make-subclass +Foo))
    (set! (->bar sc) my-bar) ;; generic property, gets bar slot of sub-Foo vtable
    (define (make-subc a b) (define x (allocate sc)) (set! (data sc) (list a b)) x)
    (add-constructor sc make-subc)
    sc))
(define sc1 (make my-subc 1 2))
(define sc2 (make my-subc 3 4))
;; later on I can make a new subclass of Foo by plugging a different ->bar method
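For readers who think in C, the layout of such a universal subclass can be approximated as below. This is a sketch under assumptions, not the esh implementation: the struct and function names are hypothetical, Foo is reduced to its single virtual bar, and the slots are kept per object for brevity, whereas the Scheme code above sets the bar slot on the subclass.

```c
#include <stddef.h>

/* Assumed layout: one function pointer per virtual (here N = 1, for
 * bar) plus an opaque data pointer, and nothing else. */
struct foo_universal {
    int (*bar)(struct foo_universal *self, int x); /* pluggable bar slot */
    void *data;                                    /* opaque user data */
};

/* What the overriding virtual would do: forward to the plugged-in slot. */
int foo_universal_bar(struct foo_universal *self, int x) {
    return self->bar(self, x);
}

/* Replacement behavior mirroring my-bar above (minus the print). */
static int my_bar(struct foo_universal *self, int x) {
    (void)self; /* the data pointer is not needed for this behavior */
    return x + 1;
}
```

Plugging a different function into the bar slot yields a different effective subclass without generating any new machine code.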