[foreign] some JMH benchmarks
Jorn Vernee
jbvernee at xs4all.nl
Fri Sep 14 17:39:02 UTC 2018
> So, in principle, we could define a bunch of native entry points in
> the VM, one per shape, which take a bunch of long and doubles and call
> an underlying function with those arguments. For instance, let's
> consider the case of a native function which is modelled in Java as:
>
> int m(Pointer<Foo>, double)
>
> To call this native function we have to first turn the Java arguments
> into a (long, double) pair. Then we need to call a native adapter that
> looks like the following:
>
> jlong NI_invokeNative_J_JD(JNIEnv *env, jobject _unused, jlong addr,
>                            jlong arg0, jdouble arg1) {
>     return ((jlong (*)(jlong, jdouble))addr)(arg0, arg1);
> }
>
> And this will take care of calling the native function and returning
> the value back. This is, admittedly, a very simple solution; of course
> there are limitations: we have to define a bunch of specialized native
> entry points (and Java entry points, for callbacks). But here we can
> play a trick: most modern ABIs pass arguments in registers; for
> instance the System V ABI [5] uses up to 6 (!!) integer registers and 8
> (!!) XMM registers for FP values - this gives us a total of 14
> registers available for argument passing. Which covers quite a lot of
> cases. Now, if we have a call where _all_ arguments are passed in
> registers, then the order in which these arguments are declared in the
> adapter doesn't matter! That is, since FP-values will always be passed
> in different registers from integral values, we can just define entry
> points which look like these:
>
> invokeNative_V_DDDDD
> invokeNative_V_JDDDD
> invokeNative_V_JJDDD
> invokeNative_V_JJJDD
> invokeNative_V_JJJJD
> invokeNative_V_JJJJJ
>
> That is, for a given arity (5 in this case), we can just put all long
> arguments in front, and the double arguments after that. That is, we
> don't need to generate all possible permutations of J/D in all
> positions - as the adapter will always do the same thing (read: load
> from same registers) for all equivalent combinations. This keeps the
> number of entry points in check - and it also poses some challenges
> to the Java logic in charge of marshalling/unmarshalling, as there's
> an extra permutation step involved (although that is not something
> super-hard to address).
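To make the extra permutation step concrete, here is a rough sketch of how the Java side might map an argument shape to one of the canonical entry points. The class and method names (ShapeCanonicalizer, canonicalize) are made up for illustration and are not part of any Panama code. It assumes every argument has already been lowered to either a long (J) or a double (D), and it does not check that everything actually fits in registers for a given ABI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only: not part of any Panama API.
final class ShapeCanonicalizer {

    /** Entry point name plus, for each adapter slot, the index of the original argument. */
    static final class Canonical {
        final String entryPoint;
        final int[] permutation;

        Canonical(String entryPoint, int[] permutation) {
            this.entryPoint = entryPoint;
            this.permutation = permutation;
        }
    }

    /**
     * @param returnKind 'V', 'J' or 'D' (naming follows the quoted entry point list)
     * @param argKinds   one character per argument, 'J' for long or 'D' for double
     */
    static Canonical canonicalize(char returnKind, String argKinds) {
        List<Integer> longs = new ArrayList<>();
        List<Integer> doubles = new ArrayList<>();
        for (int i = 0; i < argKinds.length(); i++) {
            char kind = argKinds.charAt(i);
            if (kind == 'J') {
                longs.add(i);
            } else if (kind == 'D') {
                doubles.add(i);
            } else {
                throw new IllegalArgumentException("unsupported kind: " + kind);
            }
        }
        // Longs go first, doubles after; relative order within each group is kept,
        // so each value still lands in the same integer or FP register.
        int[] permutation = new int[argKinds.length()];
        int slot = 0;
        for (int index : longs) {
            permutation[slot++] = index;
        }
        for (int index : doubles) {
            permutation[slot++] = index;
        }
        String name = "invokeNative_" + returnKind + "_"
                + "J".repeat(longs.size()) + "D".repeat(doubles.size());
        return new Canonical(name, permutation);
    }

    public static void main(String[] args) {
        Canonical c = canonicalize('V', "DJDJD");
        System.out.println(c.entryPoint + " " + Arrays.toString(c.permutation));
    }
}
```

For a (double, long, double, long, double) call this picks invokeNative_V_JJDDD, and the permutation says which original argument feeds each adapter parameter.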

I'm wondering if the 6 native entry points listed for an arity of 5 are
enough. Don't you also need 6 for when the function returns a long and
6 more for when the function returns a double?

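To put a number on that: by my count, an arity of n gives n + 1 longs-first shapes (0 to n longs), and that set repeats for each return kind. The sketch below just enumerates the names for one arity, assuming three return kinds (V, J, D) and the naming scheme from the quoted list. EntryPointNames is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class EntryPointNames {

    /** Lists every longs-first entry point name for the given arity and return kinds V, J, D. */
    static List<String> forArity(int arity) {
        List<String> names = new ArrayList<>();
        for (char ret : new char[] {'V', 'J', 'D'}) {
            // 0..arity long arguments, the rest are doubles.
            for (int longs = 0; longs <= arity; longs++) {
                names.add("invokeNative_" + ret + "_"
                        + "J".repeat(longs) + "D".repeat(arity - longs));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = forArity(5);
        System.out.println(names.size() + " entry points for arity 5");
        names.forEach(System.out::println);
    }
}
```

For arity 5 that comes to 3 * 6 = 18 entry points, and the total keeps growing as more arities are covered.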
I have a suggestion for avoiding having to write out all the
permutations, though. What if, on the Java side, whenever a method has a
shape that can be optimized in this way (which is ABI-dependent), we
spin and load a class that defines a single static native method with
the needed signature, and annotate that method? Then
NativeLookup::lookup could detect the annotation and, instead of trying
to look up the symbol in a loaded library, generate a forwarding stub
and link the native method to that. You could then take a MethodHandle
to the native method in the spun (anonymous) class and use that in the
backing implementation.

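Roughly, the Java side of this could look like the sketch below. It is written as plain source rather than spun bytecode to keep it short, and the annotation name (ForwardingStub) and class names are invented for the example. On a stock JVM nothing would recognize the annotation, so invoking the handle would just fail with an UnsatisfiedLinkError. The VM-side stub generation is the part that doesn't exist.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical marker the VM would look for; no such annotation exists today.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ForwardingStub {}

// Source-level equivalent of a class that would be spun at runtime for
// the (long, double) -> long shape. The first parameter is the target address.
final class Invoker_J_JD {
    @ForwardingStub
    static native long invoke(long addr, long arg0, double arg1);
}

final class SpunInvokerSketch {

    /** Looks up a handle to the native method; linking only happens on first invocation. */
    static MethodHandle lookupInvoker() {
        try {
            return MethodHandles.lookup().findStatic(
                    Invoker_J_JD.class, "invoke",
                    MethodType.methodType(long.class, long.class, long.class, double.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MethodHandle mh = lookupInvoker();
        // Only prints the handle type; calling mh would need the VM-side stub.
        System.out.println(mh.type());
    }
}
```

The nice part is that only the shapes that are actually used get a class, so there is no table of permutations to maintain by hand.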
I'm not sure it's all that easy, though. What do you think?

Jorn