Vector API: How to write code template specialization ?
Remi Forax
forax at univ-mlv.fr
Mon Apr 13 17:26:15 UTC 2020
Hi all,
As a kind of study, I am trying to use the Vector API to implement a simple runtime [1] for J (that weird descendant of APL :)
It works quite well until you try to share code. Let's say I have code that does a reduction on an array:
I can write one version for +, one version for *, etc., or I can write a single method that takes a VectorOperators constant as parameter and expect the JIT to be smart enough to specialize the method for ADD when I call it with the constant VectorOperators.ADD.
So in my runtime, I have a method foldValueADD that calls foldValueTemplate(ADD).
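For readers who have not seen the pattern, here is a minimal scalar sketch of the template idea. It is not the code from [1]: the Op enum is a hypothetical stand-in for the operator constants of the Vector API, chosen so the example runs without the incubator module, and the loop is a plain scalar loop rather than a vectorized one. Only the names foldValueTemplate and foldValueADD come from the description above. foldValueMUL and the class name are made up for illustration.

```java
import java.util.function.IntBinaryOperator;

public class FoldTemplateSketch {
  // Hypothetical stand-in for the Vector API operator constants (not the real API).
  enum Op implements IntBinaryOperator {
    ADD(0) {
      @Override public int applyAsInt(int a, int b) { return a + b; }
    },
    MUL(1) {
      @Override public int applyAsInt(int a, int b) { return a * b; }
    };

    final int identity; // neutral element used to seed the reduction

    Op(int identity) { this.identity = identity; }
  }

  // The shared "template": one loop body, parameterized by the operator.
  static int foldValueTemplate(int[] array, Op op) {
    int acc = op.identity;
    for (int value : array) {
      acc = op.applyAsInt(acc, value);
    }
    return acc;
  }

  // Thin entry points that pass a constant operator. The hope is that the JIT
  // inlines the template here and specializes it for that constant.
  static int foldValueADD(int[] array) { return foldValueTemplate(array, Op.ADD); }
  static int foldValueMUL(int[] array) { return foldValueTemplate(array, Op.MUL); }

  public static void main(String[] args) {
    int[] data = {1, 2, 3, 4};
    System.out.println(foldValueADD(data));
    System.out.println(foldValueMUL(data));
  }
}
```

The specialization only happens if foldValueTemplate is actually inlined into each thin entry point. Otherwise the operator stays an ordinary runtime argument of a shared compiled method.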
And it fails spectacularly, because the JIT thinks that the template function foldValueTemplate is too big to be inlined:
fr.umlv.vector.CellBenchMark::add_cell (14 bytes)
@ 7 fr.umlv.jruntime.Cell$Dyad::fold (12 bytes) inline (hot)
@ 8 fr.umlv.jruntime.Cell$Fold::<init> (56 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
@ 40 java.util.Objects::requireNonNull (14 bytes) inline (hot)
@ 10 fr.umlv.jruntime.Cell::apply (132 bytes) inline (hot)
@ 1 fr.umlv.jruntime.Cell$Fold::foldVerbs (20 bytes) inline (hot)
@ 58 fr.umlv.jruntime.Cell$Rank$Vector::fold (6 bytes) inline (hot)
@ 2 fr.umlv.jruntime.Cell$Rank$Vector::foldValue (31 bytes) inline (hot)
@ 8 fr.umlv.jruntime.Cell$Backend::foldValue (193 bytes) inline (hot)
@ 21 java.lang.Enum::ordinal (5 bytes) accessor
@ 86 fr.umlv.jruntime.Cell$VectorizedBackend::foldValueADD (14 bytes) inline (hot)
@ 2 java.lang.invoke.Invokers$Holder::linkToTargetMethod (8 bytes) force inline by annotation
@ 4 java.lang.invoke.LambdaForm$MH/0x0000000800067840::invoke (8 bytes) force inline by annotation
@ 10 fr.umlv.jruntime.Cell$VectorizedBackend::foldValueTemplate (110 bytes) already compiled into a big method
@ 17 fr.umlv.jruntime.Cell$Rank::vector (9 bytes) inline (hot)
@ 5 fr.umlv.jruntime.Cell$Rank$Vector::<init> (10 bytes) inline (hot)
@ 1 java.lang.Record::<init> (5 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
@ 27 fr.umlv.jruntime.Cell::<init> (15 bytes) inline (hot)
@ 1 java.lang.Object::<init> (1 bytes) inline (hot)
Given that I'm not developing my code inside the JDK, I cannot use @ForceInline.
I think the JIT heuristics need to be tweaked so that a method that takes a VectorOperators constant as parameter is always inlined.
Otherwise, there is no point in exposing all the constants in VectorOperators, given that even a simple reduction takes enough bytecodes to be considered a big method by the JIT.
Or maybe there is another solution?
regards,
Rémi
[1] https://github.com/forax/panama-vector/blob/master/fr.umlv.jruntime/src/main/java/fr/umlv/jruntime/Cell.java#L787