Hacking Truffle to avoid argument array allocation at calls
Arthur Peters
amp at cs.utexas.edu
Mon Oct 29 22:34:41 UTC 2018
Esteemed Truffle Hackers,
I want to improve function call performance in Truffle on HotSpot. Would
it be possible to hack Truffle to support a few specific function
arities without allocating the argument array?
It appears that OptimizedCallTarget.callBoundary
<https://github.com/oracle/graal/blob/master/compiler/src/org.graalvm.compiler.truffle.runtime/src/org/graalvm/compiler/truffle/runtime/OptimizedCallTarget.java#L250-L266>
could be duplicated for specific arities (with the duplicates taking the
specific number of args) and then doInvoke
<https://github.com/oracle/graal/blob/master/compiler/src/org.graalvm.compiler.truffle.runtime/src/org/graalvm/compiler/truffle/runtime/OptimizedCallTarget.java#L246-L248>
could be modified to profile the argument count and specialize the call
to the arity-specific version of callBoundary.
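To make the idea concrete, here is a rough sketch of the shape I have in mind, written as plain Java rather than against the real Truffle sources. Every name in it (ArityCallSketch, Body, CallTarget, callBoundary1, callBoundary2, execute1, execute2, arraysCrossingBoundary) is made up for illustration; only callBoundary and doInvoke echo the real method names, and their signatures here are simplified assumptions. In this plain-Java form the caller still allocates the array; the hope is that in compiled Truffle code the array never escapes doInvoke and so can be removed.

```java
public class ArityCallSketch {

    /** Stand-in for a root node body (hypothetical, not a Truffle type). */
    interface Body {
        Object execute(Object[] args);

        // Arity-specific entry points fall back to building an array,
        // so a body that does not override them still behaves correctly.
        default Object execute1(Object a0) {
            return execute(new Object[] {a0});
        }

        default Object execute2(Object a0, Object a1) {
            return execute(new Object[] {a0, a1});
        }
    }

    /** Stand-in for OptimizedCallTarget (hypothetical simplification). */
    static final class CallTarget {
        private final Body body;
        /** Number of calls whose argument array crossed the call boundary. */
        int arraysCrossingBoundary;

        CallTarget(Body body) {
            this.body = body;
        }

        // Generic boundary: the array itself is passed across the call.
        Object callBoundary(Object[] args) {
            arraysCrossingBoundary++;
            return body.execute(args);
        }

        // Arity-specific duplicates: arguments travel as plain parameters.
        Object callBoundary1(Object a0) {
            return body.execute1(a0);
        }

        Object callBoundary2(Object a0, Object a1) {
            return body.execute2(a0, a1);
        }

        // Dispatches on the argument count. A real version would presumably
        // profile the count instead of switching on it at every call.
        Object doInvoke(Object[] args) {
            switch (args.length) {
                case 1:
                    return callBoundary1(args[0]);
                case 2:
                    return callBoundary2(args[0], args[1]);
                default:
                    return callBoundary(args);
            }
        }
    }

    /** Example body: sums Integer arguments, with a two-argument fast path. */
    static final Body ADD = new Body() {
        @Override
        public Object execute(Object[] args) {
            int sum = 0;
            for (Object a : args) {
                sum += (Integer) a;
            }
            return sum;
        }

        @Override
        public Object execute2(Object a0, Object a1) {
            return (Integer) a0 + (Integer) a1;
        }
    };

    public static void main(String[] args) {
        CallTarget target = new CallTarget(ADD);
        System.out.println(target.doInvoke(new Object[] {2, 3}));    // two-argument path
        System.out.println(target.doInvoke(new Object[] {1, 2, 3})); // generic path
        System.out.println(target.arraysCrossingBoundary);
    }
}
```

Only the three-argument call goes through the generic callBoundary; the two-argument call passes its arguments as ordinary parameters. Whether the real compiler would then drop the array on the caller side is exactly what I am asking below.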
1. Is there something special about how these methods are compiled that
would prevent this? (Searching on GitHub makes me think the only
magic is that @TruffleCallBoundary prevents inlining)
2. Would the partial evaluator/compiler be able to eliminate the arrays
passed to doInvoke and stored in the VirtualFrame since both of
those are local to a single PE/compilation unit?
Thanks.
-Arthur
*Background for the interested:*
I'm working on performance improvements for my Truffle language
(Orc/PorcE
<https://github.com/orc-lang/orc/tree/improved-porce-heuristics/PorcE>).
I've realized that argument array allocations (which occur for every
Truffle call on HotSpot) are a MAJOR performance problem for me
(something like 5GB/s of allocations). This is because my code is
partially continuation passing style and does a lot of function calls.
I'm looking for a temporary solution that allows me to continue my
research without too much engineering. I'm evaluating various options,
including reducing the number of calls, running my system on SVM, and
hacking Truffle to avoid the allocations for specific arities.
The first option is challenging because the system is pretty deeply
CPS. The second option is hard because the system uses MethodHandles
(and the usage is complex, so even converting to reflection would not
totally solve the SVM issues). So I'm focusing on hacking Truffle,
because this is a research project and using a modified version is fine.