call vs execute

Christian Humer christian.humer at gmail.com
Wed Jun 25 13:43:15 UTC 2014


Hi Stefan,

Nice to see some parser experiments with Truffle.
I played a bit with your implementation and did some changes in order to
fix some issues to enable compilation (see the attached patch). I have a
few remarks on what was wrong with it.

1) Be aware of "opt fail" in the output.
None of your CallTargets actually got optimized because Truffle compilation
failed for several reasons (see below). If you see "opt fail" messages in
the output, it's an indication that either you or we did something wrong.
Either way, something is definitely wrong.

The flag -G:+TruffleCompilationExceptionsAreFatal will ensure that such
issues will be treated like normal exceptions.

2) Always use @Child annotation for nodes.

You used final fields without the @Child annotation to store nodes that
are called using the frame. Always store child nodes in fields annotated
with @Child or @Children. Otherwise the framework cannot traverse the
Truffle tree properly (that was the reason the inlining did not work).

In order to verify that inlining works you can use the
-G:+TraceTruffleInlining flag.
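
To illustrate why the annotation matters, here is a toy model in plain
Java. It is only a sketch, not Truffle's actual implementation: the @Child
annotation, ToyNode and the other classes below are hypothetical stand-ins
defined just for this example. It assumes a framework that discovers
children by looking for annotated fields, so a node stored in a plain final
field is simply never visited.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Truffle's @Child; NOT the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Child {}

// Hypothetical toy node base class (assumption, not a Truffle class).
abstract class ToyNode {
    // Returns this node plus every node reachable through @Child fields.
    final List<ToyNode> reachable() {
        List<ToyNode> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static void collect(ToyNode node, List<ToyNode> out) {
        out.add(node);
        for (Class<?> c = node.getClass(); c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!f.isAnnotationPresent(Child.class)) {
                    continue; // unannotated fields are invisible to the traversal
                }
                f.setAccessible(true);
                try {
                    Object value = f.get(node);
                    if (value instanceof ToyNode) {
                        collect((ToyNode) value, out);
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}

final class Leaf extends ToyNode {}

// Stores its callee in a plain final field, as in the original parser code.
final class HiddenChild extends ToyNode {
    final ToyNode callee;
    HiddenChild(ToyNode callee) { this.callee = callee; }
}

// Stores its callee in an annotated field, so the traversal can find it.
final class VisibleChild extends ToyNode {
    @Child ToyNode callee;
    VisibleChild(ToyNode callee) { this.callee = callee; }
}

class ChildDemo {
    public static void main(String[] args) {
        System.out.println(new HiddenChild(new Leaf()).reachable().size());  // prints 1
        System.out.println(new VisibleChild(new Leaf()).reachable().size()); // prints 2
    }
}
```

In this model the callee behind the plain field is never reached, which is
the same kind of blind spot that kept the inlining from seeing your nodes.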


3) Truffle ASTs are trees, not graphs.

Never cache and reuse Truffle nodes like you did in ParserState. Truffle
ASTs always need to be trees. Also do not share trees between created
CallTarget instances. At the moment this property is not enforced, but it
may be in the future.

I changed your cache to store CallTargets instead of Alternative node
instances.
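
Since the tree property is not enforced at the moment, a small check of
your own can help while debugging. The following is a minimal sketch in
plain Java, assuming a hypothetical AstNode class with an explicit list of
children (it does not use any Truffle API). It reports a violation as soon
as the same node instance is reachable along two different paths, which is
what happens when a cached node is reused.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical toy AST node (assumption, not a Truffle class).
final class AstNode {
    final String name;
    final List<AstNode> children = new ArrayList<>();

    AstNode(String name, AstNode... kids) {
        this.name = name;
        Collections.addAll(children, kids);
    }
}

final class TreeCheck {
    // Returns true if no node instance is reachable along more than one path.
    static boolean isTree(AstNode root) {
        // Identity-based set: two distinct but equal-looking nodes are fine,
        // only reuse of the very same instance counts as sharing.
        Set<AstNode> seen = Collections.newSetFromMap(new IdentityHashMap<AstNode, Boolean>());
        return visit(root, seen);
    }

    private static boolean visit(AstNode node, Set<AstNode> seen) {
        if (!seen.add(node)) {
            return false; // same instance reached twice: shared node or cycle
        }
        for (AstNode child : node.children) {
            if (!visit(child, seen)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AstNode cached = new AstNode("alternative");
        AstNode shared = new AstNode("sequence", cached, cached);
        AstNode fresh = new AstNode("sequence",
                new AstNode("alternative"), new AstNode("alternative"));
        System.out.println(isTree(shared)); // prints false
        System.out.println(isTree(fresh));  // prints true
    }
}
```

Caching CallTargets avoids the problem because each call site then holds a
reference to a call target rather than to a node that already has a parent
somewhere else.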


4) Truffle expands everything on the fast path.

Be careful when you call arbitrary Java methods on the fast path. Even
common methods like HashMap.get may be too much code to expand. An
indication of that is the compiler failing with the message "too many
nodes". To avoid such issues, cut off calls to untrusted or arbitrary code
with the @SlowPath annotation. In the parser case I just annotated
ParserState#lookupInEnv with @SlowPath to resolve this issue.

After fixing all those issues the performance for the "cached call"
benchmark improved from 18 seconds to roughly 100ms.

The "Cached execute" benchmark violated the Truffle Tree property so I did
not repair it.

I hope this helps. Feel free to ask more questions.



- Christian Humer


On Tue, Jun 24, 2014 at 10:20 AM, Stefan Fehrenbach <
stefan.fehrenbach at gmail.com> wrote:

> Hello,
>
> I implemented a recursive descent grammar interpreter using Truffle.
> Nonterminal calls are replaced by one of three alternative
> implementations:
> 1. Unoptimized lookup of the nonterminal in a hash table
> 2. Look up nonterminal during replacement, cache result and call the
> cached node's execute method directly.
> 3. Look up nonterminal during replacement, cache result and use
> Truffle's function call support to call the execute method.
>
> Now it happens that the function call variant is a lot slower than the
> other two and even the cached variant is still slower than the naive
> implementation:
>
> Unoptimized: 3095
> Cached execute: 3555
> Cached call: 18521
> (Time to parse a long string with a grammar consisting of a long chain
> of nonterminal calls a couple of times.)
>
> My question is: why is the function call variant so much slower? The
> actual computation is so simple, I expected Truffle/Graal to just
> inline the functions.
> And why is the cached variant still slower than the naive
> implementation? In every parse, we need 150*150 hash table lookups in
> the naive implementation vs only 150*150 Assumption checks in variant
> #2.
>
> Code is here: https://github.com/fehrenbach/parsers
> Uninitialized node:
>
> https://github.com/fehrenbach/parsers/blob/master/src/org/morphling/parsers/truffle/UninitializedNonterminalCall.java
>
> I use the graalvm-jdk1.8.0-0.3 release and the following JVM
> parameters to call this main method:
> org.morphling.parsers.truffle.Tests#main
>
> -server
> -Xss32m
> -Dtruffle.TraceRewrites=true
> -Dtruffle.DetailedRewriteReasons=true
> -G:+TraceTruffleCompilationDetails
> -G:+TraceTruffleCompilation
> -G:TruffleCompilationThreshold=1
> -XX:+UnlockDiagnosticVMOptions
> -XX:-PrintCompilation
>
> Am I just using Truffle the wrong way? What could be the reason for
> the slow function calls and the underperforming cache, even in this
> contrived grammar with long nonterminal chains?
>
> Best regards,
> Stefan
>

