TruffleSOM with SimpleLanguage-Style Call Caches

Tue Dec 10 06:42:00 PST 2013

Hello Stefan,

What's the difference between Unary, Binary and Ternary send nodes? Did you
see a need to differentiate between them? They all have arrays of children,
so it's not removing the array indirection. Is it so that executeEvaluated
can have parameters for each child and you don't need to create an array? I
would just use an Object[] here. I used to have Unary, Binary and Ternary
method calls in Ruby, but got rid of them one day and couldn't see any
performance difference when running on Graal (not just no statistically
significant difference - literally no difference), with the benefit of a
drastically simpler model.
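
To make the Object[] suggestion concrete, here is a minimal sketch of one send node that covers every arity. It is plain Java and does not use the real Truffle API; ExprNode, SlotReadNode, SendNode and Target are hypothetical names invented for this illustration, and the frame is simplified to an Object[].

```java
// Hypothetical stand-in for a Truffle node; NOT the real Truffle API.
abstract class ExprNode {
    abstract Object execute(Object[] frame);
}

// Reads one slot of the (simplified) frame array.
final class SlotReadNode extends ExprNode {
    private final int slot;

    SlotReadNode(int slot) {
        this.slot = slot;
    }

    @Override
    Object execute(Object[] frame) {
        return frame[slot];
    }
}

// A single send node for unary, binary, ternary and keyword messages alike.
final class SendNode extends ExprNode {
    // Hypothetical call target: receiver plus already evaluated arguments.
    interface Target {
        Object invoke(Object receiver, Object[] args);
    }

    private final ExprNode receiver;
    private final ExprNode[] arguments;
    private final Target target;

    SendNode(ExprNode receiver, ExprNode[] arguments, Target target) {
        this.receiver = receiver;
        this.arguments = arguments;
        this.target = target;
    }

    @Override
    Object execute(Object[] frame) {
        Object rcvr = receiver.execute(frame);
        // Evaluate all argument children into one array, whatever the arity.
        Object[] args = new Object[arguments.length];
        for (int i = 0; i < arguments.length; i++) {
            args[i] = arguments[i].execute(frame);
        }
        return executeEvaluated(rcvr, args);
    }

    // One signature for all arities instead of one per Unary/Binary/Ternary class.
    Object executeEvaluated(Object rcvr, Object[] args) {
        return target.invoke(rcvr, args);
    }
}
```

A unary send is then just a SendNode with an empty arguments array, and a binary send one with a single argument child.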

I wouldn't worry about the number of nodes. Certainly having just one extra
would never cross my mind as an issue. It may make interpreted performance
a little better to have fewer, but really good compiled performance when
you start to run on Graal relies on many minimal nodes with maximal
specialisations. In Ruby lots of operations that look primitive and
irreducible are made up of several nodes.

If you have n nodes of complexity c that each specialise in k ways you have
k^n possible configurations in space cnk. To have the same number of
configurations with a single node requires space ck^n, which you probably
aren't going to bother to do. So I always maximise n and k and minimise c.
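
As a rough worked example of that counting argument, the sketch below assumes that each of the n nodes independently picks one of its k specialisations, so the combined states multiply out to k^n. ConfigCount and its method names are made up for this illustration, and the "space" figures are abstract units rather than measurements.

```java
final class ConfigCount {
    // Combined states of n independent nodes with k specialisations each: k^n.
    static long configurations(int n, int k) {
        long result = 1;
        for (int i = 0; i < n; i++) {
            result = Math.multiplyExact(result, (long) k);
        }
        return result;
    }

    // Space for n small nodes, each carrying k specialisations of complexity c.
    static long spaceManyNodes(int c, int n, int k) {
        return (long) c * n * k;
    }

    // Space for one node that spells out every combined state explicitly.
    static long spaceSingleNode(int c, int n, int k) {
        return Math.multiplyExact((long) c, configurations(n, k));
    }
}
```

With n = 4, k = 3 and c = 2, the four small nodes cover 81 configurations in 24 units of space, while a single node covering the same 81 configurations would need 162.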

With escaping frames you mention @ExplodeLoop; you also know about
@SlowPath, right?

Regards,

Chris

On 9 December 2013 15:10, Stefan Marr <java at stefan-marr.de> wrote:

> Hi:
>
> I finally got around to changing TruffleSOM to use call caching and inlining
> based on the new SimpleLanguage examples.
> The current version uses specialized classes for unary, binary, ternary,
> and keyword messages to allow for a specialization of messages for
> primitive operations based on the argument types. So, this part of the
> design hasn’t changed much from the last iteration. However, the way
> primitive operations (‘builtins’ in SL terminology) are handled has changed.
>
> Below, you’ll find a brief comparison of the last two design iterations
> and my observations. At the end, I describe another of those ‘escaping
> frames’ issues, which I haven’t been able to solve yet.
>
> Direct vs. Indirect Monomorphic Checking
> ---------------------------------------
>
> In the last design iteration, I tried to use the polymorphic behavior of
> nodes supported by the TruffleDSL to my advantage. However, since
> state-based guards are currently not supported, it was not possible to
> implement the check for a monomorphic send using only the TruffleDSL.
> Still, I think the general design might have a few benefits, since
> it avoids an additional node in the AST. To make that a little more
> explicit, let’s consider the following example:
>
>     Calculator>>#add: a to: b = ( ^ a + b )
>
> This would result in an AST roughly looking like this:
>
>      Method(RootNode)
>      +- BinaryMessage
>          +- receiver = FieldReadNode
>          +- argument = FieldReadNode
>
> After specialization, it would look like this:
>
>     Method(RootNode)
>     +- AdditionPrimNodeInteger
>             (next)-> AdditionPrimNodeDouble
>                   (next)-> AdditionPrimNodeUninitialized
>          +- receiver = FieldReadNode
>          +- argument = FieldReadNode
>
> Now, the DSL currently cannot unify this approach with standard inline
> caching, for instance when Strings are ‘added’, where ‘+’ is not
> implemented as a specialization in AdditionPrimNode but as a normal
> Smalltalk method.
>
> So, I switched to the SimpleLanguage approach, and now the AST looks like
> this after specialization:
>
>     Method(RootNode)
>     +- BinarySendNode$CachedSendNode (current)-> AdditionPrimNodeInteger
>             (next)-> BinarySendNode$CachedSendNode (current)-> AdditionPrimNodeDouble
>                   (next)-> BinarySendNode$CachedSendNode (current)-> BinarySendNode$InlinableSendNode (callTarget)-> (String>>#+:)
>                        (next)-> BinarySendNode$UninitializedNode
>          +- receiver = FieldReadNode
>          +- argument = FieldReadNode
>
> The nice thing about this design is that the whole question of monomorphic
> sends is completely handled by the *SendNode’s CachedSendNode
> implementations. All kinds of specialization then happen after that.
> However, I am not sure whether this indirection is unproblematic for
> compilation.
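
For readers following the tree above, the chain can be sketched in plain Java as below. This is only an illustration under simplifying assumptions, not TruffleSOM's or SimpleLanguage's code: it dispatches on the receiver class alone, the lookup table is hard-coded, and the names CacheNode, CachedNode, UninitializedNode and BinarySend are hypothetical.

```java
import java.util.function.BinaryOperator;

// One link of the send-site cache chain.
abstract class CacheNode {
    abstract Object executeEvaluated(Object receiver, Object argument);
}

// Guards on the cached receiver class, then runs 'current' or falls through to 'next'.
final class CachedNode extends CacheNode {
    private final Class<?> cachedClass;
    private final BinaryOperator<Object> current;
    CacheNode next;

    CachedNode(Class<?> cachedClass, BinaryOperator<Object> current, CacheNode next) {
        this.cachedClass = cachedClass;
        this.current = current;
        this.next = next;
    }

    @Override
    Object executeEvaluated(Object receiver, Object argument) {
        if (receiver.getClass() == cachedClass) {
            return current.apply(receiver, argument);
        }
        return next.executeEvaluated(receiver, argument);
    }
}

// End of the chain: a miss here adds a new cache entry.
final class UninitializedNode extends CacheNode {
    private final BinarySend owner;

    UninitializedNode(BinarySend owner) {
        this.owner = owner;
    }

    @Override
    Object executeEvaluated(Object receiver, Object argument) {
        return owner.specializeAndExecute(receiver, argument);
    }
}

// The send site for '+'; owns the chain head.
final class BinarySend {
    private final UninitializedNode tail = new UninitializedNode(this);
    private CacheNode head = tail;
    private CachedNode last; // most recently added entry, sits just before 'tail'
    private int depth;

    Object send(Object receiver, Object argument) {
        return head.executeEvaluated(receiver, argument);
    }

    int depth() {
        return depth;
    }

    Object specializeAndExecute(Object receiver, Object argument) {
        BinaryOperator<Object> target = lookup(receiver.getClass());
        // Insert the new entry directly before the uninitialized tail.
        CachedNode fresh = new CachedNode(receiver.getClass(), target, tail);
        if (last == null) {
            head = fresh;
        } else {
            last.next = fresh;
        }
        last = fresh;
        depth++;
        return target.apply(receiver, argument);
    }

    // Hard-coded stand-in for primitive selection and method lookup.
    private static BinaryOperator<Object> lookup(Class<?> receiverClass) {
        if (receiverClass == Integer.class) {
            return (x, y) -> (Integer) x + (Integer) y;
        }
        if (receiverClass == Double.class) {
            return (x, y) -> (Double) x + (Double) y;
        }
        if (receiverClass == String.class) {
            return (x, y) -> (String) x + y;
        }
        throw new IllegalArgumentException("doesNotUnderstand: +");
    }
}
```

Sending '+' with an Integer, then a Double, then a String receiver grows the chain to three cached entries in that order, mirroring the tree; a repeated Integer send hits the first entry without adding another.
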
>
> Another problem is that in my current implementation much of the SendNode
> code is duplicated, slightly adapted to the different types and
> signatures of the Unary/Binary/Ternary/Keyword message nodes.
> Similarly, the specialization for the n-ary message nodes also forces me
> to duplicate code for the variants for inlined methods. All in all, not
> very nice, but with Java’s restricted support for generics, I also don’t
> see a lot of options to avoid it.
>
>
> Escaping Frames
> ---------------
>
> Following previous experiences with Graal reporting escaping frames, I
> tried to identify all loops that lead to a call to `executeGeneric(.)` to
> make sure they are marked with @ExplodeLoop. However, in this case I either
> missed one or this strategy is not sufficient.
>
> Are there any other general strategies that would help to identify
> potential causes of escaping frames?
>
> Thanks
> Stefan
>
> The latest code can be obtained with:
>
>     git clone --recursive https://github.com/smarr/TruffleSOM.git
>     cd TruffleSOM
>     ant jar
>     cd $GRAAL
>     ./mx.sh --vm server vm \
>       -Xbootclasspath/a:../TruffleSOM/build/classes:$SOM/libs/com.oracle.truffle.api.jar:../TruffleSOM/libs/com.oracle.truffle.api.dsl.jar \
>       som.vm.Universe -cp ../TruffleSOM/Smalltalk \
>       ../TruffleSOM/Examples/Benchmarks/Loop.som
>
>
> --
> Stefan Marr
> Software Languages Lab
> Vrije Universiteit Brussel
> Pleinlaan 2 / B-1050 Brussels / Belgium
> http://soft.vub.ac.be/~smarr
> Phone: +32 2 629 2974
> Fax:   +32 2 629 3525
>
>