TruffleSOM Update and Issues with Splitting

Mon Mar 10 18:14:48 UTC 2014

Hi:

After deactivating splitting (see previous mail below), I am finally getting somewhere with the performance of TruffleSOM.
See the attached comparison on microbenchmarks, DeltaBlue, and Richards.
RTruffleSOM is a simplified version of TruffleSOM ported to RPython. It doesn’t do much of the specialization TruffleSOM includes, but with the meta-tracer it still has an edge.

Some of the benchmarks on top of TruffleSOM are still pretty slow.
And for the method dispatch for instance, I hope it can still be improved but I don’t have a good idea how to do that yet.
I was experimenting before with custom inlining of simple object field reads, returning arguments, or constants, but that is not yet back in this version.
Such inlining would avoid many of the nodes for frame initialization. And avoiding them seems to help especially for simple operations such as addition.

As an overview of the current status, TruffleSOM includes the following:
 - message sends are design after SimpleLanguage
 - reads/writes of frame slots are specialized for int, double, and Object
 - control structures are optimized with special nodes replacing plain message sends
 - a subset of the primitive operations is eagerly placed as direct expression nodes in the execution tree to avoid overhead of message sends and general inline caches
 - lexical scopes are handled via materialized frames
 - CallNodes are used to realize inlining
 - frame slots are specialized separately in different method copies
   - nested methods/blocks are copied as soon as outer method is copied to allow for their specialization in the correct context
 - loop counts are reported via the LoopCountReceiver interface
 - primitive operations are specialized using the TruffleDSL
and there are probably other things that might be relevant.

Now my main question is what other kind of optimizations are typically done in Truffle languages.
One issue I am seeing is that the SOM objects are requiring SOM objects to be stored in their fields.
I probably could still loosen that requirement, but wonder whether the optimizer will then be able to eliminate unnecessary store and read operations.
Currently, it doesn’t seem to do that, and as a result, the FieldRead benchmark is about a magnitude slow than it should be.

Are there other interesting optimizations I could still try?

Thanks
Stefan 

On 10 Mar 2014, at 12:23, Stefan Marr <java at stefan-marr.de> wrote:

> == Execution Tree is not stabilizing with splitting:
> 
> With splitting is enabled, only a handful of benchmarks stabilize.
> Things like the simple WhileLoop benchmark, but also Richards and DeltaBlue seem to be ok.
> 
> Highly recursive benchmarks such as Fibonacci seem to be problematic.
> After a while, I get for Fibonacci output like the one below [3].
> I don’t see any specialization going on anymore, instead I see those ‘inline success’ reports perpetually.
> I am not sure what to make of that. What exactly is indicated by this message?
> Are the tree copies, i.e., splitted copies gradually inlined? Could it indicate that an upper limit on inlining doesn’t trigger while it probably should?
> Any hints what that could mean are highly appreciate!
> 
> Thanks
> Stefan
> 
> 
> 
> [1] https://github.com/SOM-st/TruffleSOM/blob/master/src/som/interpreter/nodes/nary/EagerBinaryPrimitiveNode.java#L47
> [2] https://github.com/SOM-st/TruffleSOM/commit/3f01afebfbf603e21519c608beff201af3d11a90
> 
> [3] Output for Fibonacci:
> 
> [truffle] opt done         Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 3a507beb <split>|ASTSize  1482 (0/109) |Time   797( 648+149 )ms |Nodes      1246/ 1323 |CodeSize         6768 |Source Examples/Benchmarks//Fibonacci.som:38
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 23877e40 <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount       54 |inlinedTotalCount     86 |score            0.06 |frequency        1.00 |callCount        1000
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 2ede6d8e <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount       86 |inlinedTotalCount    118 |score            0.06 |frequency        1.00 |callCount        1000
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 1fc40e10 <split>|ASTSize      54 (0/1) |shallowCount       54 |currentCount      118 |inlinedTotalCount    226 |score            0.02 |frequency        1.00 |callCount         999
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 7895b698 <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount      226 |inlinedTotalCount    290 |score            0.06 |frequency        1.00 |callCount         998
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at e424bba <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount      290 |inlinedTotalCount    354 |score            0.06 |frequency        1.00 |callCount         998
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 89ab152 <split>|ASTSize      54 (0/1) |shallowCount       54 |currentCount      354 |inlinedTotalCount    570 |score            0.02 |frequency        1.00 |callCount         997
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 4203f936 <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount      570 |inlinedTotalCount    698 |score            0.06 |frequency        1.00 |callCount         996
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 30cd607d <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount      698 |inlinedTotalCount    826 |score            0.06 |frequency        1.00 |callCount         996
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at c20f79c|ASTSize      15 (0/1) |shallowCount       15 |currentCount      826 |inlinedTotalCount    946 |score             NaN |frequency         NaN |callCount           0
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 3a3e575d <split>|ASTSize      54 (0/1) |shallowCount       54 |currentCount      946 |inlinedTotalCount   1054 |score            0.02 |frequency        1.00 |callCount         999
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 32579775 <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount     1054 |inlinedTotalCount   1118 |score            0.06 |frequency        1.00 |callCount         998
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 4acce36b <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount     1118 |inlinedTotalCount   1182 |score            0.06 |frequency        1.00 |callCount         998
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 55766e96|ASTSize      15 (0/1) |shallowCount       15 |currentCount     1182 |inlinedTotalCount   1242 |score             NaN |frequency         NaN |callCount           0
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 23d9bebe <split>|ASTSize      54 (0/1) |shallowCount       54 |currentCount     1242 |inlinedTotalCount   1674 |score            0.02 |frequency        1.00 |callCount         995
> [truffle] inline success     Method Fibonacci>>#fibonacci::Examples/Benchmarks//Fibonacci.som:37 at 313a51b <split>|ASTSize      16 (0/2) |shallowCount       16 |currentCount     1674 |inlinedTotalCount   1930 |score            0.06 |frequency        1.00 |callCount         994
> [truffle] inline success     Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 5c07f072|ASTSize      15 (0/1) |shallowCount       15 |currentCount     1930 |inlinedTotalCount   2170 |score             NaN |frequency         NaN |callCount           0
> [truffle] opt queued       Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 4d27d539 <split>|ASTSize  2170 (0/149) |C/T        1000/ 1001 |L/T        1000/ 1010 |Inval#/Replace#     0/   38
> [truffle] opt start        Method Fibonacci>>#$block method:Examples/Benchmarks//Fibonacci.som:38 at 4d27d539 <split>|ASTSize  2170 (0/149) |C/T        1000/ 1001 |L/T        1000/ 1010 |Inval#/Replace#     0/   38
> 
> -- 
> Stefan Marr
> INRIA Lille - Nord Europe
> http://stefan-marr.de/research/
> 
> 
> 

-- 
Stefan Marr
INRIA Lille - Nord Europe
http://stefan-marr.de/research/