[Truffle] Eliminating Calls to Side-Effect Free Methods?
Chris Seaton
chris at chrisseaton.com
Wed Mar 12 15:12:45 UTC 2014
The problem with SequenceNode might be in the way that you hang onto the
final result. That mutable local (last) might be upsetting the partial
evaluator (PE).
    @Override
    @ExplodeLoop
    public Object executeGeneric(final VirtualFrame frame) {
        Object last = expressions[0].executeGeneric(frame);
        for (int i = 1; i < expressions.length; i++) {
            last = expressions[i].executeGeneric(frame);
        }
        return last;
    }
Also, you're calling executeGeneric for all children, when you could
call executeVoid for all but the last. In executeVoid you only need to
perform side effects and can avoid producing a value.
This is how Ruby does it, and there it does compile away to nothing.
    @ExplodeLoop
    @Override
    public Object execute(VirtualFrame frame) {
        for (int n = 0; n < body.length - 1; n++) {
            body[n].executeVoid(frame);
        }
        return body[body.length - 1].execute(frame);
    }

    @ExplodeLoop
    @Override
    public void executeVoid(VirtualFrame frame) {
        for (int n = 0; n < body.length; n++) {
            body[n].executeVoid(frame);
        }
    }
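
To make the execute/executeVoid split concrete without pulling in Truffle,
here is a rough plain-Java sketch of the same pattern. The names SketchNode,
LiteralNode and SequenceSketch are made up for this illustration and are not
part of Truffle or the Ruby implementation, and there is no frame or
@ExplodeLoop here, so it only shows the shape of the idea, not what the
partial evaluator does with it.

```java
// Hypothetical stand-in for a Truffle node; no Truffle classes are used.
abstract class SketchNode {
    // Evaluate the node and produce its value.
    abstract Object execute();

    // Evaluate only for side effects. By default, run execute() and drop the value.
    void executeVoid() {
        execute();
    }
}

// A literal has no side effects, so its executeVoid can be empty.
final class LiteralNode extends SketchNode {
    private final Object value;
    int valuesProduced = 0; // counts how often a value was actually produced

    LiteralNode(Object value) {
        this.value = value;
    }

    @Override
    Object execute() {
        valuesProduced++;
        return value;
    }

    @Override
    void executeVoid() {
        // Nothing to do: no side effect, and nobody wants the value.
    }
}

// Sequence: only the last child is asked for a value.
final class SequenceSketch extends SketchNode {
    private final SketchNode[] body;

    SequenceSketch(SketchNode... body) {
        this.body = body;
    }

    @Override
    Object execute() {
        for (int n = 0; n < body.length - 1; n++) {
            body[n].executeVoid();
        }
        return body[body.length - 1].execute();
    }

    @Override
    void executeVoid() {
        for (SketchNode node : body) {
            node.executeVoid();
        }
    }
}
```

With new SequenceSketch(a, b) for two literals a and b, calling execute()
asks only b for a value, and calling executeVoid() asks neither, which is
the property you want the compiler to be able to exploit.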
Chris
On 12 March 2014 13:14, Stefan Marr <java at stefan-marr.de> wrote:
> Hi:
>
> With the latest changes, TruffleSOM seems to get closer to ideal
> performance. However, there still seem to be some general issues where
> things do not yet work as I would hope or expect.
>
> I have a couple of microbenchmarks that I would expect to be reduced to
> nothing in an ideal situation.
>
> For instance, the Dispatch benchmark. Essentially, it is two nested loops
> and a monomorphic message send to a method that returns its argument.
>
> It looks more or less like this:
>
> benchmark = ( 1 to: 20000 do: [ :i | self method: i ] )
> method: argument = ( ^argument )
>
> outerBenchmarkLoop = (
> 1 to: innerIterations do: [ self benchmark ]
> )
>
> Now, the first tricky part is that a self send inside a loop implies
> access to the outer lexical scope, and thereby the use of a materialized
> frame. I experimented a little with it, and it turns out that in the outer
> loop the use of the materialized frame seems to be properly eliminated; in
> the inner loop, however, it is not. As shown above, both loops are in
> separate methods, but as far as I can see in IGV they are inlined.
>
> That's one of the issues. The other one is the cost of introducing a
> sequence node.
> When I increase the number of statements in the inner loop from 1 to 2, my
> compiler introduces an additional SequenceNode to hold them. I would expect
> that to be compiled away and to add no overhead beyond its payload.
> However, going from 1 to 2 increases the runtime by more than a factor of 2.
> Going to 3 or 4 statements then shows a more linear increase.
>
> But again, the main point is that these methods do nothing but
> produce heat. So, I would really like to see them eliminated.
>
> I don't yet have enough experience to identify the culprit in IGV.
> There are too many things I cannot really correlate with the input
> program.
>
> Are there perhaps known patterns I could look for to point out further
> optimization/specialization potential?
>
> Thanks a lot
> Stefan
>
> PS: to see the issue, you can do the following:
>
> git clone --recursive git@github.com:SOM-st/TruffleSOM.git
> ant jar
> mx --vm server vm -G:Dump=
> -Xbootclasspath/a:build/classes:libs/truffle.jar som.vm.Universe -cp
> Smalltalk:Examples/Benchmarks/DeltaBlue/
> Examples/Benchmarks/BenchmarkHarness.som Dispatch 1000 0 2000
>
> And there, in the output, the method is actually called
> #innerBenchmarkLoop as part of the benchmark harness.
> Should be the second to last thing that’s compiled after the benchmark
> completed.
>
> --
> Stefan Marr
> INRIA Lille - Nord Europe
> http://stefan-marr.de/research/
>
>
>
>
More information about the graal-dev mailing list