Truffle performance problems
Gilles Duboscq
duboscq at ssw.jku.at
Tue Jan 14 03:20:27 PST 2014
Hi,
The code that you can find in the pull request from Bernhard should
provide a reasonable basis to start experimenting with different
execution options.
The current code is very much geared toward executing your example
query, as it was derived from that query with a top-down approach.
I think the current results demonstrate that it is definitely
possible to get not only decent but even improved performance
using Truffle:
The Truffle version shows a ~1.4x speedup compared to the Java
version running on the Graal compiler and a ~2x speedup compared
to the Java version running on the C2 compiler.
I also believe that more complex queries and a more complex data
storage could take even more advantage of Truffle's specialization
capabilities.
I hope this will help you in your investigations and we'd be glad to
help if you have further questions.
-Gilles
On Mon, Jan 13, 2014 at 11:42 AM, Bernhard Urban <Bernhard.Urban at jku.at> wrote:
> Hi Dain,
>
> Gilles and I worked a bit on your Truffle example, see the result here:
> https://github.com/dain/presto-truffle/pull/1
>
> Make sure you use a recent version of Graal to get the best performance
> in your example.
>
>
> -Bernhard
>
>
>
> On 12/12/2013 11:16 PM, Dain Sundstrom wrote:
>>
>> Hi all,
>>
>> I have been experimenting with Truffle in Presto for a day now and am
>> confused by the performance I am seeing.
>>
>> My high-level goal for this experiment is to figure out how I should
>> structure data flow in my Truffle language. Since I am writing both the
>> language and its only user, I have a lot of options available to me.
>> Specifically, I'd like to figure out whether I should take a vectorized
>> approach, a row-at-a-time approach, or some combination of both.
>> Whichever solution is fastest, I'll make work in the code base.
>>
>> To this end, I decided to take a top-down approach to Truffle (mainly
>> because I am confident the bottom expression bits will be fast). I started
>> with a very simple query hand-coded in Java:
>>
>> double sum = 0;
>> for (row in source) {
>> if (row passes the filter) {
>> sum += row.extendedprice * row.discount
>> }
>> }
>> return sum;
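
For readers comparing the two data-flow options mentioned above, here is a minimal plain-Java sketch (no Truffle involved) of the same sum written row at a time and in a vectorized, batch-oriented form. It is only an illustration under assumptions: the columnar arrays, the class and method names, and in particular the filter (quantity < 24) are hypothetical, since the original message does not say what the real filter is.

```java
public class SumQuerySketch {
    // Hypothetical filter; the real filter in the example query is not
    // given in this thread.
    static boolean passesFilter(double[] quantity, int row) {
        return quantity[row] < 24.0;
    }

    // Row-at-a-time: filter and aggregate each row in a single pass.
    static double sumRowAtATime(double[] extendedPrice, double[] discount,
                                double[] quantity) {
        double sum = 0;
        for (int row = 0; row < extendedPrice.length; row++) {
            if (passesFilter(quantity, row)) {
                sum += extendedPrice[row] * discount[row];
            }
        }
        return sum;
    }

    // Vectorized: for each batch, first collect the positions of the rows
    // that pass the filter, then aggregate over those positions only.
    // batchSize must be positive.
    static double sumVectorized(double[] extendedPrice, double[] discount,
                                double[] quantity, int batchSize) {
        int[] selected = new int[batchSize];
        double sum = 0;
        int n = extendedPrice.length;
        for (int start = 0; start < n; start += batchSize) {
            int end = Math.min(start + batchSize, n);
            int count = 0;
            for (int row = start; row < end; row++) {
                if (passesFilter(quantity, row)) {
                    selected[count++] = row;
                }
            }
            for (int i = 0; i < count; i++) {
                int row = selected[i];
                sum += extendedPrice[row] * discount[row];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] price = {100.0, 200.0, 300.0, 400.0};
        double[] discount = {0.5, 0.25, 0.5, 0.25};
        double[] quantity = {10.0, 30.0, 20.0, 5.0};
        System.out.println(sumRowAtATime(price, discount, quantity));
        System.out.println(sumVectorized(price, discount, quantity, 2));
    }
}
```

Both methods visit the selected rows in the same order, so they return the same sum; which form is faster under a given compiler is exactly the open question in this thread and is not something this sketch settles.
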
>>
>> When I run that using 5M rows of input (all in memory), it takes
>> ~165ms using the Graal VM (1.7.0_45) with the "-server" option on my
>> laptop.
>>
>> With the performance baseline established, my plan was to start with a
>> single node and then start breaking it apart into more nodes without making
>> anything slower. So, I wrapped this same code in a single Truffle RootNode.
>> When I execute the same code through the Truffle call, I get the same
>> performance until the node is compiled. Once the node is compiled,
>> the run time rises to ~260ms.
>>
>> Now, I understand using a single node is not the point of Truffle, but I
>> would not expect such a massive performance drop-off. At this point, I'm not
>> sure if this is a worthwhile exercise at all.
>>
>> You can find all of the code and instructions on running it here:
>>
>> https://github.com/dain/presto-truffle/tree/master
>>
>> Any ideas or suggestions?
>>
>> Thanks,
>>
>> -dain
>>
>>
>>
>> On a related note, if you leave the Truffle test running it eventually
>> crashes with (https://gist.github.com/dain/c3a29eb81642c86f5072):
>>
>> Found illegal recursive call to
>> HotSpotMethod<Utility.recursiveAppendNumber(StringBuffer, int, int, int)>,
>> must annotate such calls with @CompilerDirectives.SlowPath!
>>
>> I've also found "java.util.concurrent.ExecutionException:
>> java.lang.IllegalStateException: Inlined graph is in invalid state" when
>> executing a CallTarget in tight inner loops.
>>
>