where are our performance bottlenecks?
Christian Thalinger
christian.thalinger at oracle.com
Thu Jul 7 01:02:37 PDT 2011
On Jul 7, 2011, at 8:52 AM, Christian Thalinger wrote:
> On Jul 6, 2011, at 10:49 PM, Tom Rodriguez wrote:
>>
>> On Jul 6, 2011, at 4:18 AM, Christian Thalinger wrote:
>>
>>> On Jul 5, 2011, at 6:39 PM, Charles Oliver Nutter wrote:
>>>> I'm not in position at this exact moment to report perf issues, but
>>>> Rémi's list would be a good start. I'll return to JRuby benchmarks and
>>>> start looking for specific bottlenecks.
>>>
>>> OK.
>>>
>>>>
>>>> As reported in some of my previous emails, JRuby has several uses
>>>> of indy that are off by default, so it will be nice to start getting
>>>> them enabled.
>>>
>>> When I use -Xinvokedynamic.all=true with bench_string_ops.rb I get:
>>>
>>> InvokeDynamicSupport.java:710:in `fixnum_op_mul': java.lang.ClassCastException: org.jruby.RubyString cannot be cast to org.jruby.RubyFixnum
>>>
>>> I just saw a benchmark I haven't seen before: bench_avi_base64.rb
>>>
>>> Performance with indy is not very good:
>>>
>>> intelsdv07:~/mlvm/jruby$ jruby --server -Xcompile.invokedynamic=false bench/bench_avi_base64.rb
>>> 1.569000 0.000000 1.569000 ( 1.539000)
>>> 0.895000 0.000000 0.895000 ( 0.895000)
>>> 0.850000 0.000000 0.850000 ( 0.850000)
>>> 0.848000 0.000000 0.848000 ( 0.848000)
>>> 0.848000 0.000000 0.848000 ( 0.848000)
>>>
>>> intelsdv07:~/mlvm/jruby$ jruby --server bench/bench_avi_base64.rb
>>> 2.335000 0.000000 2.335000 ( 2.305000)
>>> 1.503000 0.000000 1.503000 ( 1.503000)
>>> 1.470000 0.000000 1.470000 ( 1.470000)
>>> 1.479000 0.000000 1.479000 ( 1.479000)
>>> 1.470000 0.000000 1.470000 ( 1.470000)
>>>
>>> The pattern I always see when I look at the inlining tree of a badly performing benchmark is this one:
>>>
>>> @ 9 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::invocationFallback (197 bytes) inline (hot)
>>
>> I would think we don't want this inlined since it's the fallback path. Try -XX:CompileCommand=dontinline,*,invocationFallback. Inlining it may cause us to run up against other limits like the NodeInliningCutoff and DesiredMethodLimit.
>
> Ahh, right. This is inlined because of how we promote the invocation count of the call site into the method handle chain. Sorry, I forgot.
Hmm, now I'm confused. Keeping the method from being inlined helps a bit:
intelsdv07:~/mlvm/jruby$ jruby --server -J-XX:CompileCommand=dontinline,*.invocationFallback bench/bench_avi_base64.rb
CompilerOracle: dontinline *.invocationFallback
1.941000 0.000000 1.941000 ( 1.869000)
1.081000 0.000000 1.081000 ( 1.081000)
1.045000 0.000000 1.045000 ( 1.045000)
1.040000 0.000000 1.040000 ( 1.041000)
1.044000 0.000000 1.044000 ( 1.044000)
But then I tried -X+C:
intelsdv07:~/mlvm/jruby$ jruby -X+C --server -Xcompile.invokedynamic=false bench/bench_avi_base64.rb
1.512000 0.000000 1.512000 ( 1.484000)
0.892000 0.000000 0.892000 ( 0.892000)
0.845000 0.000000 0.845000 ( 0.846000)
0.840000 0.000000 0.840000 ( 0.840000)
0.844000 0.000000 0.844000 ( 0.844000)
intelsdv07:~/mlvm/jruby$ jruby -X+C --server -J-XX:CompileCommand=dontinline,*.invocationFallback bench/bench_avi_base64.rb
CompilerOracle: dontinline *.invocationFallback
1.477000 0.000000 1.477000 ( 1.447000)
0.794000 0.000000 0.794000 ( 0.794000)
0.745000 0.000000 0.745000 ( 0.745000)
0.741000 0.000000 0.741000 ( 0.741000)
0.744000 0.000000 0.744000 ( 0.744000)
intelsdv07:~/mlvm/jruby$ jruby -X+C --server bench/bench_avi_base64.rb
1.642000 0.000000 1.642000 ( 1.614000)
0.808000 0.000000 0.808000 ( 0.808000)
0.767000 0.000000 0.767000 ( 0.767000)
0.763000 0.000000 0.763000 ( 0.763000)
0.769000 0.000000 0.769000 ( 0.770000)
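To put the six runs above side by side, here is a small Ruby sketch that averages the last three iterations of each run (the real-time column in parentheses) and relates each result to the non-indy baseline. The numbers are copied from the output above; the labels, the helper name steady_state, and the choice of the last three iterations as the warmed-up window are assumptions made for illustration only.

```ruby
# Real (wall-clock) times in seconds, copied from the runs in this thread.
# The labels are shorthand for the command lines used above.
RUNS = {
  "no indy"                => [1.539, 0.895, 0.850, 0.848, 0.848],
  "indy"                   => [2.305, 1.503, 1.470, 1.479, 1.470],
  "indy, dontinline"       => [1.869, 1.081, 1.045, 1.041, 1.044],
  "-X+C, no indy"          => [1.484, 0.892, 0.846, 0.840, 0.844],
  "-X+C, indy, dontinline" => [1.447, 0.794, 0.745, 0.741, 0.744],
  "-X+C, indy"             => [1.614, 0.808, 0.767, 0.763, 0.770],
}

# Mean of the last n iterations, treated here as the warmed-up time.
# n = 3 is an arbitrary choice, not something the benchmark prescribes.
def steady_state(times, n = 3)
  tail = times.last(n)
  tail.sum / tail.size
end

baseline = steady_state(RUNS["no indy"])

RUNS.each do |label, times|
  mean = steady_state(times)
  printf("%-24s %.3f s  (%.2fx of no indy)\n", label, mean, mean / baseline)
end
```

Read this way, indy without -X+C is the only configuration that is clearly slower than the baseline, while both indy runs with -X+C come in below it.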
So what is -X+C actually doing? The help states:
-X+C force compilation of all scripts before they are run (except eval)
But I assumed that all hot scripts end up compiled in JRuby anyway. Is that wrong?
-- Christian