Good news, bad news
Charles Oliver Nutter
headius at headius.com
Mon May 23 15:06:07 PDT 2011
FWIW, perf with indy versus monomorphic inline caching on that
bench_method_dispatch_only benchmark:
~/projects/jruby ➔ jruby --server -X+C
bench/language/bench_method_dispatch_only.rb
Test ruby method: 1000k loops calling self's foo 10 times
1.129000 0.000000 1.129000 ( 0.662000)
0.409000 0.000000 0.409000 ( 0.409000)
0.455000 0.000000 0.455000 ( 0.455000)
0.428000 0.000000 0.428000 ( 0.428000)
0.474000 0.000000 0.474000 ( 0.474000)
0.470000 0.000000 0.470000 ( 0.470000)
0.458000 0.000000 0.458000 ( 0.458000)
0.495000 0.000000 0.495000 ( 0.495000)
0.460000 0.000000 0.460000 ( 0.460000)
0.508000 0.000000 0.508000 ( 0.508000)
~/projects/jruby ➔ jruby --server -Xcompile.invokedynamic=false -X+C
bench/language/bench_method_dispatch_only.rb
Test ruby method: 1000k loops calling self's foo 10 times
0.377000 0.000000 0.377000 ( 0.315000)
0.211000 0.000000 0.211000 ( 0.207000)
0.132000 0.000000 0.132000 ( 0.132000)
0.128000 0.000000 0.128000 ( 0.128000)
0.135000 0.000000 0.135000 ( 0.135000)
0.140000 0.000000 0.140000 ( 0.140000)
0.122000 0.000000 0.122000 ( 0.122000)
0.122000 0.000000 0.122000 ( 0.122000)
0.122000 0.000000 0.122000 ( 0.122000)
0.122000 0.000000 0.122000 ( 0.122000)
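
To put a rough number on the gap, here is a small sketch that averages the real-time column of the two runs above. Skipping the first two iterations of each run as warm-up is an assumption, and the class and method names are only illustrative.

```java
import java.util.Arrays;

public class SteadyStateRatio {
    // Real-time column (seconds) from the invokedynamic run above.
    static final double[] INDY = {
        0.662, 0.409, 0.455, 0.428, 0.474, 0.470, 0.458, 0.495, 0.460, 0.508
    };

    // Real-time column (seconds) from the -Xcompile.invokedynamic=false run above.
    static final double[] MIC = {
        0.315, 0.207, 0.132, 0.128, 0.135, 0.140, 0.122, 0.122, 0.122, 0.122
    };

    // Mean of the samples after skipping the first `warmup` entries.
    static double steadyMean(double[] samples, int warmup) {
        return Arrays.stream(samples).skip(warmup).average().orElse(Double.NaN);
    }

    // How many times slower the indy run is than the MIC run at steady state.
    static double ratio(int warmup) {
        return steadyMean(INDY, warmup) / steadyMean(MIC, warmup);
    }

    public static void main(String[] args) {
        // Assumption: the first two iterations are warm-up and are ignored.
        System.out.printf("indy/MIC steady-state ratio: %.2f%n", ratio(2));
    }
}
```

With those assumptions the indy run comes out somewhere between three and four times slower than the MIC run.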
Previously, the invokedynamic version clocked in *much* faster than the
MIC version...like an order of magnitude faster.
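
For readers who have not seen the shape of the call path being measured, here is a minimal sketch of the kind of handle chain described further down in the thread (a DMH, an argument permute, and a GWT), built with the standard java.lang.invoke API. It is not JRuby's actual linking code: the Obj class, its classToken field, and the test, foo, fallback, and dispatch methods are hypothetical stand-ins.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class GuardedCallSiteSketch {
    // Hypothetical stand-in for a dynamic-language object carrying a class token.
    static final class Obj {
        final int classToken;
        Obj(int classToken) { this.classToken = classToken; }
    }

    // Guard: does the receiver still carry the token the site was linked against?
    static boolean test(int expectedToken, Obj self) {
        return self.classToken == expectedToken;
    }

    // Target method; it takes the receiver second, which is why a permute is needed.
    static String foo(String name, Obj self) {
        return "fast:" + name;
    }

    // Fallback taken when the guard fails (a real runtime would re-link here).
    static String fallback(Obj self, String name) {
        return "slow:" + name;
    }

    static MethodHandle buildChain(int cachedToken) throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType siteType = MethodType.methodType(String.class, Obj.class, String.class);

        // 1. Direct method handle (DMH) pointing at the target.
        MethodHandle dmh = lookup.findStatic(GuardedCallSiteSketch.class, "foo",
                MethodType.methodType(String.class, String.class, Obj.class));

        // 2. Permute the call-site order (self, name) into the target order (name, self).
        MethodHandle target = MethodHandles.permuteArguments(dmh, siteType, 1, 0);

        // 3. guardWithTest (GWT): run the test, then pick the target or the fallback.
        MethodHandle test = lookup.findStatic(GuardedCallSiteSketch.class, "test",
                MethodType.methodType(boolean.class, int.class, Obj.class));
        MethodHandle boundTest = MethodHandles.insertArguments(test, 0, cachedToken);
        MethodHandle fb = lookup.findStatic(GuardedCallSiteSketch.class, "fallback", siteType);
        return MethodHandles.guardWithTest(boundTest, target, fb);
    }

    // Convenience wrapper: link against cachedToken, then call with a given receiver.
    static String dispatch(int cachedToken, int receiverToken, String name) {
        try {
            MethodHandle site = buildChain(cachedToken);
            return (String) site.invokeExact(new Obj(receiverToken), name);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(dispatch(42, 42, "x")); // guard passes, target runs
        System.out.println(dispatch(42, 7, "x"));  // guard fails, fallback runs
    }
}
```

The point of the comparison is that, once the JIT inlines through the guard and the permute, a chain like this should cost about the same as a hand-written monomorphic cache check followed by a direct call.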
- Charlie
On Mon, May 23, 2011 at 4:56 PM, Charles Oliver Nutter
<headius at headius.com> wrote:
> Another example, running bench/language/bench_method_dispatch_only,
> which runs a 1m iteration loop that invokes an empty "foo" method ten
> times:
>
> https://gist.github.com/9008f94fc677f3fe98e7
>
> Note again that it seems like only the test logic and maybe some of
> the logic wrapping the foo call inline...the foo calls themselves do
> not appear in the LogCompilation inlining graph at all.
>
> - Charlie
>
> On Mon, May 23, 2011 at 4:50 PM, Charles Oliver Nutter
> <headius at headius.com> wrote:
>> Also, fwiw...after these two chunks in LogCompilation output, I see
>> nothing else inlined into fib_ruby, including a monomorphic call path
>> through PlusCallSite ending at RubyFixnum#op_plus (the integer +
>> operation). That would also affect performance.
>>
>> I also do not see any indication *why* nothing inlines past this
>> point. Usually it would say "too big" or something.
>>
>> I do see MinusCallSite inline earlier.
>>
>> - Charlie
>>
>> On Mon, May 23, 2011 at 4:47 PM, Charles Oliver Nutter
>> <headius at headius.com> wrote:
>>> The following chunk should be the invokedynamic call to fib, via a
>>> GWT, an arg permuter, and perhaps one convert:
>>>
>>> @ 77 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>> @ 77 java.lang.invoke.MethodHandle::invokeExact (44 bytes)
>>> @ 8 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>> @ 8 java.lang.invoke.MethodHandle::invokeExact (7 bytes)
>>> @ 3 org.jruby.runtime.invokedynamic.InvokeDynamicSupport::test (20 bytes)
>>> @ 5 org.jruby.RubyBasicObject::getMetaClass (5 bytes)
>>> @ 8 org.jruby.RubyModule::getCacheToken (5 bytes)
>>> @ 23 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>> @ 23 java.lang.invoke.MethodHandle::invokeExact (67 bytes)
>>> @ 1 java.lang.Boolean::valueOf (14 bytes)
>>> @ 10 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>> @ 10 java.lang.invoke.MethodHandle::invokeExact (24 bytes)
>>> @ 11 java.lang.Boolean::booleanValue (5 bytes)
>>> @ 20 java.lang.invoke.MethodHandleImpl::selectAlternative (10 bytes)
>>> @ 63 java.lang.invoke.MethodHandle::invokeExact (0 bytes)
>>> @ 37 sun.invoke.util.ValueConversions::identity (2 bytes)
>>>
>>> This seems to only be the test logic; the actual fib invocation
>>> doesn't appear to show up in the inlining graph at all. Am I right?
>>>
>>> I see two of these in the LogCompilation output and nothing else
>>> around them. I'd expect to see them do the invocation of fib_ruby
>>> somewhere in there. It's like the "success" branch of GWT is not even
>>> being considered for inlining.
>>>
>>> - Charlie
>>>
>>> On Mon, May 23, 2011 at 4:41 PM, Tom Rodriguez <tom.rodriguez at oracle.com> wrote:
>>>> If there were to be a recursive inline in there, where would it occur? I can't tell from the names where in that inline tree the recursive call occurs.
>>>>
>>>> tom
>>>>
>>>> On May 23, 2011, at 2:26 PM, Charles Oliver Nutter wrote:
>>>>
>>>>> fib_ruby LogCompilation inlining graph, showing that fib_ruby is not
>>>>> inlined: https://gist.github.com/f2b665ad3c97ba622ebf
>>>>>
>>>>> Can anyone suggest other flags I can try to adjust to get things to
>>>>> inline better?
>>>>>
>>>>> FWIW, the handle chain in question that's not inlining is pretty simple:
>>>>>
>>>>> * DMH pointing back at fib_ruby
>>>>> * permute args
>>>>> * GWT
>>>>>
>>>>> - Charlie
>>>>>
>>>>> On Mon, May 23, 2011 at 4:19 PM, Charles Oliver Nutter
>>>>> <headius at headius.com> wrote:
>>>>>> I'm working up a set of files that show JRuby compilation output, but
>>>>>> I noticed a couple things that might be interesting right now.
>>>>>>
>>>>>> First off, fairly early in the assembly output for fib, I see this:
>>>>>>
>>>>>> 0x02876d1f: call 0x0282d0e0 ; OopMap{[96]=Oop [100]=Oop [28]=Oop [40]=Oop [48]=Oop off=644}
>>>>>> ;*invokespecial invokeExact
>>>>>> ; - java.lang.invoke.MethodHandle::invokeExact@63
>>>>>> ; - java.lang.invoke.MethodHandle::invokeExact@23
>>>>>> ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@51 (line 7)
>>>>>> ; {optimized virtual_call}
>>>>>>
>>>>>> For fib, the only invokedynamic is the recursive call to fib, so that
>>>>>> would indicate that fib_ruby is not inlining into itself at all here.
>>>>>> And I can't see it inlining into itself anywhere in the assembly
>>>>>> output.
>>>>>>
>>>>>> Later in the same output:
>>>>>>
>>>>>> 0x0287703f: call 0x0282dba0 ; OopMap{ebp=Oop off=1444}
>>>>>> ;*checkcast
>>>>>> ; - java.lang.invoke.MethodHandle::invokeExact@40
>>>>>> ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7)
>>>>>> ; {runtime_call}
>>>>>> 0x02877044: call 0x0105a9d0 ;*checkcast
>>>>>> ; - java.lang.invoke.MethodHandle::invokeExact@40
>>>>>> ; - bench.bench_fib_recursive::method__0$RUBY$fib_ruby@82 (line 7)
>>>>>> ; {runtime_call}
>>>>>>
>>>>>> These appear repeatedly near the invokedynamic invocation above. If
>>>>>> I'm reading this right, neither the recursive call nor logic involved
>>>>>> in that particular handle is inlining. Am I right?
>>>>>>
>>>>>> Here's the complete assembly dump (i386) for the fib_ruby method:
>>>>>> https://gist.github.com/987640
>>>>>>
>>>>>> In other news, MaxInlineSize=150 with InlineSmallCode=3000 does not
>>>>>> appear to improve performance. I also tried bumping up
>>>>>> MaxRecursiveInlineLevel and MaxInlineLevel with no effect.
>>>>>>
>>>>>> - Charlie
>>>>>>
>>>>> _______________________________________________
>>>>> mlvm-dev mailing list
>>>>> mlvm-dev at openjdk.java.net
>>>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>>>>