Good news, bad news
Charles Oliver Nutter
headius at headius.com
Fri May 20 16:31:39 PDT 2011
I am still using the specialized math callsites because they inline well and an Indy-based approach was still quite a bit slower last time I tried.
I am back to being worried about the inlining thresholds. With no MHs and runtime profiled direct calls, I was able to get this all to inline. So basically the fact that I am using Indy and MHs now penalizes me over not just profiled direct calls, but over a totally generic inline cache that usually doesn't inline at all. That's pretty bad :)
I can try fiddling with the thresholds later, but do we agree that an invokedynamic bound to a GWT plus some intermediate adapter handles ending in a direct handle should inline as well as e.g. an invokevirtual?
- Charlie (mobile)
On May 20, 2011, at 17:35, Rémi Forax <forax at univ-mlv.fr> wrote:
> On 05/20/2011 11:08 PM, Tom Rodriguez wrote:
>> Is it the inlining policy? I played a little with this and the current default policy is still not as good as the big tweaks to MaxInlineSize and InlineSmallCode.
>>
>> tom
>
> I tend to agree,
> things like isFixnumReopened and getCacheToken seem to be not inlined.
>
> Charles, do you have tested if the inline flags are tweaked ?
>
> Also I do not understand why you have JRuby's callsites like MinusCallSite
> in the stacktrace. You should not use it if invokedynamic is enabled.
>
> Rémi
>
>> On May 20, 2011, at 1:21 PM, Charles Oliver Nutter wrote:
>>
>>> Meh, false alarm. This is just the default empty Object[] assigned to
>>> all objects. I still can't explain the perf degradation.
>>>
>>> Here's the full assembly for fib_ruby: https://gist.github.com/983721
>>>
>>> It seems like stuff is inlining, so I have no idea why performance is
>>> so terrible.
>>>
>>> - Charlie
>>>
>>> On Fri, May 20, 2011 at 2:59 PM, Charles Oliver Nutter
>>> <headius at headius.com> wrote:
>>>> Another update...I managed to turn off the ricochet frames and have it
>>>> still work (???). The with-ricochet mode is faster, but still 2-3x
>>>> slower than no indy at all (using JRuby's IC), and 4-5x slower than
>>>> invokedynamic was recently...
>>>>
>>>> HOWEVER...an assembly dump I just did contains some very odd code that
>>>> might indicate a problem in my code. Hold the presses on digging into
>>>> it on your end until I can explain this. (this = apparently standing
>>>> up an instance variable table for Fixnum objects...*definitely* should
>>>> not be happening).
>>>>
>>>> 0x02868f2a: mov [eax+0x1C], ebx ;*invokespecial<init>
>>>> ; -
>>>> org.jruby.RubyBasicObject::<init>@1 (line 218)
>>>> ; -
>>>> org.jruby.RubyObject::<init>@2 (line 116)
>>>> ; -
>>>> org.jruby.RubyNumeric::<init>@2 (line 114)
>>>> ; -
>>>> org.jruby.RubyInteger::<init>@2 (line 95)
>>>> ; -
>>>> org.jruby.RubyFixnum::<init>@5 (line 112)
>>>> ; -
>>>> org.jruby.RubyFixnum::newFixnum at 25 (line 173)
>>>> ; -
>>>> org.jruby.RubyFixnum::op_minus at 34 (line 395)
>>>> ; -
>>>> org.jruby.runtime.callsite.MinusCallSite::call at 24 (line 16)
>>>> 0x02868f2d: mov ebx, [esp+0x14]
>>>> 0x02868f31: mov ecx, [ebx+0xBC]
>>>> 0x02868f37: mov edx, eax
>>>> 0x02868f39: shr edx, 9
>>>> 0x02868f3c: mov [edx+0x394000], 0x00 ;*putfield varTable
>>>> ; -
>>>> org.jruby.RubyBasicObject::<init>@8 (line 84)
>>>> ; -
>>>> org.jruby.RubyObject::<init>@2 (line 116)
>>>> ; -
>>>> org.jruby.RubyNumeric::<init>@2 (line 114)
>>>> ; -
>>>> org.jruby.RubyInteger::<init>@2 (line 95)
>>>> ; -
>>>> org.jruby.RubyFixnum::<init>@5 (line 112)
>>>> ; -
>>>> org.jruby.RubyFixnum::newFixnum at 25 (line 173)
>>>> ; -
>>>> org.jruby.RubyFixnum::op_minus at 34 (line 395)
>>>> ; -
>>>> org.jruby.runtime.callsite.MinusCallSite::call at 24 (line 16)
>>>> 0x02868f43: mov edx, a 'java/lang/Class' = 'org/jruby/RubyBasicObject'
>>>> ; {oop(a 'java/lang/Class' =
>>>> 'org/jruby/RubyBasicObject')}
>>>> 0x02868f48: mov ebx, [edx+0x68] ;*getstatic NULL_OBJECT_ARRAY
>>>> ; -
>>>> org.jruby.RubyBasicObject::<init>@5 (line 84)
>>>> ; -
>>>> org.jruby.RubyObject::<init>@2 (line 116)
>>>> ; -
>>>> org.jruby.RubyNumeric::<init>@2 (line 114)
>>>> ; -
>>>> org.jruby.RubyInteger::<init>@2 (line 95)
>>>> ; -
>>>> org.jruby.RubyFixnum::<init>@5 (line 112)
>>>> ; -
>>>> org.jruby.RubyFixnum::newFixnum at 25 (line 173)
>>>> ; -
>>>> org.jruby.RubyFixnum::op_minus at 34 (line 395)
>>>> ; -
>>>> org.jruby.runtime.callsite.MinusCallSite::call at 24 (line 16)
>>>> 0x02868f4b: mov [eax+0x10], ebx
>>>> 0x02868f4e: lock add [esp], 0x00 ;*putfield varTable
>>>> ; -
>>>> org.jruby.RubyBasicObject::<init>@8 (line 84)
>>>> ; -
>>>> org.jruby.RubyObject::<init>@2 (line 116)
>>>> ; -
>>>> org.jruby.RubyNumeric::<init>@2 (line 114)
>>>> ; -
>>>> org.jruby.RubyInteger::<init>@2 (line 95)
>>>> ; -
>>>> org.jruby.RubyFixnum::<init>@5 (line 112)
>>>> ; -
>>>> org.jruby.RubyFixnum::newFixnum at 25 (line 173)
>>>> ; -
>>>> org.jruby.RubyFixnum::op_minus at 34 (line 395)
>>>> ; -
>>>> org.jruby.runtime.callsite.MinusCallSite::call at 24 (line 16)
>>>>
>>>> On Fri, May 20, 2011 at 8:10 AM, Charles Oliver Nutter
>>>> <headius at headius.com> wrote:
>>>>> On Fri, May 20, 2011 at 1:07 AM, Ola Bini<ola.bini at gmail.com> wrote:
>>>>>>> Try -XX:+UnlockDiag* -XX:-UseRicochetFrames for differential testing.
>>>>>>>
>>>>>>> If you can find a microbenchmark which shows differential slowdown, we can debug it.
>>>>>> When trying the above all my tests fail with this:
>>>>>> java.lang.RuntimeException: java.lang.InternalError: NYI
>>>>> Same here.
>>>>>
>>>>> Reproducing it is pretty easy...
>>>>>
>>>>> * Clone JRuby: git clone https://github.com/jruby/jruby.git
>>>>> * Build JRuby: cd jruby ; ant
>>>>> * Run benchmarks: bin/jruby bench/bench_fib_recursive.rb 10 35
>>>>>
>>>>> There are many benchmarks under the bench dir, almost all of which
>>>>> show this degradation. You can disable invokedynamic use in JRuby by
>>>>> passing -Xcompile.invokedynamic=false, as seen here:
>>>>>
>>>>>
>>>>> ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false
>>>>> bench/bench_fib_recursive.rb 5 35
>>>>> WARNING: Both jruby-complete.jar and jruby.jar are present in the
>>>>> 'lib' directory. Will use jruby.jar
>>>>> 9227465
>>>>> 1.698000 0.000000 1.698000 ( 1.645000)
>>>>> 9227465
>>>>> 1.352000 0.000000 1.352000 ( 1.352000)
>>>>> 9227465
>>>>> 1.364000 0.000000 1.364000 ( 1.364000)
>>>>> 9227465
>>>>> 1.344000 0.000000 1.344000 ( 1.344000)
>>>>> 9227465
>>>>> 1.336000 0.000000 1.336000 ( 1.336000)
>>>>>
>>>>> ~/projects/jruby ➔ jruby -Xcompile.invokedynamic=true
>>>>> bench/bench_fib_recursive.rb 5 35
>>>>> WARNING: Both jruby-complete.jar and jruby.jar are present in the
>>>>> 'lib' directory. Will use jruby.jar
>>>>> 9227465
>>>>> 3.182000 0.000000 3.182000 ( 3.120000)
>>>>> 9227465
>>>>> 3.019000 0.000000 3.019000 ( 3.019000)
>>>>> 9227465
>>>>> 3.195000 0.000000 3.195000 ( 3.195000)
>>>>> 9227465
>>>>> 3.163000 0.000000 3.163000 ( 3.163000)
>>>>> 9227465
>>>>> 3.656000 0.000000 3.656000 ( 3.656000)
>>>>>
>>>>> I'm sure something's just broken...this benchmark was easily almost 2x
>>>>> faster than the non-invokedynamic version just a few days ago, and
>>>>> it's now 2-3x slower than non-indy logic.
>>>>>
>>>>> I'm going to be online all day, standing by to help in any way I can.
>>>>> I did do a quick LogCompilation last night and the main thing I
>>>>> noticed was that while it seemed like handles are inlining, the
>>>>> eventual DMH does not inline. That would explain some of the perf
>>>>> degradation, I think...but hopefully not all of it.
>>>>>
>>>>> Whatever you need me to run or whatever information...let me know.
>>>>>
>>>>> ALSO...I got the same error as Ola when running JRuby with a
>>>>> standalone jar file:
>>>>>
>>>>> ~/projects/jruby ➔ java -jar lib/jruby-complete.jar bench/bench_fib_recursive.rb
>>>>> MethodHandle.java:-1:in `invokeExact': java.lang.NoClassDefFoundError:
>>>>> org/jruby/runtime/ThreadContext
>>>>> from MethodHandle.java:-1:in `invokeExact'
>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby'
>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby'
>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby'
>>>>> from bench/bench_fib_recursive.rb:7:in `method__0$RUBY$fib_ruby'
>>>>>
>>>>> As in Ola's case, ThreadContext is a core JRuby class. I do *not* get
>>>>> this error when JRuby is loaded from the bootstrap classloader, so I
>>>>> imagine there's some mix-up with generated bytecode and the
>>>>> classloaders that that bytecode gets loaded into.
>>>>>
>>>>> - Charlie
>>>>>
>>> _______________________________________________
>>> mlvm-dev mailing list
>>> mlvm-dev at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>> _______________________________________________
>> mlvm-dev mailing list
>> mlvm-dev at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
>
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
More information about the mlvm-dev
mailing list