More performance explorations

Tom Rodriguez tom.rodriguez at oracle.com
Fri Jun 3 16:15:27 PDT 2011


On Jun 2, 2011, at 7:37 PM, John Rose wrote:

> Thanks; I'll look at your dump later tonight.
> 
> If the problem is friction from interface casts, we can probably remove them.  It's hard to figure out how they are getting in, though.  It happens when IRubyObject interconverts with Object.

So I put in a little hack to fold repeated interface checkcasts, and that gets back a lot of the performance.  With fib on my machine, dynopt=true reports 1.005000, invokedynamic=true reports 1.293000, and turning on my checkcast hack gets it down to 1.112000.  Unfortunately, what I've got right now isn't really suitable for inclusion in JDK7.

John, I noticed that it looks like MethodHandleWalk is injecting them for return values, though it's somewhat inconsistent.  For instance, I see this:

        // FIXME: consider inlining the invokee at the bytecode level
        ArgToken ret = make_invoke(methodOop(invoker), vmIntrinsics::_none,
                                   Bytecodes::_invokevirtual, false, 1+argc, &arglist[0], CHECK_(empty));
        DEBUG_ONLY(invoker = NULL);
        if (rtype == T_OBJECT) {
          klassOop rklass = java_lang_Class::as_klassOop( java_lang_invoke_MethodType::rtype(recursive_mtype()) );
          if (rklass != SystemDictionary::Object_klass() &&
              !Klass::cast(rklass)->is_interface()) {
            // preserve type safety
            ret = make_conversion(T_OBJECT, rklass, Bytecodes::_checkcast, ret, CHECK_(empty));
          }
        }

but down in make_invoke itself we do this:

    switch (_rtype) {
    case T_BOOLEAN: case T_BYTE: case T_CHAR: case T_SHORT:
    case T_INT:    emit_bc(Bytecodes::_ireturn); break;
    case T_LONG:   emit_bc(Bytecodes::_lreturn); break;
    case T_FLOAT:  emit_bc(Bytecodes::_freturn); break;
    case T_DOUBLE: emit_bc(Bytecodes::_dreturn); break;
    case T_VOID:   emit_bc(Bytecodes::_return);  break;
    case T_OBJECT:
      if (_rklass.not_null() && _rklass() != SystemDictionary::Object_klass())
        emit_bc(Bytecodes::_checkcast, cpool_klass_put(_rklass()));
      emit_bc(Bytecodes::_areturn);

This results in adapter bytecodes that look like this:

0 aload_1
1 aload #4
3 aload #5
5 aload_2
6 aload #6
8 invokevirtual 7 <org/jruby/internal/runtime/methods/DynamicMethod.call(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/RubyModule;Ljava/lang/String;)Lorg/jruby/runtime/builtin/IRubyObject;>
  0   bci: 8    VirtualCallData     count(10000) entries(0)
11 checkcast 8 <org/jruby/runtime/builtin/IRubyObject>
  24  bci: 11   ReceiverTypeData    count(10000) entries(0)
14 areturn

which seems fairly pointless.  These don't seem to be the source of the checkcasts in JRuby, though.  Those seem to be explicitly part of the method handle chain.  For this chain:

0xeff0d808: adapter: arg_slot 0 conversion op check_cast (LLLLL)L
0xeff0d7a8: adapter: arg_slot 1 conversion op check_cast (LLLLL)L
0xeff0d748: adapter: arg_slot 2 conversion op check_cast (LLLLL)L
0xeff0d6e8: adapter: arg_slot 3 conversion op check_cast (LLLLL)L
0xeff0d688: adapter: arg_slot 4 conversion op check_cast (LLLLL)L
0xeff0d2b8: adapter: arg_slot 1 conversion op drop_args pushes -1 (LLLLL)L
0xeff0d1a8: adapter: arg_slot 2 conversion op drop_args pushes -1 (LLLL)L
0xeff0acd8: bound: arg_type object arg_slot 0 instance org.jruby.runtime.Block (LLL)L
0xeff0ac68: bound: arg_type object arg_slot 4 instance bench.bench_fib_recursive (LLLL)L

we produce these bytecodes:

0 aload #5
2 checkcast 3 <org/jruby/runtime/builtin/IRubyObject>
  0   bci: 2    ReceiverTypeData    count(31244) entries(0)
5 astore #5
7 aload #4
9 checkcast 4 <java/lang/String>
  24  bci: 9    ReceiverTypeData    count(31244) entries(0)
12 astore #4
14 aload_3
15 checkcast 5 <org/jruby/runtime/builtin/IRubyObject>
  48  bci: 15   ReceiverTypeData    count(31244) entries(0)
18 astore_3
19 aload_2
20 checkcast 6 <org/jruby/runtime/builtin/IRubyObject>
  72  bci: 20   ReceiverTypeData    count(31244) entries(0)
23 astore_2
24 aload_1
25 checkcast 7 <org/jruby/runtime/ThreadContext>
  96  bci: 25   ReceiverTypeData    count(31244) entries(0)
28 astore_1
29 ldc <Object> 0xefe59f88
31 aload_1
32 aload_3
33 aload #5
35 ldc <Object> 0xefabd418
37 invokestatic 14 <bench/bench_fib_recursive.method__0$RUBY$fib_ruby(Lbench/bench_fib_recursive;Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/builtin/IRubyObject;Lorg/jruby/runtime/Block;)Lorg/jruby/runtime/builtin/IRubyObject;>
  120 bci: 37   CounterData         count(31244)
40 areturn
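
For readers who want to reproduce a chain of this shape, the following is a minimal sketch at the java.lang.invoke API level. It is not JRuby's actual binding code, which this thread does not show. The class ChainSketch and the types Value, Num, Context, Script and Block are hypothetical stand-ins for IRubyObject, ThreadContext, bench_fib_recursive and org.jruby.runtime.Block. It assumes the two bound handles come from insertArguments, the two drop_args from dropArguments, and the five check_cast adapters from a single asType to an all-Object type.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class ChainSketch {
    // Hypothetical stand-ins for the JRuby types named in the dump.
    interface Value {}                                   // plays IRubyObject
    static final class Num implements Value {
        final int n;
        Num(int n) { this.n = n; }
    }
    static final class Context {}                        // plays ThreadContext
    static final class Script {}                         // plays bench_fib_recursive
    static final class Block {}                          // plays org.jruby.runtime.Block

    // Plays the compiled method body; it just increments its argument.
    static Value body(Script script, Context ctx, Value self, Value arg, Block block) {
        return new Num(((Num) arg).n + 1);
    }

    static MethodHandle buildChain() throws ReflectiveOperationException {
        MethodHandle h = MethodHandles.lookup().findStatic(
            ChainSketch.class, "body",
            MethodType.methodType(Value.class, Script.class, Context.class,
                                  Value.class, Value.class, Block.class));
        // "bound": fix the leading script argument -> (Context,Value,Value,Block)Value
        h = MethodHandles.insertArguments(h, 0, new Script());
        // "bound": fix the trailing block argument -> (Context,Value,Value)Value
        h = MethodHandles.insertArguments(h, 3, new Block());
        // "drop_args": accept and ignore a String -> (Context,Value,String,Value)Value
        h = MethodHandles.dropArguments(h, 2, String.class);
        // "drop_args": accept and ignore a Value -> (Context,Value,Value,String,Value)Value
        h = MethodHandles.dropArguments(h, 1, Value.class);
        // asType to (Object x5)Object: every argument now needs a reference cast,
        // including the casts to the interface type Value.
        return h.asType(MethodType.genericMethodType(5));
    }

    static int run() throws Throwable {
        Object result = buildChain().invoke(
            (Object) new Context(), (Object) new Num(0), (Object) new Num(0),
            (Object) "fib", (Object) new Num(41));
        return ((Num) result).n;
    }
}
```

Calling ChainSketch.run() pushes five Object-typed arguments through the chain, so each of the five argument conversions is a cast, which matches the five check_cast adapters listed above.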

Just blindly skipping checkcast method handles for interface types brings the time on fib down to 1.071000.
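
At the API level, the asType versus explicitCastArguments distinction that John describes in the reply quoted below can be seen in a small sketch. It assumes the conversion rules documented for the released java.lang.invoke API, where asType applies a real cast when converting Object to an interface and explicitCastArguments passes the value through without one. The class InterfaceCastSketch is hypothetical, and Runnable stands in for an interface such as IRubyObject.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class InterfaceCastSketch {
    // (Object)Runnable: the identity handle re-typed to return an interface.
    static final MethodType TO_RUNNABLE =
        MethodType.methodType(Runnable.class, Object.class);

    // asType: the Object -> Runnable return conversion is a real cast,
    // so a String flowing through it triggers a ClassCastException.
    static boolean asTypeThrows() throws Throwable {
        MethodHandle checked =
            MethodHandles.identity(Object.class).asType(TO_RUNNABLE);
        try {
            Object ignored = checked.invoke((Object) "not a Runnable");
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    // explicitCastArguments: a conversion to an interface type is passed
    // through without a cast, so the String comes back unchanged.
    static Object explicitCastPasses() throws Throwable {
        MethodHandle unchecked = MethodHandles.explicitCastArguments(
            MethodHandles.identity(Object.class), TO_RUNNABLE);
        return unchecked.invoke((Object) "not a Runnable");
    }
}
```

The unchecked variant trades the cast for weaker typing: a value that does not implement the interface is only detected later, if at all, when an interface method is actually invoked on it.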

tom

> Are you doing it, or is it coming from inside the java.lang.invoke classes?  That's the first question.  It probably comes from an asType, but some asType calls are implicit within the 292 API.  What asType calls (was convertArguments) are in your code?
> 
> Key fact:  asType/convertArguments used to allow interfaces a free unchecked pass into and out of Object.  Now only explicitCastArguments does this.  If you convert between an interface and Object, you'll get a real checkcast (and a potential CCE) from asType.
> 
> Try changing convertArguments globally to explicitCastArguments and see what happens.
> 
> On Jun 2, 2011, at 5:15 PM, Charles Oliver Nutter wrote:
> 
>>> The door is about closed on JDK7 code changes (and the backed-off GWT stuff is in).  I can help you work around weaknesses in the JDK7 code, by adjusting the JRuby code.  Not optimal, but maybe practical.
>> 
>> I'm quite happy to do so! Did SwitchPoint optimization make it in?
> 
> Yes, but it's painfully late.  <grumble>You realize that for every power user like you who can use it correctly there will be 10 people who find it via auto-complete and write bad code.  (If you act as if a 'true' value of isValid is trustworthy, you will probably write a race condition into your code.)  Some of us on the EG *really* want to avoid having users (not you) shoot themselves in the foot.  IBM (Dan) is strongly lobbying to at least change the name of the predicate to 'hasBeenInvalidated' so it looks more like what it really is, a tricky asymmetric effectively-volatile indirectly-mutable-but-monotonic boolean.  (Your comments?  Got an easy fix for this issue?)</grumble>
> 
> And just to make sure:  Are you positive you need it?  (<grimace/>I'm convinced, but it's one of those safety-vs.power problems.  Your most telling point is that the object has the data inside itself, often, and it's just rude to users not to share.)  It gets exponentially more painful to make changes as each week passes!  We have to get this right soon, as in two weeks ago.
> 
> But the answer to your question is, yes; SP.isValid is under review (in the hsx/hotspot-comp repo) for proposed inclusion in b145.
> 
> -- John
> _______________________________________________
> mlvm-dev mailing list
> mlvm-dev at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev
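
For the SwitchPoint exchange quoted above, here is a minimal sketch of the intended usage pattern, written against the API as it was eventually released, where the predicate is named hasBeenInvalidated rather than isValid. The class SwitchPointSketch and its fast and slow methods are hypothetical. The point of the pattern is that callers go through the guarded handle instead of branching on the predicate themselves, since a "still valid" answer can be stale by the time it is acted on.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class SwitchPointSketch {
    static String fast() { return "fast"; }   // optimistic path
    static String slow() { return "slow"; }   // fallback after invalidation

    static String[] demo() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType type = MethodType.methodType(String.class);
        MethodHandle fast = lookup.findStatic(SwitchPointSketch.class, "fast", type);
        MethodHandle slow = lookup.findStatic(SwitchPointSketch.class, "slow", type);

        SwitchPoint sp = new SwitchPoint();
        // The guarded handle calls 'fast' until the switch point is invalidated.
        MethodHandle guarded = sp.guardWithTest(fast, slow);
        String before = (String) guarded.invokeExact();

        // Invalidation is one-way: the switch point never becomes valid again.
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        String after = (String) guarded.invokeExact();

        // Only a 'true' answer from the predicate is stable.
        return new String[] { before, after, String.valueOf(sp.hasBeenInvalidated()) };
    }
}
```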


