Request for review (M): 7171890: C1: add Class.isInstance intrinsic

Rémi Forax forax at univ-mlv.fr
Thu May 31 15:55:53 PDT 2012


On 06/01/2012 12:05 AM, Christian Thalinger wrote:
>
> On May 31, 2012, at 6:26 AM, Krystal Mok wrote:
>
>> On Thu, May 31, 2012 at 3:10 PM, John Rose <john.r.rose at oracle.com> wrote:
>>
>>     On May 30, 2012, at 7:55 PM, Krystal Mok wrote:
>>
>>>     Yes, it's doable. I'll just take the same approach for
>>>     clazz1.isAssignableFrom(clazz2).
>>
>>     It's trickier, since you can't just repurpose the C1 InstanceOf
>>     node.  It looks like you'll have to refactor machine-dependent
>>     code to cut in the new logic.
>>
>> Yes, I have noticed that already. It's not the same as
>> Class.isInstance(). The (quick-n-dirty) plan was to fold the case
>> where both clazz1 and clazz2 are constants, and emit a leaf runtime
>> call for the non-constant cases. Since Class.isAssignableFrom() only
>> throws NPE if either clazz1 or clazz2 is null, it should be okay to
>> make a leaf call after null-checking them.
>>
>> The complete solution, as you suggested, would have to involve 
>> changes to platform-dependent code.
>>
>>     For a comparison, see inline_native_subtype_check in C2, versus
>>     the "_isInstance" cases of inline_native_Class_query.  The
>>     intrinsic for Class.isAssignableFrom is surprisingly more complex
>>     and specialized than the intrinsic for Class.isInstance.
>>
>>     (For C2-ish reasons, the intrinsic logic in library_call.cpp is
>>     machine-independent, so it's easier to do than in C1.)
>>
>> I did notice this from the start, too. It's so tempting to add
>> finer-grained IR to C1 so that it can do more optimizations, but that
>> feels like it goes against the spirit of C1.
>>
>>     Unless you find a simple way to manage the C1 changes, you might
>>     want to stick with isInstance only, this time around.
>>
>> Yes, I agree. I wouldn't mind making a more complete solution for C1 
>> Class intrinsics in future changes.
>>
>>     In any case, we'll try what you have done already; I am confident
>>     it will do good things for our dynamic language codes.
>>
>> Thanks, really looking forward to the numbers :-)
>
> The numbers are okay; it shaves off about 9% of run time:
>
> cthaling at intelsdv03.us.oracle.com:~/mlvm/jruby$ jruby -J-client -J-showversion -X+C bench/bench_red_black.rb
> Picked up _JAVA_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:-AllowChainedMethodHandles -XX:+AllowLambdaForms -Xverify:all -esa -Xbootclasspath/p:/home/cthaling/mlvm/jdk/classes/
> java version "1.8.0-ea"
> Java(TM) SE Runtime Environment (build 1.8.0-ea-b40)
> Java HotSpot(TM) Client VM (build 24.0-b08-internal, mixed mode)
>
> 17.479
> GC.count = 46
> 10.25
> GC.count = 54
> 9.732
> GC.count = 61
> 9.633
> GC.count = 68
> 9.568
> GC.count = 74
> 9.499
> GC.count = 81
> 9.634
> GC.count = 87
> 9.787
> GC.count = 94
> 9.739
> GC.count = 101
> 9.702
> GC.count = 107
>
>
> Flat profile of 114.73 secs (10532 total ticks): main
>
>   Interpreted + native   Method
>   2.6%     0  +   277    java.io.FileOutputStream.open
>   2.1%     0  +   221    java.io.FileOutputStream.close0
>   0.8%    87  +     0    bench.bench_red_black.method__27$RUBY$insert_helper
>   0.6%    65  +     0    bench.bench_red_black.method__16$RUBY$minimum
>   0.1%     0  +    15    java.lang.Class.forName0
>   0.1%     0  +    12    sun.misc.Unsafe.defineAnonymousClass
>   0.1%     1  +    11    org.jruby.Ruby.initCore
>   0.1%    10  +     0    bench.bench_red_black.method__25$RUBY$left_rotate
>   0.1%     7  +     0    bench.bench_red_black.method__14$RUBY$insert
>   0.1%     3  +     3    java.lang.Class.getDeclaredConstructors0
>   0.1%     0  +     6    org.jruby.parser.Ruby19Parser.<clinit>
>   0.0%     0  +     5    java.io.UnixFileSystem.getBooleanAttributes0
>   0.0%     2  +     3    java.lang.ClassLoader.defineClass1
>   0.0%     1  +     2    java.security.AccessController.doPrivileged
>   0.0%     3  +     0    org.objectweb.asm.Label.a
>   0.0%     3  +     0    java.lang.invoke.Invokers.lookupInvoker
>   0.0%     3  +     0    com.sun.xml.internal.ws.org.objectweb.asm.ClassWriter.<init>
>   0.0%     0  +     2    java.io.FileInputStream.open
>   0.0%     0  +     2    java.lang.invoke.MethodHandleNatives.resolve
>   0.0%     0  +     2    java.lang.Throwable.fillInStackTrace
>   0.0%     0  +     2    java.lang.ClassLoader.findBootstrapClass
>   0.0%     0  +     2    org.jruby.parser.Ruby19Parser.<init>
>   0.0%     2  +     0    org.jruby.util.ByteList.bytes
>   0.0%     2  +     0    org.jruby.internal.runtime.methods.InvocationMethodFactory.invokeCallConfigPost
>   0.0%     2  +     0    org.objectweb.asm.Frame.d
>   8.4%   277  +   606    Total interpreted (including elided)
>
>      Compiled + native   Method
>   2.8%     1  +   295    bench.bench_red_black.method__16$RUBY$minimum
>   2.5%     3  +   261    bench.bench_red_black.method__27$RUBY$insert_helper
>   1.5%     0  +   163    bench.bench_red_black.method__25$RUBY$left_rotate
>   1.0%     0  +   109    bench.bench_red_black.method__14$RUBY$insert
>   0.0%     0  +     1    java.io.FilterInputStream.read
>   0.0%     1  +     0    org.jruby.runtime.CompiledBlockLight19.pre
>   0.0%     0  +     1    org.jruby.internal.runtime.methods.CompiledMethod.call
>   0.0%     1  +     0    org.jruby.runtime.invokedynamic.InvocationLinker.testRealClass
>   0.0%     1  +     0    org.jruby.RubyBasicObject.op_not_equal
>   0.0%     0  +     1    org.jruby.internal.runtime.methods.DynamicMethod.call
>   0.0%     1  +     0    org.jruby.runtime.scope.ManyVarsDynamicScope.getValueOrNil
>   0.0%     1  +     0    java.lang.invoke.LambdaForm$LFI39.invoke
>   0.0%     1  +     0    org.jruby.ast.executable.RuntimeCache.isCachedFrom
>   0.0%     0  +     1    org.jruby.javasupport.util.RuntimeHelpers.restructureBlockArgs19
>   0.0%     1  +     0    org.objectweb.asm.ClassWriter.toByteArray
>   0.0%     0  +     1    bench.bench_red_black.method__17$RUBY$maximum
>   0.0%     0  +     1    java.lang.String.replace
>   0.0%     1  +     0    bench.bench_red_black.method__2$RUBY$initialize
>   0.0%     1  +     0    bench.bench_red_black.method__26$RUBY$right_rotate
>   8.0%    13  +   834    Total compiled
>
>          Stub + native   Method
>  80.6%     0  +  8494    java.lang.Class.isInstance
>   0.8%     1  +    87    java.io.FileOutputStream.open
>   0.8%     0  +    85    java.io.FileOutputStream.close0
>   0.3%     0  +    29    java.lang.Class.isPrimitive
>   0.1%     0  +    15    sun.misc.Unsafe.ensureClassInitialized
>   0.1%     0  +     9    java.lang.String.intern
>   0.1%     0  +     8    java.lang.Class.isArray
>   0.1%     0  +     7    java.lang.invoke.MethodHandleNatives.resolve
>   0.0%     2  +     3    java.security.AccessController.doPrivileged
>   0.0%     0  +     5    java.lang.System.arraycopy
>   0.0%     0  +     4    java.lang.Class.isInterface
>   0.0%     0  +     3    java.lang.Thread.currentThread
>   0.0%     0  +     3    java.lang.Throwable.fillInStackTrace
>   0.0%     0  +     3    sun.misc.Unsafe.defineAnonymousClass
>   0.0%     0  +     2    java.lang.Object.getClass
>   0.0%     0  +     2    java.lang.Class.getComponentType
>   0.0%     0  +     2    java.lang.ClassLoader.findLoadedClass0
>   0.0%     0  +     2    java.io.FileOutputStream.writeBytes
>   0.0%     0  +     1    java.lang.Thread.holdsLock
>   0.0%     0  +     1    sun.reflect.Reflection.getCallerClass
>   0.0%     0  +     1    java.lang.Class.getEnclosingMethod0
>   0.0%     0  +     1    java.lang.Object.clone
>   0.0%     0  +     1    java.lang.reflect.Array.newArray
>   0.0%     0  +     1    java.lang.invoke.MethodHandleNatives.setCallSiteTargetNormal
>   0.0%     0  +     1    sun.misc.Unsafe.compareAndSwapInt
>  83.3%     3  +  8771    Total stub (including elided)
>
>
>
> cthaling at intelsdv03.us.oracle.com:~/mlvm/jruby$ jruby -J-client -J-showversion -X+C bench/bench_red_black.rb
> Picked up _JAVA_OPTIONS: -XX:+UnlockExperimentalVMOptions -XX:-AllowChainedMethodHandles -XX:+AllowLambdaForms -Xverify:all -esa -Xbootclasspath/p:/home/cthaling/mlvm/jdk/classes/
> java version "1.8.0-ea"
> Java(TM) SE Runtime Environment (build 1.8.0-ea-b40)
> Java HotSpot(TM) Client VM (build 24.0-b08-internal, mixed mode)
>
> 15.966
> GC.count = 46
> 9.629
> GC.count = 54
> 9.135
> GC.count = 61
> 8.905
> GC.count = 68
> 8.775
> GC.count = 74
> 8.851
> GC.count = 80
> 8.782
> GC.count = 87
> 8.78
> GC.count = 93
> 8.794
> GC.count = 100
> 8.81
> GC.count = 106
>
>
> Flat profile of 108.49 secs (9711 total ticks): main
>
>   Interpreted + native   Method
>  14.3%  1387  +     0    bench.bench_red_black.method__27$RUBY$insert_helper
>  13.3%  1294  +     0    bench.bench_red_black.method__16$RUBY$minimum
>   3.1%     0  +   303    java.io.FileOutputStream.open
>   2.2%     0  +   210    java.io.FileOutputStream.close0
>   1.8%   170  +     0    bench.bench_red_black.method__14$RUBY$insert
>   0.1%     0  +    13    java.lang.Class.forName0
>   0.1%     0  +    12    sun.misc.Unsafe.defineAnonymousClass
>   0.1%     2  +    10    org.jruby.Ruby.initCore
>   0.1%     0  +     8    java.io.UnixFileSystem.getBooleanAttributes0
>   0.1%     1  +     6    org.jruby.parser.Ruby19Parser.<clinit>
>   0.1%     7  +     0    bench.bench_red_black.method__28$RUBY$delete_fixup
>   0.1%     4  +     2    java.lang.Class.getDeclaredConstructors0
>   0.1%     2  +     4    java.lang.ClassLoader.defineClass1
>   0.1%     6  +     0    org.jruby.internal.runtime.methods.DynamicMethod.call
>   0.1%     5  +     0    org.jruby.RubyArray.eachCommon
>   0.1%     5  +     0    org.jruby.RubyArray.realloc
>   0.0%     0  +     4    java.lang.ClassLoader.findBootstrapClass
>   0.0%     0  +     4    java.io.FileOutputStream.writeBytes
>   0.0%     0  +     3    java.lang.invoke.MethodHandleNatives.resolve
>   0.0%     3  +     0    bench.bench_red_black.method__25$RUBY$left_rotate
>   0.0%     0  +     3    org.jruby.Ruby.initRoot
>   0.0%     0  +     2    java.lang.Throwable.fillInStackTrace
>   0.0%     2  +     0    org.jruby.compiler.impl.BaseBodyCompiler.invokeUtilityMethod
>   0.0%     2  +     0    org.jruby.runtime.opto.OptoFactory.newInvocationCompiler
>   0.0%     2  +     0    org.jruby.compiler.impl.InvokeDynamicCacheCompiler.cacheClosure19
>  37.6%  3024  +   623    Total interpreted (including elided)
>
>      Compiled + native   Method
>  19.7%   321  +  1592    bench.bench_red_black.method__27$RUBY$insert_helper
>  17.1%    79  +  1586    bench.bench_red_black.method__25$RUBY$left_rotate
>   5.4%   147  +   374    bench.bench_red_black.method__14$RUBY$insert
>   4.5%    27  +   409    bench.bench_red_black.method__16$RUBY$minimum
>   3.8%   371  +     0    bench.bench_red_black.method__22$RUBY$search
>   1.4%   133  +     0    bench.bench_red_black.method__17$RUBY$maximum
>   1.1%   102  +     0    org.jruby.RubyBasicObject.op_not_equal
>   0.9%    90  +     0    org.jruby.RubyBasicObject.setVariable
>   0.6%    56  +     0    org.jruby.ast.executable.RuntimeCache.isCachedFrom
>   0.5%    53  +     0    org.jruby.runtime.CompiledBlockLight19.pre
>   0.5%    49  +     0    org.jruby.runtime.CompiledBlock19.yield
>   0.3%    28  +     0    org.jruby.runtime.ThreadContext.pushCallFrame
>   0.3%    25  +     0    org.jruby.runtime.scope.ManyVarsDynamicScope.getValueOrNil
>   0.2%    24  +     0    org.jruby.MetaClass.getRealClass
>   0.2%    21  +     0    bench.bench_red_black.method__18$RUBY$successor
>   0.2%    16  +     0    org.jruby.RubyFixnum.op_plus_one
>   0.2%    16  +     0    bench.bench_red_black.method__19$RUBY$predecessor
>   0.1%    14  +     0    bench.bench_red_black.method__2$RUBY$initialize
>   0.1%    13  +     0    org.jruby.RubyObject$1.allocate
>   0.1%    13  +     0    bench$bench_red_black$method__13$RUBY$add.call
>   0.1%    13  +     0    bench.bench_red_black.method__28$RUBY$delete_fixup
>   0.1%    11  +     0    org.jruby.RubyRandom.randCommon19
>   0.1%    10  +     0    java.util.concurrent.atomic.AtomicReferenceFieldUpdater$AtomicReferenceFieldUpdaterImpl.compareAndSet
>   0.1%     9  +     1    org.jruby.RubyFixnum.times
>   0.1%    10  +     0    bench.bench_red_black.method__20$RUBY$inorder_walk
>  59.5%  1801  +  3974    Total compiled (including elided)
>
>          Stub + native   Method
>   0.9%     0  +    86    java.io.FileOutputStream.open
>   0.7%     0  +    71    java.io.FileOutputStream.close0
>   0.2%     0  +    22    java.lang.Class.isPrimitive
>   0.2%     0  +    15    sun.misc.Unsafe.ensureClassInitialized
>   0.1%     0  +    12    java.lang.Class.isArray
>   0.1%     0  +     7    java.lang.invoke.MethodHandleNatives.resolve
>   0.1%     0  +     5    java.lang.String.intern
>   0.1%     0  +     5    java.lang.Class.isInterface
>   0.0%     0  +     4    java.lang.Class.getComponentType
>   0.0%     0  +     4    java.lang.Object.clone
>   0.0%     1  +     3    java.security.AccessController.doPrivileged
>   0.0%     0  +     4    sun.misc.Unsafe.defineAnonymousClass
>   0.0%     0  +     3    java.lang.System.arraycopy
>   0.0%     0  +     2    java.lang.ClassLoader.findLoadedClass0
>   0.0%     0  +     2    java.util.zip.Inflater.inflateBytes
>   0.0%     0  +     1    java.lang.Object.getClass
>   0.0%     0  +     1    java.lang.Class.getDeclaringClass
>   0.0%     0  +     1    java.lang.Class.isAssignableFrom
>   0.0%     0  +     1    java.lang.Throwable.fillInStackTrace
>   0.0%     0  +     1    java.security.AccessController.doPrivileged
>   0.0%     0  +     1    sun.misc.Unsafe.getObjectVolatile
>   0.0%     0  +     1    sun.misc.Unsafe.getInt
>   0.0%     0  +     1    java.io.FileOutputStream.writeBytes
>   2.6%     1  +   253    Total stub
>
> The reason we don't see a much bigger improvement is that most of
> the time (I'd say 99% of the time) Class.cast isn't inlined in C1
> because it's too big, e.g.:
>
>                                   @ 33   java.lang.invoke.LambdaForm$LFI67/25137260::invoke (32 bytes)
>                                     @ 15   java.lang.invoke.LambdaForm$LFI1/13330648::invoke (31 bytes)
>                                       @ 27   sun.invoke.util.ValueConversions::castReference (6 bytes)
>                                         @ 2   java.lang.Class::cast (27 bytes)   callee is too large
>                                     @ 28   org.jruby.RubyBasicObject::op_not (20 bytes)
>                                       @ 1   org.jruby.runtime.ThreadContext::getRuntime (5 bytes)
>                                       @ 5   org.jruby.RubyBasicObject::isTrue (15 bytes)
>                                       @ 16   org.jruby.Ruby::newBoolean (16 bytes)
>
> and we have to go the out-of-line route.
>
> -- Chris

There are several solutions:
  - Charles should use invokedynamic for the unary operator '!',
    so there is no need to cast to a ThreadContext anymore.
  - The cast should not be needed anyway. I have no idea if it's because
    the threadContext is not declared as ThreadContext in invokedynamic,
    or if your implementation is not able to see that the threadContext
    object is never altered, so its type is constant in the method handle
    blob. Maybe it's because the blob is fully specified in Java (there
    is no invokeExact anymore?).
  - The inlining algorithm should not use the bytecode size but a
    top-down algorithm that doesn't count code that throws an exception
    and its dependencies (roughly), or code that is part of an exception
    handler (if the classical path and the exception handler path are
    not shared).
  - The method metadata (profile) should not be tied to a method, so that
    when a method handle blob is interpreted it can have its own profile,
    and you will not try to inline the code in the slow path together
    with the fast path (it will also help to de-virtualize in tiered
    compilation mode).
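
As a source-level illustration of the point about exception-throwing code
(this is a sketch, not the JDK implementation and not a change to the
inliner itself): a checked cast is essentially an isInstance test plus a
throw, so the exception construction can be moved into a separate method,
leaving a short hot path. The names CastSketch, castFast and castFailure
are made up for the example, and whether this alone would get C1 to inline
the call in the trace above is untested.

```java
// Hypothetical example class; none of these names exist in the JDK.
final class CastSketch {

    // Same contract as Class.cast: null passes through, a matching
    // instance is returned, anything else throws ClassCastException.
    static <T> T castFast(Class<T> type, Object obj) {
        if (obj == null || type.isInstance(obj)) {
            @SuppressWarnings("unchecked")
            T result = (T) obj;
            return result;
        }
        // Cold path: only a call and a throw remain in this method.
        throw castFailure(type, obj);
    }

    // Builds the exception out of line so the message concatenation
    // does not add to the size of castFast.
    private static ClassCastException castFailure(Class<?> type, Object obj) {
        return new ClassCastException(
            "Cannot cast " + obj.getClass().getName() + " to " + type.getName());
    }

    public static void main(String[] args) {
        CharSequence cs = castFast(CharSequence.class, "abc");
        System.out.println(cs);
        System.out.println(castFast(String.class, null));
        try {
            castFast(Integer.class, "abc");
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The fast path depends only on Class.isInstance, which is the call the
intrinsic under review targets.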

Rémi



More information about the hotspot-compiler-dev mailing list