The Great Startup Problem

Tue Sep 9 00:27:04 UTC 2014

JRuby loads about 4000 of its own classes (on top of about 1000 system
classes) during execution of just '-e 1'. That is a lot of data to load,
parse, and verify. I played with CDS (Class Data Sharing) with the JRuby
classes included in the shared archive. We can do that since jruby.jar is on
the boot class path, but it requires some manual steps.
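
For anyone who wants to check class counts in their own process, here is a minimal sketch using the standard java.lang.management API. The class name LoadedClassCount is made up for illustration, and this is not necessarily how the numbers in this message were obtained.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reports how many classes the current JVM has loaded.
public class LoadedClassCount {
    // Number of classes currently loaded in this JVM.
    public static int currentlyLoaded() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    // Total number of classes loaded since the JVM started (including unloaded ones).
    public static long totalLoaded() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getTotalLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("currently loaded: " + currentlyLoaded());
        System.out.println("total loaded:     " + totalLoaded());
    }
}
```

Calling this at the end of a program gives a rough idea of how much class data had to be loaded, parsed, and verified on the way there.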

I got the following data:

jruby-1.7.13$ time bin/jruby -J-Xshare:off -e 1
real	0m1.344s
user	0m3.688s
sys	0m0.182s

jruby-1.7.13$ time bin/jruby -J-Xshare:on -e 1
real	0m1.010s
user	0m2.918s
sys	0m0.153s

With C1 only, the startup time is even lower:

jruby-1.7.13$ time bin/jruby -J-Xshare:on -J-XX:TieredStopAtLevel=1 -e 1
real	0m0.835s
user	0m1.164s
sys	0m0.112s

Regards,
Vladimir K

On 9/2/14 10:54 AM, Vladimir Ivanov wrote:
> Charlie,
>
>>> Is it acceptable and solves the problem for you?
>>
>> This is acceptable for JRuby. Our worst-case Ruby method handle chain
>> will include at most:
>>
>> * Two CatchExceptions for pre/post logic (heap frames, etc). Perf of
>> CatchException compared to literal Java try/catch is important here.
>> * Up to two permute arguments for differing call site/target argument
>> ordering.
>> * Varargs negotiation (may be a couple handles)
>> * GWT
>> * SwitchPoint
>> * For Ruby to Java calls, each argument plus the return value must be
>> filtered to convert to/from Ruby types or apply an IRubyObject wrapper
>>
>> This is worst case, mind you. Most calls in the system will be
>> arity-matched, eliminating the permutes. Most calls will be three or
>> fewer arguments, eliminating varargs. Many calls will be optimized to
>> no longer need a heap frame, eliminating the try/finally. The absolute
>> minimum for any call would be SwitchPoint plus GWT.
>>
>> Of course I'm not counting DMHs here, since they're either the call we
>> want to make or they're leaf logic.
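
The "SwitchPoint plus GWT" minimum described above can be sketched with plain java.lang.invoke calls. This is only an illustration under assumptions: the class name and the isString, fastPath, and slowPath methods are hypothetical stand-ins, not JRuby's actual binding logic.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.SwitchPoint;

public class MinimalCallSiteSketch {
    // Hypothetical guard, cached target, and fallback.
    public static boolean isString(Object receiver) { return receiver instanceof String; }
    public static String fastPath(Object receiver) { return "fast:" + receiver; }
    public static String slowPath(Object receiver) { return "slow:" + receiver; }

    // Looks up a static (Object)returnType method on this class.
    private static MethodHandle find(String name, Class<?> returnType) {
        try {
            return MethodHandles.lookup().findStatic(MinimalCallSiteSketch.class, name,
                    MethodType.methodType(returnType, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds: SwitchPoint guard -> GWT(isString ? fastPath : slowPath), else slowPath.
    public static MethodHandle bind(SwitchPoint sp) {
        MethodHandle gwt = MethodHandles.guardWithTest(
                find("isString", boolean.class),
                find("fastPath", String.class),
                find("slowPath", String.class));
        // Once the SwitchPoint is invalidated, every call goes to the fallback.
        return sp.guardWithTest(gwt, find("slowPath", String.class));
    }

    // Invokes the handle, wrapping any Throwable so callers need no throws clause.
    public static String call(MethodHandle mh, Object receiver) {
        try {
            return (String) mh.invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        SwitchPoint sp = new SwitchPoint();
        MethodHandle site = bind(sp);
        System.out.println(call(site, "a"));   // guard passes
        System.out.println(call(site, 42));    // guard fails
        SwitchPoint.invalidateAll(new SwitchPoint[] { sp });
        System.out.println(call(site, "a"));   // SwitchPoint invalidated
    }
}
```

Permutes, varargs handling, and CatchException would wrap around this core when a call needs them.
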
> Thanks for the data! That's good!
>
>>> We discussed an idea to generate custom bytecodes (single method) for the
>>> whole method handle chain (and have only 1 extra stack frame per MH
>>> invocation), but it defeats the memory footprint reduction we are trying
>>> to achieve with LambdaForm sharing.
>>
>> Funny thing...because indy slows our startup and increases our warmup
>> time, we're using our old binding logic by default. And surprise
>> surprise, our old binding logic does exactly this...one small
>> generated invoker class per method. I'm sure you're right that this
>> approach defeats the sharing and memory reduction we'd like to see
>> from LFs, but it works *really* well if you're ok with the extra class
>> and metaspace data in memory.
> I see one problem with pre-compiling method handle trees.
> Every tree should be compiled as a whole, so fast path and slow path are
> always compiled. Without explicit hints or profiling and recompilation
> it's impossible to distinguish them.
>
> Compared with using a MethodHandle/LambdaForm as the compilation unit,
> where the slow path usually stays interpreted at the LF level (due to the
> invocation threshold), the memory overhead can be larger for considerably
> large method handle trees.
>
> But I'm just guessing here - I don't have any statistics yet, either on
> the average size of method handle trees or on the memory overhead
> induced by individual classes.
>
>> So there's one question: is the cost of a bytecoded adapter shim for
>> each method object really that high? Yes, if you're spinning new MHs
>> constantly or doing a million different adaptations of a given method.
>> But if you're just lazily creating an invoker shim once per method,
>> that really doesn't seem like a big deal.
> Good question. I have a prototype of LF inlining during bytecode
> translation. I'll conduct some experiments to gather some data.
>
>> My indy binding logic also has a dozen different flags for tweaking. I
>> can easily modify it to avoid doing all that pre/post logic and
>> argument permutation in the MH chain and just bind directly to the
>> generated invoker. Best (or worst) of both worlds? I just really don't
>> want to have to do that...I want everything from call site to target
>> method body to be in the MH chain.
>>
>> For JRuby 9000, all try/finally logic will be within the target
>> method, so at least that part of the MH chain goes away.
>>
>> Here's another idea...
>>
>> We've been using my InvokeBinder library heavily in JRuby. It provides
>> a Java API/DSL for creating MH chains lazily from the top down:
>>
>> MethodHandle mh = Binder.from(String.class, Object.class, Float.class)
>>          .tryFinally(finallyLogic)
>>          .permute(1, 0)
>>          .append("Hello")
>>          .drop(1)
>>          .invokeStatic(MyClass.class, "someMethod");
>>
>> The adaptations are gathered within the Binder instance, played
>> forward as you add adaptations and played backward at binding time to
>> make the appropriate MethodHandles and MethodHandle calls.
>>
>> Duncan talked about how he was able to improve MH chain size and
>> performance by applying certain transformations in a different order,
>> among other things. InvokeBinder *could* be doing a lot more to
>> optimize the MH chain. For example, the above case never uses the
>> Object value passed in (it is permuted to position 1 and later
>> dropped), but that fact is obscured by the intervening append.
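
To make the unused Object argument concrete, here is a sketch of the same adaptations written directly against java.lang.invoke, next to a shorter equivalent chain. It makes several assumptions: someMethod is a hypothetical target taking (Float, String), the tryFinally step is left out, and the "simplified" form is just one possible reduction, not something InvokeBinder currently does.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BinderChainSketch {
    // Hypothetical target standing in for MyClass.someMethod.
    public static String someMethod(Float f, String s) {
        return s + " " + f;
    }

    private static MethodHandle target() {
        try {
            return MethodHandles.lookup().findStatic(BinderChainSketch.class, "someMethod",
                    MethodType.methodType(String.class, Float.class, String.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Literal translation of permute(1, 0), append("Hello"), drop(1),
    // built backward from the target as happens at binding time.
    public static MethodHandle naive() {
        MethodHandle mh = target();                              // (Float, String)String
        mh = MethodHandles.dropArguments(mh, 1, Object.class);   // (Float, Object, String)String
        mh = MethodHandles.insertArguments(mh, 2, "Hello");      // (Float, Object)String
        return MethodHandles.permuteArguments(mh,                // (Object, Float)String
                MethodType.methodType(String.class, Object.class, Float.class), 1, 0);
    }

    // Same call-site type, but the Object is dropped up front and no permute is needed.
    public static MethodHandle simplified() {
        MethodHandle mh = MethodHandles.insertArguments(target(), 1, "Hello"); // (Float)String
        return MethodHandles.dropArguments(mh, 0, Object.class);               // (Object, Float)String
    }

    // Invokes the handle, wrapping any Throwable so callers need no throws clause.
    public static String call(MethodHandle mh, Object o, Float f) {
        try {
            return (String) mh.invoke(o, f);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call(naive(), new Object(), 1.5f));
        System.out.println(call(simplified(), new Object(), 1.5f));
    }
}
```

Both handles have type (Object, Float)String and return the same result; the second one gets there with two adaptations instead of three.
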
>>
>> InvokeBinder is basically doing with MHs what MHs do with LFs. Perhaps
>> what we really need is a more holistic view of MH + LF operations
>> *together* so we can boil the whole thing down (even across MH lines)
>> before we start interpreting or compiling it?
> The idea of rearranging method handles looks interesting. If the JSR 292
> framework treated some method handle chains specially (like having a
> custom LambdaForm shape for nested guards), it would be beneficial to
> favor such shapes in the binder.
>
> Best regards,
> Vladimir Ivanov
>
>>
>> - Charlie
>> _______________________________________________
>> mlvm-dev mailing list
>> mlvm-dev at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev