Invokedynamic and inlining flags

Mon Jan 31 11:39:09 PST 2011

Hello friends! After months on "vacation" from indy, I managed to
spend some time this weekend updating JRuby's indy support. But this
email is to ask about the inlining flag tweaks that still seem to be
required.

First, the good news: using a fastdebug build of MLVM from Stephen B
(FYI, Stephen, your 1/11 build is fastdebug, not product), running
JRuby with invokedynamic enabled is more than 30% faster than simple
inline caching (measured on the same JVM with indy turned off). That's
excellent! This marks the first time there's been a clear improvement
over our usual execution mode! You're finally beating* JRuby by a
comfortable margin! :)

The bad news is that I still had to tweak inlining flags way up.
Specifically, MaxInlineSize=150 and InlineSmallCode=2000 or higher
(the gains stopped somewhere around 10000).
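
For anyone reproducing this: on HotSpot these are normally passed as
-XX:MaxInlineSize=150 -XX:InlineSmallCode=2000. The sketch below is
one way to confirm what a running JVM actually picked up. The class
name InlineFlags and the describe helper are made up for illustration;
HotSpotDiagnosticMXBean is HotSpot-specific, so the code falls back to
"<unavailable>" on other JVMs.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class InlineFlags {
    // Returns "Flag=value", or "Flag=<unavailable>" if this JVM does not
    // expose the flag (for example, a non-HotSpot VM).
    static String describe(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return flag + "=<unavailable>";
            }
            return flag + "=" + bean.getVMOption(flag).getValue();
        } catch (RuntimeException e) {
            // Unknown flag or unsupported MXBean on this VM.
            return flag + "=<unavailable>";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("MaxInlineSize"));
        System.out.println(describe("InlineSmallCode"));
    }
}
```

Running it with and without the -XX options shows whether the tweaked
budgets are really in effect for a given benchmark run.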

Obviously we can't force people to set these flags when running JRuby
on Java 7, so I'm writing to get confirmation that this is still going
to get worked out. I'd not be surprised at all to hear things are
still being tuned and tweaked, and so adjusting inlining budgets at
this point would be premature. Just tell me so :)

Also, the bar has moved for beating JRuby performance when comparing
to our "dynopt" mode. Dynopt uses the most recently called method at
each interpreted call site to insert a guard plus direct static-typed
call in the emitted JRuby bytecode. That still runs about 2x faster
than invokedynamic. HOWEVER...it's not really a fair test yet, since
dynopt also avoids using Fixnum objects in more cases and also inserts
a direct call for recursion. Those two, combined with escape analysis
(EA) still being active, could make all the difference.
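
To make the "guard plus direct static-typed call" shape concrete,
here is a rough sketch in plain Java. It is not JRuby's actual
generated code: RClass, the serial counter, callSite and slowCall are
all hypothetical stand-ins. The guard is assumed to be a cheap check
that the class has not been modified since the site was specialized,
and the primitive long argument stands in for avoiding Fixnum boxing.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongUnaryOperator;

public class DynoptSketch {
    // Hypothetical stand-in for a Ruby class: a method table plus a
    // serial number that is bumped whenever a method is (re)defined.
    static final class RClass {
        final Map<String, LongUnaryOperator> methods = new HashMap<>();
        int serial = 0;

        void define(String name, LongUnaryOperator body) {
            methods.put(name, body);
            serial++;
        }
    }

    static final RClass MATH = new RClass();
    static int slowPathCalls = 0;

    // Serial observed when the call site below was specialized.
    static int expectedSerial;

    // Method body "compiled" to a static Java method on primitives.
    static long doubleIt(long x) {
        return x * 2;
    }

    // Generic dispatch: look the method up by name on every call.
    static long slowCall(RClass cls, String name, long arg) {
        slowPathCalls++;
        return cls.methods.get(name).applyAsLong(arg);
    }

    // Specialized call site: guard, then a direct static call that the
    // JIT can inline without any special inlining budget for indy.
    static long callSite(RClass cls, long arg) {
        if (cls.serial == expectedSerial) {
            return doubleIt(arg);
        }
        return slowCall(cls, "double_it", arg);
    }

    public static void main(String[] args) {
        MATH.define("double_it", DynoptSketch::doubleIt);
        expectedSerial = MATH.serial;
        System.out.println(callSite(MATH, 21)); // guard passes: 42

        MATH.define("double_it", x -> x * 3);   // redefinition bumps serial
        System.out.println(callSite(MATH, 21)); // guard fails: 63
    }
}
```

The fast path here is an ordinary static call, which is why it
sidesteps the inlining limits discussed above; the indy version has to
get the same effect through the method handle chain.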

I'll continue widening the use of indy and report back.

- Charlie