More performance explorations
Charles Oliver Nutter
headius at headius.com
Sun Jun 5 10:21:41 PDT 2011
Here's a little ray of sunshine to temper all my grousing about
performance: a benchmark that runs comfortably faster with indy than
without, and that shows the downside of dynopt (specifically, that it
slows down nontrivial benchmarks, presumably due to excessive bytecode size):
bench_evanphx_goruco.rb is a benchmark created by Evan Phoenix of the
Rubinius project, another Ruby implementation for which Evan has built
an optimizing mixed-mode JIT and generational GC. Rubinius currently
does a better job of optimizing Ruby code (than JRuby) due largely to
its ability to inline Ruby code (where stock JRuby never does).
invokedynamic has helped and (probably) will continue to help us match
or exceed Rubinius's raw Ruby execution performance.
This benchmark is mostly an object-creation bench, to stress
allocation and GC. In JRuby, however, the overhead of dispatch comes
through in many places, and invokedynamic appears to help a good
amount over stock CachingCallSite dispatch (and it's considerably
better than dynopt):
INDY:
~/projects/jruby ➔ jruby bench/bench_evanphx_goruco.rb
11.300000 0.000000 11.300000 ( 11.249000)
10.026000 0.000000 10.026000 ( 10.026000)
10.184000 0.000000 10.184000 ( 10.184000)
10.907000 0.000000 10.907000 ( 10.906000)
10.379000 0.000000 10.379000 ( 10.378000)
NON-INDY:
~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false bench/bench_evanphx_goruco.rb
12.500000 0.000000 12.500000 ( 12.448000)
11.454000 0.000000 11.454000 ( 11.454000)
11.910000 0.000000 11.910000 ( 11.909000)
11.305000 0.000000 11.305000 ( 11.305000)
11.331000 0.000000 11.331000 ( 11.331000)
DYNOPT:
~/projects/jruby ➔ jruby -Xcompile.invokedynamic=false -Xcompile.dynopt=true bench/bench_evanphx_goruco.rb
12.982000 0.000000 12.982000 ( 12.887000)
12.363000 0.000000 12.363000 ( 12.363000)
12.431000 0.000000 12.431000 ( 12.431000)
12.490000 0.000000 12.490000 ( 12.490000)
12.344000 0.000000 12.344000 ( 12.344000)
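For a rough side-by-side view, the wall-clock ("real") column of the runs above can be averaged per mode. The following is only a sketch: the names RUNS, MEANS and mean are made up for illustration and are not part of the benchmark, and five iterations (including warmup) is a small sample. On these numbers the indy mean comes out roughly 10% below non-indy, and the dynopt mean roughly 7% above it.

```ruby
# Wall-clock ("real") seconds copied from the three runs above.
# RUNS, MEANS and mean are illustrative names, not part of the benchmark.
RUNS = {
  "indy"     => [11.249, 10.026, 10.184, 10.906, 10.378],
  "non-indy" => [12.448, 11.454, 11.909, 11.305, 11.331],
  "dynopt"   => [12.887, 12.363, 12.431, 12.490, 12.344],
}

# Arithmetic mean of a list of floats.
def mean(xs)
  xs.sum / xs.size
end

MEANS = RUNS.transform_values { |xs| mean(xs) }

# Report each mode's mean and its difference relative to stock (non-indy) dispatch.
baseline = MEANS["non-indy"]
MEANS.each do |mode, avg|
  delta = (avg / baseline - 1) * 100
  puts format("%-9s mean %.3fs (%+.1f%% vs non-indy)", mode, avg, delta)
end
```

Dropping the first (warmup) iteration from each list would be a reasonable variation.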
Nice results, and I know there are tons of improvements possible for the
hot paths in this benchmark (both in JRuby and in Hotspot).
Incidentally, my MLVM build is a good 25-30% faster than Java 6, even
without invokedynamic use. Kudos to the entire Hotspot team for making
every release almost inexplicably "just faster" than previous
versions.
- Charlie
More information about the mlvm-dev
mailing list