Reflection vs MethodHandle performance in Oracle JDK 7u7

Mon Oct 15 20:06:56 PDT 2012

I am new to OpenJDK and mlvm, but I have decades of experience in benchmark cheating and hackery.
(No, that is not what I am bringing to OpenJDK.)

I went to look at Caliper, and in the tutorial, I saw this:

      return dummy; // framework ignores this, but it has served its purpose!

This is not at all confidence-inspiring. If the call is inlined, even conditionally,
the optimizer can see that dummy is dead, and then your benchmark is screwed.
The framework should print dummy to a real live file, not /dev/null (yes, once upon a
time a workstation vendor spotted the /dev/null case and short-circuited all of libcurses
out of the way).  I wouldn't trust reflective access to be sufficiently obfuscating either: if I used
reflection and it could be optimized into a direct call, I would like that, so I should not
be surprised if an optimizer picked up that transformation and accidentally broke
this benchmark.
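
A minimal sketch of the "real live file" idea, not Caliper's API: every result is folded into a checksum, and the checksum is written to an actual file together with the timing. The class SinkDemo, the workload work(int), and the helpers timeLoop and report are hypothetical names made up for illustration. A trivial workload like this one could still be reduced to a closed form by a sufficiently clever optimizer, so the sketch only addresses the dead-result problem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SinkDemo {
    // Hypothetical workload; stands in for whatever call is being timed.
    static int work(int x) {
        return x * 31 + 7;
    }

    // Runs the workload reps times, folds every result into a checksum,
    // stores the checksum in checksumOut[0], and returns elapsed nanoseconds.
    static long timeLoop(int reps, long[] checksumOut) {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < reps; i++) {
            sum += work(i);
        }
        long elapsed = System.nanoTime() - start;
        checksumOut[0] = sum;
        return elapsed;
    }

    // Writes the timing and the checksum to a real file, so the checksum
    // is observable output rather than a value nobody reads.
    static void report(Path out, long elapsedNanos, long checksum) throws IOException {
        String line = "elapsed_ns=" + elapsedNanos + " checksum=" + checksum
                + System.lineSeparator();
        Files.write(out, line.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Taking reps from the command line keeps the trip count unknown at compile time.
        int reps = args.length > 0 ? Integer.parseInt(args[0]) : 1_000_000;
        long[] checksum = new long[1];
        long elapsed = timeLoop(reps, checksum);
        Path out = Files.createTempFile("bench", ".txt");
        report(out, elapsed, checksum[0]);
        System.out.println("wrote " + out);
    }
}
```

The checksum also doubles as a sanity check: if two runs with the same reps disagree on it, something other than timing went wrong.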

Sorry if I seem skeptical, but I've seen intelligent people write terrible benchmarks.
Consider the old JavaGrande Fork-Join benchmark:

// do something trivial but which won't be optimised away! 
double theta=37.2, sint, res; 
sint = Math.sin(theta);
res = sint*sint; 
//defeat dead code elimination 
if(res <= 0) System.out.println(
     "Benchmark exited with unrealistic res value " + res);

Under strictfp, Math.sin of theta is -0.478645918588415; that's straight from the spec, if you simply
follow the recipe.  Even with widefp, the optimizer knows the target FP and can constant-propagate
there, too.  I think you can figure out the rest.
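
To spell out the concern: with theta fixed at 37.2, res is the square of a known nonzero value, so an optimizer that folds the sine is entitled to decide "res <= 0" ahead of time and discard the work. The sketch below shows one way to make that harder, assuming the input is only known at run time and every result feeds a value that gets printed. FoldDemo, sinSquared, and accumulate are hypothetical names, and nothing here claims to describe what any particular JIT actually does.

```java
final class FoldDemo {
    // Same arithmetic as the quoted snippet, with theta as a parameter
    // instead of a literal.
    static double sinSquared(double theta) {
        double sint = Math.sin(theta);
        return sint * sint;
    }

    // Variant that is harder to fold: theta starts from a caller-supplied
    // seed and changes every iteration, and every result feeds the sum.
    static double accumulate(double seed, int reps) {
        double sum = 0.0;
        double theta = seed;
        for (int i = 0; i < reps; i++) {
            sum += sinSquared(theta);
            theta += 0.001;
        }
        return sum;
    }

    public static void main(String[] args) {
        // The seed comes from the command line or the clock, so it is not
        // a compile-time constant.
        double seed = args.length > 0
                ? Double.parseDouble(args[0])
                : (System.nanoTime() % 1000) / 10.0;
        // Printing the sum makes it a live, observable result.
        System.out.println(accumulate(seed, 1_000_000));
    }
}
```

Calling sinSquared(37.2) directly reproduces the constant case from the quoted benchmark: the result is about 0.229, always positive, which is why the original guard can never fire.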

Sorry if I seem skeptical, but you've got to be very careful with microbenchmarks.
The Caliper folks say they're careful, but I looked at their examples, and by my standards they're not careful enough.

David

On 2012-10-15, at 9:12 PM, Ashwin Jayaprakash <ashwin.jayaprakash at gmail.com> wrote:

> People seem to be skeptical about the micro benchmarks I posted. This is why I used Caliper (http://code.google.com/p/caliper/).
> 
> Caliper runs each test multiple times (the "trials"). Each trial runs the code in a loop of "reps" iterations, which runs into the tens of millions. I'm uploading the log files for your verification; look for statements like "running trial with 138668464 reps" in them. Caliper does run the test enough times to let the JIT warm up.
> 
> JDK 8:
>  0% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 19.88 ns; σ=0.90 ns @ 10 trials
> 10% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 16.06 ns; σ=0.09 ns @ 3 trials
> 20% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 7.41 ns; σ=0.01 ns @ 3 trials
> 30% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 7.27 ns; σ=0.06 ns @ 3 trials
> 40% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 7.33 ns; σ=0.01 ns @ 3 trials
> 50% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 18.55 ns; σ=1.88 ns @ 10 trials
> 60% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 15.13 ns; σ=0.06 ns @ 3 trials
> 70% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 7.21 ns; σ=0.07 ns @ 4 trials
> 80% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 7.23 ns; σ=0.07 ns @ 9 trials
> 90% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 7.20 ns; σ=0.02 ns @ 3 trials
> 
> benchmark                   tier    ns linear runtime
>   Reflect -XX:+TieredCompilation 19.88 ==============================
>   Reflect -XX:-TieredCompilation 18.55 ===========================
>    Handle -XX:+TieredCompilation 16.06 ========================
>    Handle -XX:-TieredCompilation 15.13 ======================
>    Direct -XX:+TieredCompilation  7.41 ===========
>    Direct -XX:-TieredCompilation  7.21 ==========
>     Iface -XX:+TieredCompilation  7.27 ==========
>     Iface -XX:-TieredCompilation  7.23 ==========
>    Static -XX:+TieredCompilation  7.33 ===========
>    Static -XX:-TieredCompilation  7.20 ==========
> 
> vm: java
> trial: 0
> tune: -server -Xmx96M -Xmx96M
> 
> Writing results to C:\temp\jdk_8_ea_b59.log
> 
> 
> JDK 7:
>  0% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 16.40 ns; σ=0.16 ns @ 7 trials
> 10% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 20.89 ns; σ=0.64 ns @ 10 trials
> 20% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 4.81 ns; σ=0.04 ns @ 3 trials
> 30% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 4.86 ns; σ=0.05 ns @ 3 trials
> 40% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M -Xmx96M, tier=-XX:+TieredCompilation} 4.84 ns; σ=0.04 ns @ 3 trials
> 50% Scenario{vm=java, trial=0, benchmark=Reflect, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 16.55 ns; σ=0.15 ns @ 4 trials
> 60% Scenario{vm=java, trial=0, benchmark=Handle, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 20.96 ns; σ=0.59 ns @ 10 trials
> 70% Scenario{vm=java, trial=0, benchmark=Direct, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 4.79 ns; σ=0.01 ns @ 3 trials
> 80% Scenario{vm=java, trial=0, benchmark=Iface, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 4.80 ns; σ=0.03 ns @ 3 trials
> 90% Scenario{vm=java, trial=0, benchmark=Static, tune=-server -Xmx96M -Xmx96M, tier=-XX:-TieredCompilation} 4.85 ns; σ=0.05 ns @ 7 trials
> 
> benchmark                   tier    ns linear runtime
>   Reflect -XX:+TieredCompilation 16.40 =======================
>   Reflect -XX:-TieredCompilation 16.55 =======================
>    Handle -XX:+TieredCompilation 20.89 =============================
>    Handle -XX:-TieredCompilation 20.96 ==============================
>    Direct -XX:+TieredCompilation  4.81 ======
>    Direct -XX:-TieredCompilation  4.79 ======
>     Iface -XX:+TieredCompilation  4.86 ======
>     Iface -XX:-TieredCompilation  4.80 ======
>    Static -XX:+TieredCompilation  4.84 ======
>    Static -XX:-TieredCompilation  4.85 ======
> 
> vm: java
> trial: 0
> tune: -server -Xmx96M -Xmx96M
> 
> Writing results to C:\temp\jdk_7u7.log
> 
> 
> Regards,
> Ashwin.
> 
> 
> 
> 
> <jdk_7u7.log><jdk_8_ea_b59.log>