RFR: 8221542: ~15% performance degradation due to less optimized inline decision
Jie Fu
fujie at loongson.cn
Thu Apr 18 09:54:20 UTC 2019
Hi Vladimir,
> Though I don't consider parallel execution case as problematic,
> I got a better idea while browsing the code :-)
>
> http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.01
Aha! I've found a way to show you that the following condition in
patch[1] does NOT hold with the parallel execution of the caller.
-----------------------------------------------
if (caller_method->was_executed_more_than(1)) return false; // trust profile
-----------------------------------------------
Step 1: Apply this patch
-----------------------------------------------
diff -r 5de35f58f70c src/hotspot/share/opto/bytecodeInfo.cpp
--- a/src/hotspot/share/opto/bytecodeInfo.cpp Thu Apr 18 02:45:02 2019 +0200
+++ b/src/hotspot/share/opto/bytecodeInfo.cpp Thu Apr 18 17:32:16 2019 +0800
@@ -374,6 +374,8 @@
// Inlining was forced by CompilerOracle, ciReplay or annotation
} else if (profile.count() == 0) {
// don't inline unreached call sites
+ tty->print_cr("caller_method count = %d, was_executed_more_than(1) is %s",
+ caller_method->interpreter_invocation_count(), caller_method->was_executed_more_than(1) ? "true" : "false");
set_msg("call site not reached");
return false;
}
-----------------------------------------------
Step 2: Run SPECjvm2008's scimark.monte_carlo with the reproduction
script[2] on a machine with high parallelism.
Step 3: Wait for the result.
For example, I ran it on an i7-8700 machine with only 12 threads.
The output below shows that profile.count() is 0 while
caller_method->was_executed_more_than(1) is true.
-----------------------------------------------
Benchmark: scimark.monte_carlo
Run mode: timed run
Test type: multi
Threads: 12
Warmup: 120s
Iterations: 1
Run length: 240s
275 72 java.lang.StringBuilder::append (8 bytes) made not entrant
275 99 java.io.File::<init> (47 bytes) made not entrant
Warmup (120s) begins: Thu Apr 18 17:25:33 CST 2019
281 113 s spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes)
282 114 % spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate @ 15 (68 bytes)
    s @ 22 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
    s @ 28 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
432 114 % spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate @ 15 (68 bytes) made not entrant
433 115 spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate (68 bytes)
caller_method count = 13, was_executed_more_than(1) is true
    @ 6 spec.benchmarks.scimark.utils.Random::<init> (53 bytes) call site not reached
    s @ 22 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
    s @ 28 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
436 116 % spec.benchmarks.scimark.monte_carlo.MonteCarlo::integrate @ 15 (68 bytes)
    s @ 22 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
    s @ 28 spec.benchmarks.scimark.utils.Random::nextDouble (124 bytes) inline (hot)
-----------------------------------------------
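To spell out what this log implies, here is a minimal standalone C++ sketch (not HotSpot code). The CallerState struct and both functions are hypothetical names introduced only for illustration, and was_executed_more_than is modeled as a plain counter comparison, which is an assumption; the real method may take other state into account. The values 13 and 0 are the ones printed in the log above.

```cpp
// Hypothetical stand-in for the two values printed by the debugging patch.
struct CallerState {
    int invocation_count;  // "caller_method count" from the log
    int call_site_count;   // profile.count() for the call site in question
};

// Assumed simplification: compare the invocation counter against n only.
constexpr bool was_executed_more_than(const CallerState& s, int n) {
    return s.invocation_count > n;
}

// True when a guard of the form
//   if (caller_method->was_executed_more_than(1)) return false; // trust profile
// would be reached with a zero call-site count and would trust that zero.
constexpr bool guard_trusts_zero_profile(const CallerState& s) {
    return s.call_site_count == 0 && was_executed_more_than(s, 1);
}

// Values observed in the log: caller count 13, call-site count 0.
static_assert(guard_trusts_zero_profile(CallerState{13, 0}),
              "observed state: zero call-site count is trusted");

// A caller seen only once would not be trusted by the same guard.
static_assert(!guard_trusts_zero_profile(CallerState{1, 0}),
              "single execution: profile is not trusted");
```

Under these assumptions, the observed state (count 13, call-site count 0) satisfies the guard, so the zero count would be trusted and the call site reported as "call site not reached".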
So would you agree to remove that condition from your patch[1]?
Thanks a lot.
Best regards,
Jie
[1] http://cr.openjdk.java.net/~vlivanov/jiefu/8221542/webrev.00/
[2] http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/reproduce.sh