RFR: 8221542: ~15% performance degradation due to less optimized inline decision
Jie Fu
fujie at loongson.cn
Tue Apr 9 03:51:25 UTC 2019
Hi Vladimir,
Thank you so much for your review and valuable suggestions.
Here is the updated version:
http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.02/
Please see comments inline and review.
Thanks a lot.
Best regards,
Jie
> What you are proposing is to unconditionally inline constructor calls.
> I consider such change as too intrusive.
>
> I suggest to focus on "profile.count() == 0" check and make it
> smarter: when profile info is scarce, try to prove that the call site
> is actually reachable before giving up.
OK, I agree.
> In addition, it's possible to prove the call is always executed by
> looking at CFG or checking that start block is being parsed.
Very good idea!
I have checked that if the call site belongs to a start block in the
updated patch.
I had tried to look at the CFG like this, but failed.
-----------------------------------------------
diff -r 7383a17b4c65 src/hotspot/share/opto/bytecodeInfo.cpp
--- a/src/hotspot/share/opto/bytecodeInfo.cpp Mon Apr 08 15:27:24 2019
+0800
+++ b/src/hotspot/share/opto/bytecodeInfo.cpp Mon Apr 08 17:25:04 2019
+0800
@@ -373,9 +373,11 @@
} else if (forced_inline()) {
// Inlining was forced by CompilerOracle, ciReplay or annotation
} else if (profile.count() == 0) {
- // don't inline unreached call sites
- set_msg("call site not reached");
- return false;
+ if (C->cfg()->get_block(jvms->bci()) != C->cfg()->get_block(0)) {
+ // don't inline unreached call sites
+ set_msg("call site not reached");
+ return false;
+ }
}
}
-----------------------------------------------
Do you have any comments on how to look at the CFG for more info?
Thanks.
>
> When "profile.count() == 0", but the call site has been reached
> before, it seems the following conditions should hold:
>
> caller_method->interpreter_invocation_count() > 0
I think this condition is redundant since it always holds for a method
to be compiled.
> AND
> caller_method->method_data()->invocation_count() == (0 OR 1)
I'm not sure if this condition still holds for parallel execution of the
caller.
> AND
> callee_method->was_executed_more_than(0) == true
Even though this rule is true, it seems still hard to say that the
particular call site had been reached before.
> AND
> parse->block() == parse->start_block()
>
Very nice!
This rule is good enough to solve this particular issue.
And it has been implemented in the patch.
> Some of them can be turned into asserts (e.g., invocation_count() == 0).
>
> Best regards,
> Vladimir Ivanov
>
> [2]
> (lldb) p this
> (Parse *) $33 = 0x000070000eacc6e8
>
> (lldb) p start_block()
> (Parse::Block *) $31 = 0x00000001008aef00
>
> (lldb) p block()
> (Parse::Block *) $32 = 0x00000001008aef00
Could you please also provide the backtrace info?
I had tried to find the right place to directly call start_block() and
block() in my patch, but failed.
Any more comments?
Thank you very much.
More information about the hotspot-compiler-dev
mailing list