RFR: 8221542: ~15% performance degradation due to less optimized inline decision

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Thu Apr 11 01:03:05 UTC 2019


> http://cr.openjdk.java.net/~jiefu/monte_carlo-perf-drop/webrev.02/

Much better.

One more question: what happens if there's an exception being thrown 
earlier in the block (e.g., NPE or from a call) and it makes the call 
site unreachable? Inspecting relevant counters I mentioned before may 
help detecting such case.

Best regards,
Vladimir Ivanov

>> What you are proposing is to unconditionally inline constructor calls. 
>> I consider such change as too intrusive.
>>
>> I suggest to focus on "profile.count() == 0" check and make it 
>> smarter: when profile info is scarce, try to prove that the call site 
>> is actually reachable before giving up.
> OK, I agree.
>> In addition, it's possible to prove the call is always executed by 
>> looking at CFG or checking that start block is being parsed.
> Very good idea!
> I have checked that if the call site belongs to a start block in the 
> updated patch.
> I had tried to look at the CFG like this, but failed.
> -----------------------------------------------
> diff -r 7383a17b4c65 src/hotspot/share/opto/bytecodeInfo.cpp
> --- a/src/hotspot/share/opto/bytecodeInfo.cpp   Mon Apr 08 15:27:24 2019 
> +0800
> +++ b/src/hotspot/share/opto/bytecodeInfo.cpp   Mon Apr 08 17:25:04 2019 
> +0800
> @@ -373,9 +373,11 @@
>       } else if (forced_inline()) {
>         // Inlining was forced by CompilerOracle, ciReplay or annotation
>       } else if (profile.count() == 0) {
> -      // don't inline unreached call sites
> -       set_msg("call site not reached");
> -       return false;
> +      if (C->cfg()->get_block(jvms->bci()) != C->cfg()->get_block(0)) {
> +        // don't inline unreached call sites
> +        set_msg("call site not reached");
> +        return false;
> +      }
>       }
>     }
> -----------------------------------------------
> Do you have any comments on how to look at the CFG for more info?
> Thanks.
>>
>> When "profile.count() == 0", but the call site has been reached 
>> before, it seems the following conditions should hold:
>>
>>   caller_method->interpreter_invocation_count() > 0
> I think this condition is redundant since it always holds for a method 
> to be compiled.
>> AND
>>   caller_method->method_data()->invocation_count() == (0 OR 1)
> I'm not sure if this condition still holds for parallel execution of the 
> caller.
>> AND
>>  callee_method->was_executed_more_than(0) == true
> Even though this rule is true, it seems still hard to say that the 
> particular call site had been reached before.
>> AND
>>  parse->block() == parse->start_block()
>>
> Very nice!
> This rule is good enough to solve this particular issue.
> And it has been implemented in the patch.
>> Some of them can be turned into asserts (e.g., invocation_count() == 0).
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [2]
>> (lldb) p this
>> (Parse *) $33 = 0x000070000eacc6e8
>>
>> (lldb) p start_block()
>> (Parse::Block *) $31 = 0x00000001008aef00
>>
>> (lldb) p block()
>> (Parse::Block *) $32 = 0x00000001008aef00
> Could you please also provide the backtrace info?
> I had tried to find the right place to directly call start_block() and 
> block() in my patch, but failed.


More information about the hotspot-compiler-dev mailing list