C2: Advantage of parse time inlining

Vitaly Davidovich vitalyd at gmail.com
Thu May 14 19:02:20 UTC 2015


Thanks Vladimir.  I recall seeing changes around incremental inlining, and
may have mistakenly thought it happens at some later point in time.
Appreciate the clarification.

Ok, so based on what you say, I can see a theoretical problem: while a
caller is being parsed, a callee is larger than MaxInlineSize but its call
site doesn't happen to be frequent enough yet, and so it won't be inlined;
if the call site turns out to be hot later on, the lack of inlining will
not be undone (assuming the caller isn't deopted and recompiled later, with
updated frequency info, for other reasons).
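
To make that scenario concrete, here is a minimal standalone sketch of the
size-limit choice, modeled loosely on the snippet quoted below. It is not
HotSpot code: the struct and function names are hypothetical, 35 and 325 are
the defaults mentioned in this thread, the ratio and count values are assumed
purely for illustration, and the unboxing and is_init_with_ea checks are left
out.

```cpp
#include <cassert>

// Hypothetical stand-ins for the flags discussed in the thread.
struct InlineFlags {
    int max_inline_size = 35;          // default quoted in the thread
    int freq_inline_size = 325;        // default quoted in the thread
    int inline_frequency_ratio = 20;   // assumed value, for illustration only
    int inline_frequency_count = 100;  // assumed value, for illustration only
};

// Simplified model: pick the bytecode size limit for a call site using the
// profile counts that are available at the moment the caller is parsed.
int max_inline_size_at_parse(const InlineFlags& f,
                             int call_site_count,
                             int invoke_count) {
    assert(invoke_count > 0);
    // Integer division, as in the quoted snippet.
    int freq = call_site_count / invoke_count;
    // Bump the limit only if the call site already looks frequent.
    if (freq >= f.inline_frequency_ratio ||
        call_site_count >= f.inline_frequency_count) {
        return f.freq_inline_size;
    }
    return f.max_inline_size;
}

// True if the callee passes the size check under the chosen limit.
bool fits_size_limit(const InlineFlags& f,
                     int callee_bytecode_size,
                     int call_site_count,
                     int invoke_count) {
    return callee_bytecode_size <=
           max_inline_size_at_parse(f, call_site_count, invoke_count);
}
```

Under these assumed values, a 60-bytecode callee whose call site has been
seen 10 times over 5 caller invocations is held to the 35 limit and fails the
size check, while the same callee with a call site count of 500 would be held
to the 325 limit and pass. The decision uses whatever counts exist at parse
time, which is the snapshot effect described above.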

Thanks again.

On Thu, May 14, 2015 at 2:30 PM, Vladimir Kozlov
<vladimir.kozlov at oracle.com> wrote:

> Vitaly,
>
> You have a small misconception: almost all C2 inlining (of normal Java
> methods) is done during parsing, in one pass. Recently we changed it to
> inline jsr292 methods after parsing (to run IGVN and reduce the graph;
> otherwise they blow up the number of ideal nodes and we bail out of
> compilation due to MaxNodeLimit).
>
> As the parser goes along and sees a call site, it checks whether it can
> inline it or not (see opto/bytecodeinfo.cpp, should_inline() and
> should_not_inline()). Several conditions drive inlining, and the following
> (most important) flags control them: MaxTrivialSize, MaxInlineSize,
> FreqInlineSize, InlineSmallCode, MinInliningThreshold, MaxInlineLevel,
> InlineFrequencyCount, InlineFrequencyRatio.
>
> The most tedious flag is InlineSmallCode (the size of compiled assembler
> code, unlike the other sizes, which are bytecode sizes), which controls
> inlining of already compiled methods. Usually, if a call site is hot, the
> callee is compiled before the caller. But sometimes the caller gets
> compiled first (if it has a hot loop, for example), so a different
> condition is used and as a result you can get performance variation
> between runs.
>
> The difference between MaxInlineSize (35) and FreqInlineSize (325) is that
> FreqInlineSize takes into account how frequently the call site is executed
> relative to caller invocations:
>
>   int call_site_count  = method()->scale_count(profile.count());
>   int invoke_count     = method()->interpreter_invocation_count();
>   int freq = call_site_count / invoke_count;
>   int max_inline_size  = MaxInlineSize;
>   // bump the max size if the call is frequent
>   if ((freq >= InlineFrequencyRatio) ||
>       (call_site_count >= InlineFrequencyCount) ||
>       is_unboxing_method(callee_method, C) ||
>       is_init_with_ea(callee_method, caller_method, C)) {
>     max_inline_size = FreqInlineSize;
>
> And there is an additional inlining condition for all methods whose size
> is > MaxTrivialSize:
>
>   if (!callee_method->was_executed_more_than(MinInliningThreshold)) {
>     set_msg("executed < MinInliningThreshold times");
>
> Regards,
> Vladimir
>
> On 5/14/15 10:03 AM, Vitaly Davidovich wrote:
>
>> I should also add that I see how inlining without taking call freq into
>> account could lead to faster time to peak performance for methods that
>> eventually get hot anyway but aren't at parse time.  Peak perf will be
>> the same if the method is too big for parse inlining but eventually gets
>> compiled due to reaching hotness.  Is that about right?
>>
>> sent from my phone
>>
>> On May 14, 2015 12:57 PM, "Vitaly Davidovich" <vitalyd at gmail.com> wrote:
>>
>>     Vladimir,
>>
>>     I'm comparing MaxInlineSize (35) with FreqInlineSize (325).  AFAIU,
>>     MaxInlineSize drives which methods are inlined at parse time by C2,
>>     whereas FreqInlineSize is the threshold for "late" (or what do you
>>     guys call inlining after parsing?) inlining.  Most of the inlining
>>     discussions (or worries, rather) seem to focus around the
>>     MaxInlineSize value, and not FreqInlineSize, even if the target
>>     method will get hot.
>>
>>         Usually, people care about 35 (= MaxInlineSize), because for
>>         methods up to MaxInlineSize their call frequency is ignored. So,
>>         fewer chances to end up with a non-inlined call.
>>
>>
>>     Ok, so for hot methods then MaxInlineSize isn't really a concern,
>>     and FreqInlineSize would be the threshold to worry about (for C2
>>     compiler) then? Why are people worried about inlining in cold paths
>>     then?
>>
>>     Thanks Vladimir
>>
>>     On Thu, May 14, 2015 at 12:36 PM, Vladimir Ivanov
>>     <vladimir.x.ivanov at oracle.com> wrote:
>>
>>         Vitaly,
>>
>>         Can you elaborate on your question a bit? What do you compare
>>         parse-time inlining with? The mention of C1 & profile pollution
>>         in this context confuses me.
>>
>>         Usually, people care about 35 (= MaxInlineSize), because for
>>         methods up to MaxInlineSize their call frequency is ignored. So,
>>         fewer chances to end up with a non-inlined call.
>>
>>         Best regards,
>>         Vladimir Ivanov
>>
>>         On 5/14/15 7:09 PM, Vitaly Davidovich wrote:
>>
>>             Any pointers? Sorry to bug you guys, but just want to make
>>             sure I understand this point as I see quite a bit of
>>             discussion on core-libs and elsewhere where people are
>>             worrying about the 35 bytecode size threshold for parse
>>             inlining.
>>
>>             On Wed, May 13, 2015 at 3:36 PM, Vitaly Davidovich
>>             <vitalyd at gmail.com> wrote:
>>
>>                  Hi guys,
>>
>>                  Could someone please explain the advantage, if any, of
>>                  parse time inlining in C2? Given that FreqInlineSize is
>>                  quite large by default, most hot methods will get inlined
>>                  anyway (well, ones that can be for other reasons).  What
>>                  is the advantage of parse time inlining?
>>
>>                  Is it quicker time to peak performance if C1 is reached
>>                  first?
>>
>>                  Does it ensure that a method is inlined whereas it may not
>>                  be if it's already compiled into a medium/large method
>>                  otherwise?
>>
>>                  Is parse time inlining not susceptible to profile
>>                  pollution? I suspect it is since the interpreter has
>>                  already profiled the inlinee either way, but wanted to
>>                  check.
>>
>>                  Anything else I'm not thinking about?
>>
>>                  Thanks
>>
>>
>>
>>


More information about the hotspot-compiler-dev mailing list