RFR(S): 8235673: [C1, C2] Split inlining control flags

Doerr, Martin martin.doerr at sap.com
Thu May 7 13:42:05 UTC 2020


Hi Tobias,

thanks for looking at my change.
It is only expected to influence startup, not peak performance.

Nevertheless, I've run benchmarks to check peak performance as well: SPEC jvm 2008, SPEC jbb 2015
No regressions observable, as expected.

For startup performance, I've ran SPEC jbb 2005 with throughput measurements every 1.5 seconds like I had shown in my fosdem talk (https://fosdem.org/2020/schedule/event/jit2020/).
I couldn't observe any regression, either.

It would be very helpful if other people (e.g. from Oracle) could run additional benchmarks. I don't know what you use to check startup performance.

> Did you verify that tests using these flags are still working as expected (i.e.,
> intend to only adjust C2's behavior)?
Using the existing flags still works for C2. So there's no issue with C2 tests.
I'm not aware of any test which requires one of these flags to modify C1 behavior.

I've run a substantial amount of tests and couldn't find any related issues:
jtreg, jck (normal, with -Xcomp, with -Xcomp -XX:-TieredCompilation), SAP's proprietary tests

Best regards,
Martin


> -----Original Message-----
> From: Tobias Hartmann <tobias.hartmann at oracle.com>
> Sent: Donnerstag, 7. Mai 2020 11:28
> To: Doerr, Martin <martin.doerr at sap.com>; Nils Eliasson
> <nils.eliasson at oracle.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
> 
> Hi Martin,
> 
> looks good to me too but I'm a bit concerned about C1InlineStackLimit
> affecting performance. What
> benchmarks did you run?
> 
> Did you verify that tests using these flags are still working as expected (i.e.,
> intend to only
> adjust C2's behavior)?
> 
> Thanks,
> Tobias
> 
> On 04.05.20 18:04, Doerr, Martin wrote:
> > Hi Nils,
> >
> > thank you for looking at this and sorry for the late reply.
> >
> > I've added MaxTrivialSize and also updated the issue accordingly. Makes
> sense.
> > Do you have more flags in mind?
> >
> > Moving the flags which are only used by C2 into c2_globals definitely makes
> sense.
> >
> > Done in webrev.01:
> > http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.01/
> >
> > Please take a look and let me know when my proposal is ready for a CSR.
> >
> > Best regards,
> > Martin
> >
> >
> >> -----Original Message-----
> >> From: hotspot-compiler-dev <hotspot-compiler-dev-
> >> bounces at openjdk.java.net> On Behalf Of Nils Eliasson
> >> Sent: Dienstag, 28. April 2020 18:29
> >> To: hotspot-compiler-dev at openjdk.java.net
> >> Subject: Re: RFR(S): 8235673: [C1, C2] Split inlining control flags
> >>
> >> Hi,
> >>
> >> Thanks for addressing this! This has been an annoyance for a long time.
> >>
> >> Have you though about including other flags - like MaxTrivialSize?
> >> MaxInlineSize is tested against it.
> >>
> >> Also - you should move the flags that are now c2-only to c2_globals.hpp.
> >>
> >> Best regards,
> >> Nils Eliasson
> >>
> >> On 2020-04-27 15:06, Doerr, Martin wrote:
> >>> Hi,
> >>>
> >>> while tuning inlining parameters for C2 compiler with JDK-8234863 we
> had
> >> discussed impact on C1.
> >>> I still think it's bad to share them between both compilers. We may want
> to
> >> do further C2 tuning without negative impact on C1 in the future.
> >>>
> >>> C1 has issues with substantial inlining because of the lack of uncommon
> >> traps. When C1 inlines a lot, stack frames may get large and code cache
> space
> >> may get wasted for cold or even never executed code. The situation gets
> >> worse when many patching stubs get used for such code.
> >>>
> >>> I had opened the following issue:
> >>> https://bugs.openjdk.java.net/browse/JDK-8235673
> >>>
> >>> And my initial proposal is here:
> >>> http://cr.openjdk.java.net/~mdoerr/8235673_C1_inlining/webrev.00/
> >>>
> >>>
> >>> Part of my proposal is to add an additional flag which I called
> >> C1InlineStackLimit to reduce stack utilization for C1 methods.
> >>> I have a simple example which shows wasted stack space (java example
> >> TestStack at the end).
> >>>
> >>> It simply counts stack frames until a stack overflow occurs. With the
> current
> >> implementation, only 1283 frames fit on the stack because the never
> >> executed method bogus_test with local variables gets inlined.
> >>> Reduced C1InlineStackLimit avoids inlining of bogus_test and we get
> 2310
> >> frames until stack overflow. (I only used C1 for this example. Can be
> >> reproduced as shown below.)
> >>>
> >>> I didn't notice any performance regression even with the aggressive
> setting
> >> of C1InlineStackLimit=5 with TieredCompilation.
> >>>
> >>> I know that I'll need a CSR for this change, but I'd like to get feedback in
> >> general and feedback about the flag names before creating a CSR.
> >>> I'd also be glad about feedback regarding the performance impact.
> >>>
> >>> Best regards,
> >>> Martin
> >>>
> >>>
> >>>
> >>> Command line:
> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=20 -
> >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining -
> >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow
> >> TestStack
> >>> CompileCommand: compileonly TestStack.triggerStackOverflow
> >>>                                @ 8   TestStack::triggerStackOverflow (15 bytes)
> recursive
> >> inlining too deep
> >>>                                @ 11   TestStack::bogus_test (33 bytes)   inline
> >>> caught java.lang.StackOverflowError
> >>> 1283 activations were on stack, sum = 0
> >>>
> >>> jdk/bin/java -XX:TieredStopAtLevel=1 -XX:C1InlineStackLimit=10 -
> >> XX:C1MaxRecursiveInlineLevel=0 -Xss256k -Xbatch -XX:+PrintInlining -
> >> XX:CompileCommand=compileonly,TestStack::triggerStackOverflow
> >> TestStack
> >>> CompileCommand: compileonly TestStack.triggerStackOverflow
> >>>                                @ 8   TestStack::triggerStackOverflow (15 bytes)
> recursive
> >> inlining too deep
> >>>                                @ 11   TestStack::bogus_test (33 bytes)   callee uses too
> >> much stack
> >>> caught java.lang.StackOverflowError
> >>> 2310 activations were on stack, sum = 0
> >>>
> >>>
> >>> TestStack.java:
> >>> public class TestStack {
> >>>
> >>>      static long cnt = 0,
> >>>                  sum = 0;
> >>>
> >>>      public static void bogus_test() {
> >>>          long c1 = 1, c2 = 2, c3 = 3, c4 = 4;
> >>>          sum += c1 + c2 + c3 + c4;
> >>>      }
> >>>
> >>>      public static void triggerStackOverflow() {
> >>>          cnt++;
> >>>          triggerStackOverflow();
> >>>          bogus_test();
> >>>      }
> >>>
> >>>
> >>>      public static void main(String args[]) {
> >>>          try {
> >>>              triggerStackOverflow();
> >>>          } catch (StackOverflowError e) {
> >>>              System.out.println("caught " + e);
> >>>          }
> >>>          System.out.println(cnt + " activations were on stack, sum = " + sum);
> >>>      }
> >>> }
> >>>
> >


More information about the hotspot-compiler-dev mailing list