8234863: Increase default value of MaxInlineLevel

Thu Feb 20 15:43:57 UTC 2020

Claes,
  I've been running the RxJava microbenchmarks[1] with JDK8 and JDK11 with -XX:MaxInlineLevel=15, and a number of them regress in performance, most notably the ComputationSchedulerPerf:observeOn microbenchmark, which shows a 30% regression.  I've been looking into why, but have no definitive answers as of yet.

[1]:  https://github.com/ReactiveX/RxJava

-- 
Azeem Jiva 

On 2/19/20, 3:27 AM, "hotspot-compiler-dev on behalf of Claes Redestad" <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of claes.redestad at oracle.com> wrote:

    Hi Volker,

    On 2020-02-19 10:25, Volker Simonis wrote:
    > Hi Claes,
    > 
    > we've been experimenting with the increased MaxInlineLevel and seen
    > performance improvements  on a variety of benchmarks. Unfortunately
    > we've also seen some considerable performance regressions on single
    > benchmarks. While we still have to analyze these outliers in more
    > detail, one assumption is that they may be caused because we are
    > running into the MaxNodeLimit earlier now.

    we are still at zero observed regressions in our benchmarking efforts
    due to this change - with a number of very clear improvements in a variety
    of cases.

    > 
    > Have you seen such effects during your experiments and have you
    > experimented with increasing MaxNodeLimit along with MaxInlineLevel?

    No and mnyes. I've done a lot of experiments with various heuristics, 
    and MaxInlineLevel alone has been the only one with a very clear signal
    to noise ratio.

    On a theoretical level it's unsurprising that a deeper inlining level can
    mean some benchmarks hit the node limit earlier or with a different
    "timing", and that they would have been helped by an increase in that
    limit. On a practical level we must have data to support changes in
    the heuristics, and so far this is the first I've heard of a considerable
    regression.

    > 
    > I just saw that we already increase MaxNodeLimit by a factor of 3 if
    > we encounter an invokedynamic bytecode [1] and Shenandoah also
    > increases it globally by a factor of 3 [3]. So this means that when
    > running with Shenandoah and compiling a method with invokedynamic
    > we'll already use nine times the default of MaxNodeLimit (which is
    > 80000).

    Interesting.

    Is the single benchmark you're seeing regressions on something you might
    be able to share more details about? Do the hot methods encounter any
    indy bytecode? Does the regression appear with any GC?

    I can't speak for Shenandoah since we don't build and run with it,
    but IIUC they've removed some barriers lately and maybe no longer need
    to increase the limit all that much these days.

    /Claes
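
The node-limit arithmetic in Volker's quoted message (a default MaxNodeLimit of 80000, tripled for invokedynamic and tripled again under Shenandoah) can be illustrated with a small sketch. This only models the multiplication as described in the thread; the class name NodeLimitSketch and the method effectiveNodeLimit are hypothetical and are not HotSpot code, and the actual scaling logic in the VM may differ.

```java
public class NodeLimitSketch {
    // Default MaxNodeLimit as quoted in the thread.
    static final int DEFAULT_MAX_NODE_LIMIT = 80_000;
    // Scaling factor mentioned in the thread for both invokedynamic and Shenandoah.
    static final int SCALE = 3;

    // Hypothetical helper: applies the two factors from the thread independently.
    static int effectiveNodeLimit(int base, boolean hasInvokeDynamic, boolean shenandoah) {
        int limit = base;
        if (shenandoah) {
            limit *= SCALE;   // global increase described for Shenandoah
        }
        if (hasInvokeDynamic) {
            limit *= SCALE;   // per-method increase described for invokedynamic
        }
        return limit;
    }

    public static void main(String[] args) {
        System.out.println(effectiveNodeLimit(DEFAULT_MAX_NODE_LIMIT, false, false)); // 80000
        System.out.println(effectiveNodeLimit(DEFAULT_MAX_NODE_LIMIT, true, false));  // 240000
        System.out.println(effectiveNodeLimit(DEFAULT_MAX_NODE_LIMIT, false, true));  // 240000
        System.out.println(effectiveNodeLimit(DEFAULT_MAX_NODE_LIMIT, true, true));   // 720000
    }
}
```

With both factors applied, the sketch yields 720000, which matches the "nine times the default" figure in the quoted text.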