8234863: Increase default value of MaxInlineLevel
Jiva, Azeem
javajiva at amazon.com
Thu Feb 20 15:43:57 UTC 2020
Claes,
I've been running the RxJava microbenchmarks[1] on JDK 8 and JDK 11 with -XX:MaxInlineLevel=15, and a number of them regress in performance, most notably ComputationSchedulerPerf:observeOn, which regresses by 30%. I've been looking into why, but have no definitive answers yet.
[1]: https://github.com/ReactiveX/RxJava
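When comparing runs like these, it can help to confirm which values the JVM actually picked up. The following is a minimal sketch, assuming a HotSpot JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name FlagProbe and the helper read are hypothetical and not part of RxJava or the JDK.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;
import java.util.Optional;

// Hypothetical helper: prints the inlining-related flags in effect for this JVM.
public class FlagProbe {

    // Returns "value (origin: ...)" for the flag, or empty if this VM does not know it.
    static Optional<String> read(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            VMOption option = bean.getVMOption(flagName);
            return Optional.of(option.getValue() + " (origin: " + option.getOrigin() + ")");
        } catch (IllegalArgumentException e) {
            // Thrown when the flag does not exist in this VM build.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        for (String flag : new String[] {"MaxInlineLevel", "MaxNodeLimit"}) {
            System.out.println(flag + " = " + read(flag).orElse("not available in this VM"));
        }
    }
}
```

Running it once with -XX:MaxInlineLevel=15 and once without shows whether the flag was set on the command line or left at the build's default.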
--
Azeem Jiva
On 2/19/20, 3:27 AM, "hotspot-compiler-dev on behalf of Claes Redestad" <hotspot-compiler-dev-bounces at openjdk.java.net on behalf of claes.redestad at oracle.com> wrote:
Hi Volker,
On 2020-02-19 10:25, Volker Simonis wrote:
> Hi Claes,
>
> we've been experimenting with the increased MaxInlineLevel and seen
> performance improvements on a variety of benchmarks. Unfortunately
> we've also seen some considerable performance regressions on single
> benchmarks. While we still have to analyze these outliers in more
> detail, one assumption is that they may be caused because we are
> running into the MaxNodeLimit earlier now.
We are still at zero observed regressions in our benchmarking efforts
due to this change - with a number of very clear improvements in a variety
of cases.
>
> Have you seen such effects during your experiments and have you
> experimented with increasing MaxNodeLimit along with MaxInlineLevel?
No and mnyes. I've done a lot of experiments with various heuristics,
and MaxInlineLevel alone has been the only one with a very clear signal
to noise ratio.
On a theoretical level it's unsurprising that a deeper inlining level can
mean some benchmarks hit the node limit earlier or with a different
"timing", and that they would have been helped by an increase in that
limit. On a practical level we must have data to support changes in
the heuristics, and so far this is the first I've heard of a considerable
regression.
>
> I just saw that we already increase MaxNodeLimit by a factor of 3 if
> we encounter an invokedynamic bytecode [1] and Shenandoah also
> increases it globally by a factor of 3 [3]. So this means that when
> running with Shenandoah and compiling a method with invokedynamic
> we'll already use nine times the default of MaxNodeLimit (which is
> 80000).
Interesting.
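The arithmetic in the quoted paragraph can be sketched as follows. This is only an illustration of the multiplication Volker describes (default 80000, times 3 for Shenandoah, times 3 for invokedynamic); the class and method names are hypothetical and this is not the HotSpot code.

```java
// Hypothetical illustration of the node-limit scaling described above.
public class NodeLimitSketch {
    static final int DEFAULT_MAX_NODE_LIMIT = 80_000; // default quoted in the thread
    static final int SCALE = 3;                       // factor quoted for both cases

    // Applies each factor of 3 independently, as the thread describes.
    static int effectiveLimit(boolean shenandoah, boolean hasInvokeDynamic) {
        int limit = DEFAULT_MAX_NODE_LIMIT;
        if (shenandoah) {
            limit *= SCALE;
        }
        if (hasInvokeDynamic) {
            limit *= SCALE;
        }
        return limit;
    }

    public static void main(String[] args) {
        // Shenandoah plus invokedynamic: 80000 * 3 * 3
        System.out.println(effectiveLimit(true, true)); // prints 720000
    }
}
```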
Is the single benchmark you're seeing regressions on something you might
be able to share more details about? Do the hot methods encounter any
indy bytecode? Does the regression appear with any GC?
I can't speak for Shenandoah since we don't build and run with it,
but IIUC they've removed some barriers lately and maybe no longer need
to increase the limit all that much these days.
/Claes