RFR: 8156071: List.of: reduce array copying during creation
Paul Sandoz
psandoz at openjdk.java.net
Fri Oct 2 20:41:45 UTC 2020
On Thu, 1 Oct 2020 00:13:28 GMT, Stuart Marks <smarks at openjdk.org> wrote:
> Plumb new internal static factory method to trust the array passed in, avoiding unnecessary copying. JMH results for
> the benchmark show about 15% improvement for the cases that were optimized, namely the 3 to 10 fixed arg cases.
> # VM options: -verbose:gc -XX:+UseParallelGC -Xms4g -Xmx4g --enable-preview -verbose:gc -XX:+UseParallelGC -Xms4g -Xmx4g -Xint
> # Warmup: 5 iterations, 1 s each
> # Measurement: 5 iterations, 2 s each
>
> WITHOUT varargs optimization:
>
> Benchmark         Mode  Cnt     Score     Error   Units
> ListArgs.list00  thrpt   15  6019.539 ± 144.040  ops/ms
> ListArgs.list01  thrpt   15  1985.009 ±  40.606  ops/ms
> ListArgs.list02  thrpt   15  1854.812 ±  17.488  ops/ms
> ListArgs.list03  thrpt   15   963.866 ±  10.262  ops/ms
> ListArgs.list04  thrpt   15   908.116 ±   6.278  ops/ms
> ListArgs.list05  thrpt   15   848.607 ±  16.701  ops/ms
> ListArgs.list06  thrpt   15   822.282 ±   8.905  ops/ms
> ListArgs.list07  thrpt   15   780.057 ±  11.214  ops/ms
> ListArgs.list08  thrpt   15   745.295 ±  19.204  ops/ms
> ListArgs.list09  thrpt   15   704.596 ±  14.003  ops/ms
> ListArgs.list10  thrpt   15   696.436 ±   4.914  ops/ms
> ListArgs.list11  thrpt   15   661.908 ±  11.041  ops/ms
>
> WITH varargs optimization:
>
> Benchmark         Mode  Cnt     Score     Error   Units
> ListArgs.list00  thrpt   15  6172.298 ±  62.736  ops/ms
> ListArgs.list01  thrpt   15  1987.724 ±  45.468  ops/ms
> ListArgs.list02  thrpt   15  1843.419 ±  10.693  ops/ms
> ListArgs.list03  thrpt   15  1126.946 ±  30.952  ops/ms
> ListArgs.list04  thrpt   15  1050.440 ±  17.859  ops/ms
> ListArgs.list05  thrpt   15   999.275 ±  23.656  ops/ms
> ListArgs.list06  thrpt   15   948.844 ±  19.615  ops/ms
> ListArgs.list07  thrpt   15   897.541 ±  15.531  ops/ms
> ListArgs.list08  thrpt   15   853.359 ±  18.755  ops/ms
> ListArgs.list09  thrpt   15   826.394 ±   8.284  ops/ms
> ListArgs.list10  thrpt   15   779.231 ±   4.104  ops/ms
> ListArgs.list11  thrpt   15   650.888 ±   3.948  ops/ms
Looks good. I wondered why the performance results were so slow, then I looked more closely and saw that "-Xint" was
used. I usually don't ascribe much value to microbenchmarks run in interpreter-only mode, but hey, any shaving off of
startup time is welcome. Less allocation is definitely welcome (although I do wish C2 were better at eliding redundant
array initialization and allocation).
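
For readers who have not looked at the patch, the idea in the quoted description (trust an array the factory created
itself instead of copying it again) can be sketched roughly as below. This is only an illustration, not the code under
review: the class TrustedArraySketch and the methods listCopying, listTrusted and of are hypothetical names made up for
this example.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; none of these names come from the JDK patch.
public class TrustedArraySketch {

    /** Minimal unmodifiable list backed by an array it assumes nobody else can change. */
    static final class ArrayBackedList<E> extends AbstractList<E> {
        private final Object[] elements;

        private ArrayBackedList(Object[] elements) {
            this.elements = elements;
        }

        @Override
        @SuppressWarnings("unchecked")
        public E get(int index) {
            return (E) elements[index];
        }

        @Override
        public int size() {
            return elements.length;
        }
    }

    /** Untrusted path: the caller may still hold the array, so copy it defensively. */
    @SafeVarargs
    public static <E> List<E> listCopying(E... input) {
        Object[] copy = new Object[input.length];
        for (int i = 0; i < input.length; i++) {
            copy[i] = Objects.requireNonNull(input[i]);
        }
        return new ArrayBackedList<>(copy);
    }

    /** Trusted path: the array is not shared with the caller; null-check only, no copy. */
    public static <E> List<E> listTrusted(Object... input) {
        for (Object o : input) {
            Objects.requireNonNull(o);
        }
        return new ArrayBackedList<>(input);
    }

    /** Fixed-arity factory: the varargs array built here is never visible to the caller. */
    public static <E> List<E> of(E e1, E e2, E e3) {
        return listTrusted(e1, e2, e3);
    }

    public static void main(String[] args) {
        String[] callerArray = {"a", "b", "c"};
        List<String> copied = listCopying(callerArray);
        callerArray[0] = "changed";          // does not affect the copied list
        System.out.println(copied);          // prints [a, b, c]
        System.out.println(of("x", "y", "z")); // prints [x, y, z]
    }
}
```

In this sketch the fixed-arity method already allocates one array for its arguments, so copying it a second time
would be pure overhead; the copying path remains necessary whenever the caller supplied the array.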
-------------
Marked as reviewed by psandoz (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/449