RFR: 8155600 - Performance optimization of Arrays.asList().iterator()

Thu Apr 28 08:29:47 UTC 2016

Hi Tagir,

I think this is a nice optimization. I like the part that allows EA to 
optimize-away the Arrays.ArrayList allocation when used in foreach loop.

Regards, Peter

On 04/28/2016 09:37 AM, Tagir F. Valeev wrote:
> Hello!
>
> Please review and sponsor the following patch:
> https://bugs.openjdk.java.net/browse/JDK-8155600
> http://cr.openjdk.java.net/~tvaleev/webrev/8155600/r1/
>
> It appears that custom implementation of Arrays.asList().iterator() is
> noticeably more performant than the AbstractList implementation used
> currently. Here's JMH test which illustrates this:
>
> http://cr.openjdk.java.net/~tvaleev/webrev/8155600/jmh/
>
>      @Benchmark
>      public int sumList() {
>          return sum(list);
>      }
>
>      @Benchmark
>      public int sumArray() {
>          return sum(Arrays.asList(arr));
>      }
>
>      public static int sum(Iterable<Integer> data) {
>          int sum = 0;
>          for (int item : data) sum+=item;
>          return sum;
>      }
>
> Here's summary result on my Intel x64 machine:
>
> JDK 9ea+115:
> Benchmark           (limit)  Mode  Cnt     Score    Error  Units
> ArrayTest.sumArray        0  avgt   50     6,371 ±  0,009  ns/op
> ArrayTest.sumArray        1  avgt   50     8,020 ±  0,100  ns/op
> ArrayTest.sumArray       10  avgt   50    14,424 ±  0,060  ns/op
> ArrayTest.sumArray      100  avgt   50    88,684 ±  3,261  ns/op
> ArrayTest.sumArray     1000  avgt   50   677,048 ±  1,050  ns/op
> ArrayTest.sumArray    10000  avgt   50  8038,224 ± 59,678  ns/op
> ArrayTest.sumList         0  avgt   50     3,395 ±  0,003  ns/op
> ArrayTest.sumList         1  avgt   50     5,613 ±  0,061  ns/op
> ArrayTest.sumList        10  avgt   50    13,448 ±  0,076  ns/op
> ArrayTest.sumList       100  avgt   50    97,242 ±  0,105  ns/op
> ArrayTest.sumList      1000  avgt   50   664,342 ±  0,704  ns/op
> ArrayTest.sumList     10000  avgt   50  7971,230 ± 61,344  ns/op
>
> JDK 9ea+115 patched:
> Benchmark           (limit)  Mode  Cnt     Score    Error  Units
> ArrayTest.sumArray        0  avgt   50     2,932 ±  0,011  ns/op
> ArrayTest.sumArray        1  avgt   50     5,000 ±  0,062  ns/op
> ArrayTest.sumArray       10  avgt   50     9,421 ±  0,219  ns/op
> ArrayTest.sumArray      100  avgt   50    43,848 ±  0,187  ns/op
> ArrayTest.sumArray     1000  avgt   50   383,619 ±  1,019  ns/op
> ArrayTest.sumArray    10000  avgt   50  6860,087 ± 13,022  ns/op
> ArrayTest.sumList         0  avgt   50     3,400 ±  0,008  ns/op
> ArrayTest.sumList         1  avgt   50     6,142 ±  0,047  ns/op
> ArrayTest.sumList        10  avgt   50    12,488 ±  0,084  ns/op
> ArrayTest.sumList       100  avgt   50    83,029 ±  0,078  ns/op
> ArrayTest.sumList      1000  avgt   50   605,651 ±  0,992  ns/op
> ArrayTest.sumList     10000  avgt   50  7649,681 ± 16,355  ns/op
>
> AbstractList.iterator() makes unnecessary bookkeeping like modCount
> tracking which reduces the performance. Also it seems that having
> implicit this$0 field in AbstractList.iterator() makes impossible to
> avoid allocation of short-lived Arrays.ArrayList object in sumArray()
> test. Launching JMH -prof gc confirms that non-patched sumArray() test
> allocates 24 bytes per invocation while patched sumArray() does not
> allocate anything.
>
> What do you think? Is it reasonable change? Should I check some tests
> or add new ones for this optimization?
>
> With best regards,
> Tagir Valeev.
>