[NEW BUG][JDK12] Avoid subList overhead in String.split and Pattern.split

Roger Riggs Roger.Riggs at oracle.com
Tue Oct 23 19:26:27 UTC 2018


Hi Christoph,

Can you take another look at the code just ahead of those instances.

Suppose when limit == 0, the empty elements were removed from the tail 
of list.
It would seem then that the matchList would always be exactly the size.
Can you compare the performance?
Adding a check for an optimization usually is a bit slower in every
non-optimized case and that's worth watching out for.

Given the overhead of proposing and reviewing a change, will this be 
worth it?

$.02, Roger


On 10/20/2018 01:43 PM, Christoph Dreis wrote:
> Hi,
>
> while looking through the code of String.split() and Pattern.split(), I
> wondered if we could get rid of the calls to list.subList(0, resultSize) if
> resultSize matches the size of the list. If that's the case, subList() seems
> redundant.
>
> JMH Benchmarks in single-shot mode show the following results with a new
> revised version (attached below):
>
> Benchmark                                                            Mode
> Cnt      Score       Error   Units
> MyBenchmark.testNew                                       ss   10  12957,900
> ±  7800,863   ns/op
> MyBenchmark.testNew:·gc.alloc.rate               ss   10      6,248 ±
> 3,231  MB/sec
> MyBenchmark.testNew:·gc.alloc.rate.norm    ss   10   1384,000 ±     0,001
> B/op
> MyBenchmark.testNew:·gc.count                     ss   10        ? 0
> counts
> MyBenchmark.testOld                                        ss   10
> 25017,800 ± 19178,411   ns/op
> MyBenchmark.testOld:·gc.alloc.rate                ss   10      4,887 ±
> 3,764  MB/sec
> MyBenchmark.testOld:·gc.alloc.rate.norm     ss   10   1464,000 ±     0,001
> B/op
> MyBenchmark.testOld:·gc.count                      ss   10        ? 0
> counts
>
> Other benchmark modes show inconclusive results from the throughput
> perspective, yet it's clearly visible that less objects are allocated.
>
> Could anybody verify if that's a worthwhile improvement? And in case it is,
> potentially sponsor it?
>
> Thanks in advance.
> Cheers,
> Christoph
>
>
> ===== PATCH =====
> diff -r 5e894b0f5e63 src/java.base/share/classes/java/lang/String.java
> --- a/src/java.base/share/classes/java/lang/String.java Fri Oct 19 16:29:45
> 2018 -0700
> +++ b/src/java.base/share/classes/java/lang/String.java Sat Oct 20 18:49:02
> 2018 +0200
> @@ -2315,6 +2315,9 @@
>                   }
>               }
>               String[] result = new String[resultSize];
> +            if (list.size() == resultSize) {
> +                return list.toArray(result);
> +            }
>               return list.subList(0, resultSize).toArray(result);
>           }
>           return Pattern.compile(regex).split(this, limit);
> diff -r 5e894b0f5e63
> src/java.base/share/classes/java/util/regex/Pattern.java
> --- a/src/java.base/share/classes/java/util/regex/Pattern.java  Fri Oct 19
> 16:29:45 2018 -0700
> +++ b/src/java.base/share/classes/java/util/regex/Pattern.java  Sat Oct 20
> 18:49:02 2018 +0200
> @@ -1293,6 +1293,9 @@
>               while (resultSize > 0 &&
> matchList.get(resultSize-1).equals(""))
>                   resultSize--;
>           String[] result = new String[resultSize];
> +        if (matchList.size() == resultSize) {
> +            return matchList.toArray(result);
> +        }
>           return matchList.subList(0, resultSize).toArray(result);
>       }
>



More information about the core-libs-dev mailing list