[NEW BUG][JDK12] Avoid subList overhead in String.split and Pattern.split
Roger Riggs
Roger.Riggs at oracle.com
Tue Oct 23 19:26:27 UTC 2018
Hi Christoph,
Can you take another look at the code just ahead of those instances.
Suppose when limit == 0, the empty elements were removed from the tail
of list.
It would seem then that the matchList would always be exactly the size.
Can you compare the performance?
Adding a check for an optimization usually is a bit slower in every
non-optimized case and that's worth watching out for.
Given the overhead of proposing and reviewing a change, will this be
worth it?
$.02, Roger
On 10/20/2018 01:43 PM, Christoph Dreis wrote:
> Hi,
>
> while looking through the code of String.split() and Pattern.split(), I
> wondered if we could get rid of the calls to list.subList(0, resultSize) if
> resultSize matches the size of the list. If that's the case, subList() seems
> redundant.
>
> JMH Benchmarks in single-shot mode show the following results with a new
> revised version (attached below):
>
> Benchmark Mode
> Cnt Score Error Units
> MyBenchmark.testNew ss 10 12957,900
> ± 7800,863 ns/op
> MyBenchmark.testNew:·gc.alloc.rate ss 10 6,248 ±
> 3,231 MB/sec
> MyBenchmark.testNew:·gc.alloc.rate.norm ss 10 1384,000 ± 0,001
> B/op
> MyBenchmark.testNew:·gc.count ss 10 ? 0
> counts
> MyBenchmark.testOld ss 10
> 25017,800 ± 19178,411 ns/op
> MyBenchmark.testOld:·gc.alloc.rate ss 10 4,887 ±
> 3,764 MB/sec
> MyBenchmark.testOld:·gc.alloc.rate.norm ss 10 1464,000 ± 0,001
> B/op
> MyBenchmark.testOld:·gc.count ss 10 ? 0
> counts
>
> Other benchmark modes show inconclusive results from the throughput
> perspective, yet it's clearly visible that less objects are allocated.
>
> Could anybody verify if that's a worthwhile improvement? And in case it is,
> potentially sponsor it?
>
> Thanks in advance.
> Cheers,
> Christoph
>
>
> ===== PATCH =====
> diff -r 5e894b0f5e63 src/java.base/share/classes/java/lang/String.java
> --- a/src/java.base/share/classes/java/lang/String.java Fri Oct 19 16:29:45
> 2018 -0700
> +++ b/src/java.base/share/classes/java/lang/String.java Sat Oct 20 18:49:02
> 2018 +0200
> @@ -2315,6 +2315,9 @@
> }
> }
> String[] result = new String[resultSize];
> + if (list.size() == resultSize) {
> + return list.toArray(result);
> + }
> return list.subList(0, resultSize).toArray(result);
> }
> return Pattern.compile(regex).split(this, limit);
> diff -r 5e894b0f5e63
> src/java.base/share/classes/java/util/regex/Pattern.java
> --- a/src/java.base/share/classes/java/util/regex/Pattern.java Fri Oct 19
> 16:29:45 2018 -0700
> +++ b/src/java.base/share/classes/java/util/regex/Pattern.java Sat Oct 20
> 18:49:02 2018 +0200
> @@ -1293,6 +1293,9 @@
> while (resultSize > 0 &&
> matchList.get(resultSize-1).equals(""))
> resultSize--;
> String[] result = new String[resultSize];
> + if (matchList.size() == resultSize) {
> + return matchList.toArray(result);
> + }
> return matchList.subList(0, resultSize).toArray(result);
> }
>
More information about the core-libs-dev
mailing list