[NEW BUG][JDK12] Avoid subList overhead in String.split and Pattern.split

Christoph Dreis christoph.dreis at freenet.de
Sat Oct 20 17:43:57 UTC 2018


Hi,

while looking through the code of String.split() and Pattern.split(), I
wondered if we could get rid of the calls to list.subList(0, resultSize) if
resultSize matches the size of the list. If that's the case, subList() seems
redundant.

JMH Benchmarks in single-shot mode show the following results with a new
revised version (attached below):

Benchmark                                                            Mode
Cnt      Score       Error   Units
MyBenchmark.testNew                                       ss   10  12957,900
±  7800,863   ns/op
MyBenchmark.testNew:·gc.alloc.rate               ss   10      6,248 ±
3,231  MB/sec
MyBenchmark.testNew:·gc.alloc.rate.norm    ss   10   1384,000 ±     0,001
B/op
MyBenchmark.testNew:·gc.count                     ss   10        ? 0
counts
MyBenchmark.testOld                                        ss   10
25017,800 ± 19178,411   ns/op
MyBenchmark.testOld:·gc.alloc.rate                ss   10      4,887 ±
3,764  MB/sec
MyBenchmark.testOld:·gc.alloc.rate.norm     ss   10   1464,000 ±     0,001
B/op
MyBenchmark.testOld:·gc.count                      ss   10        ? 0
counts

Other benchmark modes show inconclusive results from the throughput
perspective, yet it's clearly visible that less objects are allocated.

Could anybody verify if that's a worthwhile improvement? And in case it is,
potentially sponsor it?

Thanks in advance.
Cheers,
Christoph


===== PATCH =====
diff -r 5e894b0f5e63 src/java.base/share/classes/java/lang/String.java
--- a/src/java.base/share/classes/java/lang/String.java Fri Oct 19 16:29:45
2018 -0700
+++ b/src/java.base/share/classes/java/lang/String.java Sat Oct 20 18:49:02
2018 +0200
@@ -2315,6 +2315,9 @@
                 }
             }
             String[] result = new String[resultSize];
+            if (list.size() == resultSize) {
+                return list.toArray(result);
+            }
             return list.subList(0, resultSize).toArray(result);
         }
         return Pattern.compile(regex).split(this, limit);
diff -r 5e894b0f5e63
src/java.base/share/classes/java/util/regex/Pattern.java
--- a/src/java.base/share/classes/java/util/regex/Pattern.java  Fri Oct 19
16:29:45 2018 -0700
+++ b/src/java.base/share/classes/java/util/regex/Pattern.java  Sat Oct 20
18:49:02 2018 +0200
@@ -1293,6 +1293,9 @@
             while (resultSize > 0 &&
matchList.get(resultSize-1).equals(""))
                 resultSize--;
         String[] result = new String[resultSize];
+        if (matchList.size() == resultSize) {
+            return matchList.toArray(result);
+        }
         return matchList.subList(0, resultSize).toArray(result);
     } 



More information about the core-libs-dev mailing list