8058779: Faster implementation of String.replace(CharSequence, CharSequence)
Ivan Gerasimov
ivan.gerasimov at oracle.com
Tue May 26 23:11:14 UTC 2015
I updated the webrev:
http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/
In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE with
Integer.MAX_VALUE, which seems to be more accurate here.
And I want to add that this proposed implementation is not only faster,
but also more memory efficient.
The following simple stress-test shows that the proposed version is able
to handle twice larger strings, comparing to the current implementation.
----------------------------------------
public class C {
public static void main(String[] args) throws Throwable {
String s = "string";
for (int i = 1; i < Integer.MAX_VALUE; ++i) {
try {
s = s.replace("string", "stringstring");
} catch (OutOfMemoryError o) {
System.out.println(i + ") " + s.length());
break;
}
}
}
}
----------------------------------------
$ time ~/java9/jdk/bin/java -showversion -Xmx1g C
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)
25) 100663296
real 0m4.525s
user 0m4.402s
sys 0m1.189s
$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build
1.9.0-internal-igerasim_2015_05_23_19_25-b00)
Java HotSpot(TM) 64-Bit Server VM (build
1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)
26) 201326592
real 0m2.139s
user 0m1.960s
sys 0m0.461s
Sincerely yours,
Ivan
On 24.05.2015 23:17, Ivan Gerasimov wrote:
> Hello everybody!
>
> I know many people here like it when the performance is getting better.
> It was suggested to make the literal variant of String.replace() faster.
> Currently, this method is implemented as a few calls to regexp API, so
> that the whole implementation takes only two lines of code.
>
> I've created two versions of the fix.
> In the first one, we scan the string and store indices of the found
> substrings in an array.
> Then, we allocate the precisely sized char array and fill it it.
> The case with the empty target has to be handled separately.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/
>
> The second variant is much less verbose, however it's less efficient too.
> Here the StringJoiner is used as an intermediate storage.
>
> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
>
>
> Here are the micro-benchmark results (in a string of ~300 chars do ~15
> replacements).
> 0) Baseline
> MyBenchmark.test thrpt 40 257'051.948 ± 4537.484 ops/s
>
> 1) Heavy-duty +308%
> MyBenchmark.test thrpt 40 1'049'235.602 ± 15501.803 ops/s
>
> 2) StringJoiner +190%
> MyBenchmark.test thrpt 40 746'000.629 ± 15387.036 ops/s
>
>
> Personally, I like my first variant better, even though it adds almost
> 300 lines of code.
> But I'd like to hear what people think of it.
>
> Sincerely yours,
> Ivan
>
>
More information about the core-libs-dev
mailing list