8058779: Faster implementation of String.replace(CharSequence, CharSequence)

Ivan Gerasimov ivan.gerasimov at oracle.com
Tue May 26 23:11:14 UTC 2015


I updated the webrev:
http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/

In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE with 
Integer.MAX_VALUE, which seems to be more accurate here.

And  I want to add that this proposed implementation is not only faster, 
but also more memory efficient.
The following simple stress-test shows that the proposed version is able 
to handle twice larger strings, comparing to the current implementation.

----------------------------------------
public class C {
     public static void main(String[] args) throws Throwable {
         String s = "string";
         for (int i = 1; i < Integer.MAX_VALUE; ++i) {
             try {
                 s = s.replace("string", "stringstring");
             } catch (OutOfMemoryError o) {
                 System.out.println(i + ") " + s.length());
                 break;
             }
         }
     }
}
----------------------------------------

$ time ~/java9/jdk/bin/java -showversion -Xmx1g C
java version "1.9.0-ea"
Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)

25) 100663296

real    0m4.525s
user    0m4.402s
sys    0m1.189s

$ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
java version "1.9.0-internal"
Java(TM) SE Runtime Environment (build 
1.9.0-internal-igerasim_2015_05_23_19_25-b00)
Java HotSpot(TM) 64-Bit Server VM (build 
1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)

26) 201326592

real    0m2.139s
user    0m1.960s
sys    0m0.461s

Sincerely yours,
Ivan

On 24.05.2015 23:17, Ivan Gerasimov wrote:
> Hello everybody!
>
> I know many people here like it when the performance is getting better.
> It was suggested to make the literal variant of String.replace() faster.
> Currently, this method is implemented as a few calls to regexp API, so 
> that the whole implementation takes only two lines of code.
>
> I've created two versions of the fix.
> In the first one, we scan the string and store indices of the found 
> substrings in an array.
> Then, we allocate the precisely sized char array and fill it it.
> The case with the empty target has to be handled separately.
>
> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/
>
> The second variant is much less verbose, however it's less efficient too.
> Here the StringJoiner is used as an intermediate storage.
>
> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
>
>
> Here are the micro-benchmark results (in a string of ~300 chars do ~15 
> replacements).
> 0) Baseline
> MyBenchmark.test  thrpt   40  257'051.948 ± 4537.484  ops/s
>
> 1) Heavy-duty +308%
> MyBenchmark.test  thrpt   40  1'049'235.602 ± 15501.803  ops/s
>
> 2) StringJoiner +190%
> MyBenchmark.test  thrpt   40  746'000.629 ± 15387.036  ops/s
>
>
> Personally, I like my first variant better, even though it adds almost 
> 300 lines of code.
> But I'd like to hear what people think of it.
>
> Sincerely yours,
> Ivan
>
>




More information about the core-libs-dev mailing list