8058779: Faster implementation of String.replace(CharSequence, CharSequence)

Xueming Shen xueming.shen at oracle.com
Tue May 26 23:38:00 UTC 2015


Ivan,

It might be worth trying String.index + StringBuilder, instead of 
writing/handling everything yourself.
Yes, it inevitably adds an arraycopy at the end to convert the 
StrinbBuilder to String, but it might have
better balance between performance and code complexity. The regex is 
probably a little heavy for
literal string replacement, but StringBuilder should not be that bad ...

-Sherman

On 5/26/15 4:11 PM, Ivan Gerasimov wrote:
> I updated the webrev:
> http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/
>
> In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE with 
> Integer.MAX_VALUE, which seems to be more accurate here.
>
> And  I want to add that this proposed implementation is not only 
> faster, but also more memory efficient.
> The following simple stress-test shows that the proposed version is 
> able to handle twice larger strings, comparing to the current 
> implementation.
>
> ----------------------------------------
> public class C {
>     public static void main(String[] args) throws Throwable {
>         String s = "string";
>         for (int i = 1; i < Integer.MAX_VALUE; ++i) {
>             try {
>                 s = s.replace("string", "stringstring");
>             } catch (OutOfMemoryError o) {
>                 System.out.println(i + ") " + s.length());
>                 break;
>             }
>         }
>     }
> }
> ----------------------------------------
>
> $ time ~/java9/jdk/bin/java -showversion -Xmx1g C
> java version "1.9.0-ea"
> Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
> Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)
>
> 25) 100663296
>
> real    0m4.525s
> user    0m4.402s
> sys    0m1.189s
>
> $ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
> java version "1.9.0-internal"
> Java(TM) SE Runtime Environment (build 
> 1.9.0-internal-igerasim_2015_05_23_19_25-b00)
> Java HotSpot(TM) 64-Bit Server VM (build 
> 1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)
>
> 26) 201326592
>
> real    0m2.139s
> user    0m1.960s
> sys    0m0.461s
>
> Sincerely yours,
> Ivan
>
> On 24.05.2015 23:17, Ivan Gerasimov wrote:
>> Hello everybody!
>>
>> I know many people here like it when the performance is getting better.
>> It was suggested to make the literal variant of String.replace() faster.
>> Currently, this method is implemented as a few calls to regexp API, 
>> so that the whole implementation takes only two lines of code.
>>
>> I've created two versions of the fix.
>> In the first one, we scan the string and store indices of the found 
>> substrings in an array.
>> Then, we allocate the precisely sized char array and fill it it.
>> The case with the empty target has to be handled separately.
>>
>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
>> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/
>>
>> The second variant is much less verbose, however it's less efficient 
>> too.
>> Here the StringJoiner is used as an intermediate storage.
>>
>> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
>>
>>
>> Here are the micro-benchmark results (in a string of ~300 chars do 
>> ~15 replacements).
>> 0) Baseline
>> MyBenchmark.test  thrpt   40  257'051.948 ± 4537.484  ops/s
>>
>> 1) Heavy-duty +308%
>> MyBenchmark.test  thrpt   40  1'049'235.602 ± 15501.803  ops/s
>>
>> 2) StringJoiner +190%
>> MyBenchmark.test  thrpt   40  746'000.629 ± 15387.036  ops/s
>>
>>
>> Personally, I like my first variant better, even though it adds 
>> almost 300 lines of code.
>> But I'd like to hear what people think of it.
>>
>> Sincerely yours,
>> Ivan
>>
>>
>




More information about the core-libs-dev mailing list