8058779: Faster implementation of String.replace(CharSequence, CharSequence)
Peter Levart
peter.levart at gmail.com
Wed May 27 13:15:26 UTC 2015
Hi Ivan,
How does this implementation compare to your's two regarding speed:
http://cr.openjdk.java.net/~plevart/jdk9-dev/String.replace/webrev.01/
It is not much shorter than your's first, but more object oriented ;-)
Regards, Peter
On 05/27/2015 11:36 AM, Ivan Gerasimov wrote:
> Hi Sherman!
>
> Please take a look at my other webrev, that I provided in the first
> message.
> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
>
> I used StringJoiner there, which in some aspects seems to fit better
> here, comparing to StringBuilder.
> It does reduce the code, but of course runs slower.
> In that version every part of the source string had to be converted to
> a separate String, which add another redundant copying.
>
> I still prefer the version, which deals with arrays. I don't think
> it's too complex.
> It surely adds some complexity to the original code, but it's only 70
> lines of code in total.
> Everything else are comments and empty lines.
>
> Sincerely yours,
> Ivan
>
> On 27.05.2015 2:38, Xueming Shen wrote:
>> Ivan,
>>
>> It might be worth trying String.index + StringBuilder, instead of
>> writing/handling everything yourself.
>> Yes, it inevitably adds an arraycopy at the end to convert the
>> StrinbBuilder to String, but it might have
>> better balance between performance and code complexity. The regex is
>> probably a little heavy for
>> literal string replacement, but StringBuilder should not be that bad ...
>>
>> -Sherman
>>
>> On 5/26/15 4:11 PM, Ivan Gerasimov wrote:
>>> I updated the webrev:
>>> http://cr.openjdk.java.net/~igerasim/8058779/02/webrev/
>>>
>>> In the check at 2300-2301 and 2351-2352 I replaced MAX_ARRAY_SIZE
>>> with Integer.MAX_VALUE, which seems to be more accurate here.
>>>
>>> And I want to add that this proposed implementation is not only
>>> faster, but also more memory efficient.
>>> The following simple stress-test shows that the proposed version is
>>> able to handle twice larger strings, comparing to the current
>>> implementation.
>>>
>>> ----------------------------------------
>>> public class C {
>>> public static void main(String[] args) throws Throwable {
>>> String s = "string";
>>> for (int i = 1; i < Integer.MAX_VALUE; ++i) {
>>> try {
>>> s = s.replace("string", "stringstring");
>>> } catch (OutOfMemoryError o) {
>>> System.out.println(i + ") " + s.length());
>>> break;
>>> }
>>> }
>>> }
>>> }
>>> ----------------------------------------
>>>
>>> $ time ~/java9/jdk/bin/java -showversion -Xmx1g C
>>> java version "1.9.0-ea"
>>> Java(TM) SE Runtime Environment (build 1.9.0-ea-b63)
>>> Java HotSpot(TM) 64-Bit Server VM (build 1.9.0-ea-b63, mixed mode)
>>>
>>> 25) 100663296
>>>
>>> real 0m4.525s
>>> user 0m4.402s
>>> sys 0m1.189s
>>>
>>> $ time ~/java9/jdk-build/bin/java -showversion -Xmx1g C
>>> java version "1.9.0-internal"
>>> Java(TM) SE Runtime Environment (build
>>> 1.9.0-internal-igerasim_2015_05_23_19_25-b00)
>>> Java HotSpot(TM) 64-Bit Server VM (build
>>> 1.9.0-internal-igerasim_2015_05_23_19_25-b00, mixed mode)
>>>
>>> 26) 201326592
>>>
>>> real 0m2.139s
>>> user 0m1.960s
>>> sys 0m0.461s
>>>
>>> Sincerely yours,
>>> Ivan
>>>
>>> On 24.05.2015 23:17, Ivan Gerasimov wrote:
>>>> Hello everybody!
>>>>
>>>> I know many people here like it when the performance is getting
>>>> better.
>>>> It was suggested to make the literal variant of String.replace()
>>>> faster.
>>>> Currently, this method is implemented as a few calls to regexp API,
>>>> so that the whole implementation takes only two lines of code.
>>>>
>>>> I've created two versions of the fix.
>>>> In the first one, we scan the string and store indices of the found
>>>> substrings in an array.
>>>> Then, we allocate the precisely sized char array and fill it it.
>>>> The case with the empty target has to be handled separately.
>>>>
>>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8058779
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/00/webrev/
>>>>
>>>> The second variant is much less verbose, however it's less
>>>> efficient too.
>>>> Here the StringJoiner is used as an intermediate storage.
>>>>
>>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8058779/01/webrev/
>>>>
>>>>
>>>> Here are the micro-benchmark results (in a string of ~300 chars do
>>>> ~15 replacements).
>>>> 0) Baseline
>>>> MyBenchmark.test thrpt 40 257'051.948 ± 4537.484 ops/s
>>>>
>>>> 1) Heavy-duty +308%
>>>> MyBenchmark.test thrpt 40 1'049'235.602 ± 15501.803 ops/s
>>>>
>>>> 2) StringJoiner +190%
>>>> MyBenchmark.test thrpt 40 746'000.629 ± 15387.036 ops/s
>>>>
>>>>
>>>> Personally, I like my first variant better, even though it adds
>>>> almost 300 lines of code.
>>>> But I'd like to hear what people think of it.
>>>>
>>>> Sincerely yours,
>>>> Ivan
>>>>
>>>>
>>>
>>
>>
>>
>
More information about the core-libs-dev
mailing list