RFR 8222955 : Optimize String.replace(CharSequence, CharSequence) for Latin1 encoded strings

Mon May 6 19:42:06 UTC 2019

Hello everyone!

I'm seeking a (capital R) Reviewer to review/approve the fix, so I could 
go ahead and push it.

The latest webrev contains an additional testcase for OOM suggested by 
Tagir.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8222955
WEBREV: http://cr.openjdk.java.net/~igerasim/8222955/02/webrev/

Thanks in advance!

Ivan

On 4/30/19 10:20 AM, Ivan Gerasimov wrote:
> Hello Tagir!
>
>
> On 4/30/19 8:57 AM, Tagir Valeev wrote:
>> Hello!
>>
>> New webrev looks good. Probably it worth adding a testcase for
>> overflow (assert that OOME is thrown)? I guess, something like
>> "1".repeat(65536).replace("1", "1".repeat(65536)) would produce OOME
>> quite fast and without allocating much memory.
>
> Good point!  Will do.
>
> With kind regards,
> Ivan
>
>> With best regards,
>> Tagir Valeev.
>>
>> On Tue, Apr 30, 2019 at 11:43 AM Ivan Gerasimov
>> <ivan.gerasimov at oracle.com> wrote:
>>> Hello everyone!
>>>
>>> Please help review the second version of the enhancement.
>>> A separate branch was added to handle UTF16 in the searched string 
>>> and/or in the target and replacement.
>>>
>>> Switching to Math.xxxExact() suggested by Tagir gave ~4% of 
>>> throughput on affected test cases.
>>>
>>> Also, allocating uninitialized array added ~7% for Latin1 strings.
>>> I used StringConcatHelper.newArray() to avoid bringing Unsafe into 
>>> StringLatin1.
>>> In the StringLatin1.replace(), the newly allocated array is 
>>> guaranteed to be filled up, and the filling code should never throw, 
>>> so I believe using uninitialized arrays here is justified.
>>>
>>> Here's the updated webrev:
>>> http://cr.openjdk.java.net/~igerasim/8222955/01/webrev/
>>>
>>> Below please find the benchmark numbers (with an additional column, 
>>> showing the improvement.)
>>>
>>> Surprisingly, in one test case a slowdown was observed:  Only for 
>>> Latin1 string, with 1-char target, when the target string was *not* 
>>> found.
>>> If I understood it correctly, this is because prior the fix the 
>>> intrinsified StringLatin1.indexOf(byte[], byte[]) was called on the 
>>> first place, so there were effectively a fast path for exactly this 
>>> pattern.
>>>
>>> I'm not quite sure what to do about it.
>>> Maybe this is fine as it is, since the *average* improvement across 
>>> different test cases is still good?
>>>
>>>
>>>   - prior fix:
>>> Benchmark                          Mode  Cnt    Score   Error Units
>>> StringReplace.replace0x1_1_Latin1  avgt   15    7.116 ± 0.060 ns/op
>>> StringReplace.replace0x1_1_UTF16   avgt   15   84.255 ± 3.381 ns/op
>>> StringReplace.replace1x1_0_Latin1  avgt   15   63.254 ± 0.973 ns/op
>>> StringReplace.replace1x1_0_UTF16   avgt   15   89.941 ± 3.210 ns/op
>>> StringReplace.replace1x1_1_Latin1  avgt   15   75.662 ± 0.344 ns/op
>>> StringReplace.replace1x1_1_UTF16   avgt   15   81.454 ± 1.986 ns/op
>>> StringReplace.replace1x1_2_Latin1  avgt   15   89.492 ± 1.889 ns/op
>>> StringReplace.replace1x1_2_UTF16   avgt   15   87.430 ± 1.341 ns/op
>>> StringReplace.replace2x1_0_Latin1  avgt   15   69.575 ± 0.368 ns/op
>>> StringReplace.replace2x1_0_UTF16   avgt   15  112.201 ± 2.836 ns/op
>>> StringReplace.replace2x1_1_Latin1  avgt   15   83.841 ± 0.940 ns/op
>>> StringReplace.replace2x1_1_UTF16   avgt   15  115.722 ± 2.267 ns/op
>>> StringReplace.replace2x1_2_Latin1  avgt   15   99.266 ± 1.008 ns/op
>>> StringReplace.replace2x1_2_UTF16   avgt   15  132.271 ± 2.365 ns/op
>>>
>>>
>>>   - after fix:
>>> Benchmark                          Mode  Cnt   Score   Error Units
>>> StringReplace.replace0x1_1_Latin1  avgt   15  10.541 ± 0.826 ns/op   
>>> x0.68
>>> StringReplace.replace0x1_1_UTF16   avgt   15  31.473 ± 0.389 ns/op   
>>> x2.68
>>> StringReplace.replace1x1_0_Latin1  avgt   15  50.455 ± 5.038 ns/op   
>>> x1.25
>>> StringReplace.replace1x1_0_UTF16   avgt   15  35.355 ± 0.074 ns/op   
>>> x2.54
>>> StringReplace.replace1x1_1_Latin1  avgt   15  22.868 ± 0.076 ns/op   
>>> x3.31
>>> StringReplace.replace1x1_1_UTF16   avgt   15  22.290 ± 1.913 ns/op   
>>> x3.65
>>> StringReplace.replace1x1_2_Latin1  avgt   15  51.362 ± 0.130 ns/op   
>>> x1.74
>>> StringReplace.replace1x1_2_UTF16   avgt   15  33.086 ± 0.088 ns/op   
>>> x2.64
>>> StringReplace.replace2x1_0_Latin1  avgt   15  57.169 ± 7.165 ns/op   
>>> x1.22
>>> StringReplace.replace2x1_0_UTF16   avgt   15  50.886 ± 1.193 ns/op   
>>> x2.20
>>> StringReplace.replace2x1_1_Latin1  avgt   15  23.320 ± 2.954 ns/op   
>>> x3.60
>>> StringReplace.replace2x1_1_UTF16   avgt   15  24.741 ± 0.229 ns/op   
>>> x4.68
>>> StringReplace.replace2x1_2_Latin1  avgt   15  56.045 ± 0.153 ns/op   
>>> x1.77
>>> StringReplace.replace2x1_2_UTF16   avgt   15  49.795 ± 0.178 ns/op   
>>> x2.66
>>>
>>> -- 
>>> With kind regards,
>>> Ivan Gerasimov
>

-- 
With kind regards,
Ivan Gerasimov