[PATCH] regex matcher opt: remove redundant StringBuilder
Isaac Levy
isaac.r.levy at gmail.com
Wed Jul 18 14:20:49 UTC 2018
I still think it would be valuable to land a patch for replaceAll to
avoid temporary StringBuilders, is there anyone who wants to help me
land it?
Isaac
On Sun, Apr 29, 2018 at 10:24 PM, Isaac Levy <isaac.r.levy at gmail.com> wrote:
> Your patch looks good to me, and will get the majority of performance
> benefits without any API issues.
>
> Isaac
>
>
> On Tue, Apr 24, 2018 at 9:38 PM, Xueming Shen <xueming.shen at oracle.com> wrote:
>> for String based replaceAll/First() it might be worth doing something like
>>
>> http://cr.openjdk.java.net/~sherman/regex_removesb/webrev/
>>
>>
>> On 4/24/18, 10:47 AM, Isaac Levy wrote:
>>>
>>> Hi Sherman,
>>>
>>> Thanks for clarifying. Looks like exceptions are caused by invalid
>>> format string. I wouldn't expect most programs to be catching this and
>>> preserving their buffer, but dunno.
>>>
>>> How much does it affect perf? Well it depends on use case, a jmh of
>>> replaceAll with a length 200 string of digits and regex "(\w)" shows
>>> about 30% speedup.
>>>
>>> [info] Benchmark Mode Cnt Score Error Units
>>> [info] origM avgt 10 11.669 ± 0.211 us/op
>>> [info] newM avgt 10 8.926 ± 0.095 us/op
>>>
>>> Isaac
>>>
>>>
>>> On Tue, Apr 24, 2018 at 12:53 PM, Xueming Shen<xueming.shen at oracle.com>
>>> wrote:
>>>>
>>>> Hi Isaac,
>>>>
>>>> I actually meant to say "we are not supposed to output the partial text
>>>> into
>>>> the output buffer in case of an exception". It has nothing to do with the
>>>> changeset you cited. This has been the behavior since day one/JDK1.4,
>>>> though it is not specified explicitly in the API doc. The newly added
>>>> StringBuilder variant simply follows this behavior. If it's really
>>>> desired
>>>> it
>>>> is kinda doable to save that StringBuilder without the "incompatible"
>>>> behavior
>>>> change but just wonder if it is really worth the effort.
>>>>
>>>> Thanks,
>>>> Sherman
>>>>
>>>>
>>>> On 4/24/18, 9:11 AM, Isaac Levy wrote:
>>>>>
>>>>> (moving this to a separate discussion)
>>>>>
>>>>>
>>>>> --- a/src/java.base/share/classes/java/util/regex/Matcher.java
>>>>> +++ b/src/java.base/share/classes/java/util/regex/Matcher.java
>>>>> @@ -993,13 +993,11 @@
>>>>> public Matcher appendReplacement(StringBuilder sb, String
>>>>> replacement) {
>>>>> // If no match, return error
>>>>> if (first< 0)
>>>>> throw new IllegalStateException("No match available");
>>>>> - StringBuilder result = new StringBuilder();
>>>>> - appendExpandedReplacement(replacement, result);
>>>>> // Append the intervening text
>>>>> sb.append(text, lastAppendPosition, first);
>>>>> // Append the match substitution
>>>>> + appendExpandedReplacement(replacement, sb);
>>>>> - sb.append(result);
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Apr 23, 2018 at 5:05 PM Xueming Shen<xueming.shen at oracle.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> I would assume in case of an exception thrown from
>>>>>> appendExpandedReplacement() we don't
>>>>>> want "text" to be pushed into the "sb".
>>>>>>
>>>>>> -sherman
>>>>>
>>>>>
>>>>> Perhaps. Though the behavior under exception is undefined and this
>>>>> function is probably primarily used though the replaceAll API, which
>>>>> wouldn’t return the intermediate sb under the case of exception.
>>>>>
>>>>> My reading of the blame was the temp StringBuilder was an artifact of
>>>>> the api previously using StringBuffer externally. See
>>>>> http://hg.openjdk.java.net/jdk9/dev/jdk/rev/763c564451b3
>>>>
>>>>
>>
More information about the core-libs-dev
mailing list