StringBuilder version of java.util.regex.Matcher.append*

Jeremy Manson jeremymanson at google.com
Tue Apr 8 17:47:15 UTC 2014


This looks fine to me, Sherman.  Thanks for the hard work!

Jeremy


On Mon, Apr 7, 2014 at 10:00 AM, Xueming Shen <xueming.shen at oracle.com>wrote:

>  On 04/04/2014 10:08 AM, Xueming Shen wrote:
>
> On 4/3/14 4:43 PM, Jeremy Manson wrote:
>
> Good catch, thanks.
>
>  I think we should probably just go with the (equivalent to the)
> StringBuffer variant.  I'm pretty loathe to modify the StringBuilder
> directly if we are going to back that change out.
>
>  Do you want me to generate a new patch?
>
>
> I can/will send out an updated webrev before push.
>
>
> the latest webrev.
>
> http://cr.openjdk.java.net/~sherman/8039124/webrev
>
> -Sherman
>
>
>
> -Sherman
>
>
>  Jeremy
>
>
> On Thu, Apr 3, 2014 at 2:27 PM, Xueming Shen <xueming.shen at oracle.com>wrote:
>
>>  On 03/25/2014 02:07 PM, Jeremy Manson wrote:
>>
>> Okay.  Thanks, Sherman.  Here's an updated version.
>>
>>  I've diverged a bit from Peter's version.  In this version,
>> appendExpandedReplacement takes a StringBuilder.  The implications is that
>> In the StringBuilder case, it saves creating a new StringBuilder object.
>>  In the StringBuffer case, it creates a new StringBuilder, but it doesn't
>> have to acquire and release all of those locks.
>>
>>
>>  Hi Jeremy,
>>
>> It appears the "optimized" StringBuilder version will cause following
>> test case failure,
>> in which the "xyz" will be copied into the result buffer, even when the
>> replacement
>> string triggers a IAE.
>>
>>         // Check nothing has been appended into the output buffer if
>>         // the replacement string triggers IllegalArgumentException.
>>         Pattern p = Pattern.compile("(abc)");
>>         Matcher m = p.matcher("abcd");
>>         StringBuilder result = new StringBuilder();
>>         try {
>>             m.appendReplacement(result, ("xyz$g"));
>>         } catch (IllegalArgumentException iae) {
>>             if (result.length() != 0)
>>                 System.err.println(" FAILED");
>>
>>         }
>>
>> We may have to either catch the IAE and reset the sb, or create
>> a new sb, as the StringBuffer does.
>>
>> -Sherman
>>
>>
>>
>>
>>  I also noticed a redundant cast to (int), which I removed.
>>
>>  Jeremy
>>
>>
>> On Fri, Mar 21, 2014 at 7:34 PM, Xueming Shen <xueming.shen at oracle.com>wrote:
>>
>>  let's add the StringBuilder method(s), if you can provide an updated
>> version, I can run the rest (since it's
>> to add new api, there is an internal CCC process need to go through).
>>
>> -Sherman
>>
>>
>> On 3/21/14 5:18 PM, Jeremy Manson wrote:
>>
>> So, this is all a little opaque to me.  How do we make the go/no-go
>> decision on something like this?  Everyone who has chimed in seems to think
>> it is a good idea.
>>
>>  Jeremy
>>
>>
>> On Thu, Mar 20, 2014 at 10:38 AM, Jeremy Manson <jeremymanson at google.com>wrote:
>>
>> Sherman,
>>
>>  If you had released it then (which you wouldn't have been able to do,
>> because you would have to wait another two years for Java 7), you would
>> have found that it improved performance even with C2.  It is only
>> post-escape-analysis that the performance in C2 equalized.
>>
>>  Anyway, I think adding the StringBuilder variant and deferring /
>> dealing with the Appendable differently is the right approach, FWIW.
>>
>>  Jeremy
>>
>>
>> On Thu, Mar 20, 2014 at 10:25 AM, Xueming Shen <xueming.shen at oracle.com>wrote:
>>
>> 2009? I do have something similar back to 2009 :-)
>>
>> http://cr.openjdk.java.net/~sherman/regex_replace/webrev/
>>
>> Then the ball was dropped around the discussion of whether or not
>> the IOE should be thrown.
>>
>> But if we are going to/have to have explicit StringBuilder/Buffer pair
>> anyway, then we can keep the Appendable version as private for now
>> and deal with the StringBuilder and Appendable as two separate
>> issues.
>>
>> -Sherman
>>
>>
>> On 03/20/2014 09:52 AM, Jeremy Manson wrote:
>>
>> That's definitely an improvement - I think that when I wrote this (circa
>> 2009), I didn't think about Appendable.
>>
>> I take it my argument convinced someone?  :)
>>
>> Jeremy
>>
>>
>> On Thu, Mar 20, 2014 at 1:32 AM, Peter Levart<peter.levart at gmail.com
>> >wrote:
>>
>>  On 03/19/2014 06:51 PM, Jeremy Manson wrote:
>>
>>  I'm told that the diff didn't make it.  I've put it in a Google drive
>> folder...
>>
>> https://drive.google.com/file/d/0B_GaXa6O4K5LY3Y0aHpranM3aEU/
>> edit?usp=sharing
>>
>> Jeremy
>>
>>  Hi Jeremy,
>>
>> Your factoring-out of expandReplacement() method exposed an opportunity to
>> further optimize the code. Instead of creating intermediate StringBuilder
>> instance for each expandReplacement() call, this method could append
>> directly to resulting StringBuffer/StringBuilder, like in the following:
>>
>> http://cr.openjdk.java.net/~plevart/jdk9-dev/MatcherWithStringBuilder/
>> webrev.01/
>>
>>
>> Regards, Peter
>>
>>
>>
>>  On Wed, Mar 19, 2014 at 10:41 AM, Jeremy Manson<jeremymanson at google.com>
>> wrote:
>>
>>   Hi folks,
>>
>> We've had this internally for a while, and I keep meaning to bring it up
>> here.  The Matcher class has a few public methods that take
>> StringBuffers,
>> and we've found it useful to add similar versions that take
>> StringBuilders.
>>
>> It has two benefits:
>>
>> - Users don't have to convert from one to the other when they want to use
>> the method in question.  The symmetry is nice.
>>
>> - The StringBuilder variants are faster (if lock optimizations don't kick
>> in, which happens in the interpreter and the client compiler).  For
>> interpreted / client-compiled code, we saw something like a 25% speedup
>> on
>> String.replaceAll(), which calls into this code.
>>
>> Any interest?  Diff attached.
>>
>> Jeremy
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
>



More information about the core-libs-dev mailing list