StringBuilder version of java.util.regex.Matcher.append*

Xueming Shen xueming.shen at oracle.com
Fri Apr 4 17:08:18 UTC 2014


On 4/3/14 4:43 PM, Jeremy Manson wrote:
> Good catch, thanks.
>
> I think we should probably just go with the (equivalent to the) 
> StringBuffer variant.  I'm pretty loathe to modify the StringBuilder 
> directly if we are going to back that change out.
>
> Do you want me to generate a new patch?

I can/will send out an updated webrev before push.

-Sherman

>
> Jeremy
>
>
> On Thu, Apr 3, 2014 at 2:27 PM, Xueming Shen <xueming.shen at oracle.com 
> <mailto:xueming.shen at oracle.com>> wrote:
>
>     On 03/25/2014 02:07 PM, Jeremy Manson wrote:
>>     Okay.  Thanks, Sherman.  Here's an updated version.
>>
>>     I've diverged a bit from Peter's version.  In this version,
>>     appendExpandedReplacement takes a StringBuilder.  The
>>     implications is that In the StringBuilder case, it saves creating
>>     a new StringBuilder object.  In the StringBuffer case, it creates
>>     a new StringBuilder, but it doesn't have to acquire and release
>>     all of those locks.
>
>     Hi Jeremy,
>
>     It appears the "optimized" StringBuilder version will cause
>     following test case failure,
>     in which the "xyz" will be copied into the result buffer, even
>     when the replacement
>     string triggers a IAE.
>
>             // Check nothing has been appended into the output buffer if
>             // the replacement string triggers IllegalArgumentException.
>             Pattern p = Pattern.compile("(abc)");
>             Matcher m = p.matcher("abcd");
>             StringBuilder result = new StringBuilder();
>             try {
>                 m.appendReplacement(result, ("xyz$g"));
>             } catch (IllegalArgumentException iae) {
>                 if (result.length() != 0)
>                     System.err.println(" FAILED");
>
>             }
>
>     We may have to either catch the IAE and reset the sb, or create
>     a new sb, as the StringBuffer does.
>
>     -Sherman
>
>
>
>>
>>     I also noticed a redundant cast to (int), which I removed.
>>
>>     Jeremy
>>
>>
>>     On Fri, Mar 21, 2014 at 7:34 PM, Xueming Shen
>>     <xueming.shen at oracle.com <mailto:xueming.shen at oracle.com>> wrote:
>>
>>         let's add the StringBuilder method(s), if you can provide an
>>         updated version, I can run the rest (since it's
>>         to add new api, there is an internal CCC process need to go
>>         through).
>>
>>         -Sherman
>>
>>
>>         On 3/21/14 5:18 PM, Jeremy Manson wrote:
>>>         So, this is all a little opaque to me.  How do we make the
>>>         go/no-go decision on something like this?  Everyone who has
>>>         chimed in seems to think it is a good idea.
>>>
>>>         Jeremy
>>>
>>>
>>>         On Thu, Mar 20, 2014 at 10:38 AM, Jeremy Manson
>>>         <jeremymanson at google.com <mailto:jeremymanson at google.com>>
>>>         wrote:
>>>
>>>             Sherman,
>>>
>>>             If you had released it then (which you wouldn't have
>>>             been able to do, because you would have to wait another
>>>             two years for Java 7), you would have found that it
>>>             improved performance even with C2.  It is only
>>>             post-escape-analysis that the performance in C2 equalized.
>>>
>>>             Anyway, I think adding the StringBuilder variant and
>>>             deferring / dealing with the Appendable differently is
>>>             the right approach, FWIW.
>>>
>>>             Jeremy
>>>
>>>
>>>             On Thu, Mar 20, 2014 at 10:25 AM, Xueming Shen
>>>             <xueming.shen at oracle.com
>>>             <mailto:xueming.shen at oracle.com>> wrote:
>>>
>>>                 2009? I do have something similar back to 2009 :-)
>>>
>>>                 http://cr.openjdk.java.net/~sherman/regex_replace/webrev/
>>>                 <http://cr.openjdk.java.net/%7Esherman/regex_replace/webrev/>
>>>
>>>                 Then the ball was dropped around the discussion of
>>>                 whether or not
>>>                 the IOE should be thrown.
>>>
>>>                 But if we are going to/have to have explicit
>>>                 StringBuilder/Buffer pair
>>>                 anyway, then we can keep the Appendable version as
>>>                 private for now
>>>                 and deal with the StringBuilder and Appendable as
>>>                 two separate
>>>                 issues.
>>>
>>>                 -Sherman
>>>
>>>
>>>                 On 03/20/2014 09:52 AM, Jeremy Manson wrote:
>>>
>>>                     That's definitely an improvement - I think that
>>>                     when I wrote this (circa
>>>                     2009), I didn't think about Appendable.
>>>
>>>                     I take it my argument convinced someone?  :)
>>>
>>>                     Jeremy
>>>
>>>
>>>                     On Thu, Mar 20, 2014 at 1:32 AM, Peter
>>>                     Levart<peter.levart at gmail.com
>>>                     <mailto:peter.levart at gmail.com>>wrote:
>>>
>>>                         On 03/19/2014 06:51 PM, Jeremy Manson wrote:
>>>
>>>                             I'm told that the diff didn't make it.
>>>                              I've put it in a Google drive
>>>                             folder...
>>>
>>>                             https://drive.google.com/file/d/0B_GaXa6O4K5LY3Y0aHpranM3aEU/
>>>                             edit?usp=sharing
>>>
>>>                             Jeremy
>>>
>>>                         Hi Jeremy,
>>>
>>>                         Your factoring-out of expandReplacement()
>>>                         method exposed an opportunity to
>>>                         further optimize the code. Instead of
>>>                         creating intermediate StringBuilder
>>>                         instance for each expandReplacement() call,
>>>                         this method could append
>>>                         directly to resulting
>>>                         StringBuffer/StringBuilder, like in the
>>>                         following:
>>>
>>>                         http://cr.openjdk.java.net/~plevart/jdk9-dev/MatcherWithStringBuilder/
>>>                         <http://cr.openjdk.java.net/%7Eplevart/jdk9-dev/MatcherWithStringBuilder/>
>>>                         webrev.01/
>>>
>>>
>>>                         Regards, Peter
>>>
>>>
>>>
>>>                             On Wed, Mar 19, 2014 at 10:41 AM, Jeremy
>>>                             Manson<jeremymanson at google.com
>>>                             <mailto:jeremymanson at google.com>>
>>>                             wrote:
>>>
>>>                               Hi folks,
>>>
>>>                                 We've had this internally for a
>>>                                 while, and I keep meaning to bring it up
>>>                                 here.  The Matcher class has a few
>>>                                 public methods that take
>>>                                 StringBuffers,
>>>                                 and we've found it useful to add
>>>                                 similar versions that take
>>>                                 StringBuilders.
>>>
>>>                                 It has two benefits:
>>>
>>>                                 - Users don't have to convert from
>>>                                 one to the other when they want to use
>>>                                 the method in question.  The
>>>                                 symmetry is nice.
>>>
>>>                                 - The StringBuilder variants are
>>>                                 faster (if lock optimizations don't kick
>>>                                 in, which happens in the interpreter
>>>                                 and the client compiler).  For
>>>                                 interpreted / client-compiled code,
>>>                                 we saw something like a 25% speedup
>>>                                 on
>>>                                 String.replaceAll(), which calls
>>>                                 into this code.
>>>
>>>                                 Any interest?  Diff attached.
>>>
>>>                                 Jeremy
>>>
>>>
>>>
>>>
>>>
>>
>>
>
>




More information about the core-libs-dev mailing list