Matcher.replaceAll(Function<MatchResult, String> f) [was: Re: hg: lambda/lambda/jdk: Pattern.splitAsStream.]
Paul Sandoz
paul.sandoz at oracle.com
Mon Apr 22 05:54:20 PDT 2013
Hi Jürgen,
Three issues:
- we should probably include replaceFirst
- we need to use different method names, since replaceAll(null) would now be ambiguous between the String and Function overloads
- need tests :-) (see jdk/test/java/util/regex/RegExTest.java)
While these are nice to have, I am not sure they carry their weight given the time constraints we have. The more complete a solution you can help us provide, the better our chances of getting this into JDK 8.
Thanks,
Paul.
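
To illustrate the second issue above, here is a minimal sketch. OverloadSketch and its two static methods are hypothetical stand-ins that only mimic the existing replaceAll(String) signature and the proposed replaceAll(Function<MatchResult, String>) signature; they are not the real java.util.regex.Matcher. With both overloads present, a bare null fits either parameter type and neither is more specific, so the call is rejected at compile time unless the caller adds a cast.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;

// Hypothetical stand-in, for illustration only: it mimics the two
// overload signatures and is not the real java.util.regex.Matcher.
final class OverloadSketch {
    static String replaceAll(String replacement) {
        return "String overload";
    }

    static String replaceAll(Function<MatchResult, String> f) {
        return "Function overload";
    }

    public static void main(String[] args) {
        // replaceAll(null);  // does not compile: the reference is ambiguous
        // A cast selects one overload explicitly.
        System.out.println(replaceAll((String) null));
        System.out.println(replaceAll((Function<MatchResult, String>) null));
        // A lambda is only compatible with the Function overload.
        System.out.println(replaceAll(m -> m.group()));
    }
}
```

Giving the function-taking variants their own names would avoid the need for such casts in existing code that passes null.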
On Apr 19, 2013, at 12:59 AM, jk at blackdown.de wrote:
> Hi Paul,
>
> Paul Sandoz <paul.sandoz at oracle.com> writes:
>
>> Hi Jürgen,
>>
>> That seems useful as a more general approach than Matcher.replaceAll(String ) e.g.
>>
>> Matcher.replaceAll(Function<MatchResult, String> f)
>>
>> Ben, thoughts?
>
> like this?
>
> # HG changeset patch
> # User Jürgen Kreileder <jk at blackdown.de>
> # Date 1366322703 -7200
> # Node ID 59766f458701af5fbb23d195dd48a928505f3306
> # Parent 3ec06ef568a8ded0a7ecc7624df9d3a025dad6bc
> Matcher.replaceAll(Function<MatchResult, String> f)
>
> diff --git a/src/share/classes/java/util/regex/Matcher.java b/src/share/classes/java/util/regex/Matcher.java
> --- a/src/share/classes/java/util/regex/Matcher.java
> +++ b/src/share/classes/java/util/regex/Matcher.java
> @@ -25,6 +25,7 @@
>
> package java.util.regex;
>
> +import java.util.function.Function;
>
> /**
> * An engine that performs match operations on a {@link java.lang.CharSequence
> @@ -916,6 +917,54 @@
> }
>
> /**
> + * Replaces every subsequence of the input sequence that matches the
> + * pattern with the string returned by the given replacement function.
> + *
> + * <p> This method first resets this matcher. It then scans the input
> + * sequence looking for matches of the pattern. Characters that are not
> + * part of any match are appended directly to the result string; each match
> + * is replaced in the result by the string returned by the replacement
> + * function. The replacement strings may contain references to captured
> + * subsequences as in the {@link #appendReplacement appendReplacement}
> + * method.
> + *
> + * <p> Note that backslashes (<tt>\</tt>) and dollar signs (<tt>$</tt>) in
> + * the string returned by the replacement function may cause the results to
> + * be different than if they were being treated as literal strings. Dollar
> + * signs may be treated as references to captured subsequences as described
> + * above, and backslashes are used to escape literal characters in the
> + * replacement string.
> + *
> + * <p> Given the regular expression <tt>(\\w)(\\w*)</tt>, the input
> + * <tt>"paTTern maTcher"</tt>, and the replacement function
> + * <tt>m -> m.group(1).toUpperCase() + m.group(2).toLowerCase()</tt>, an
> + * invocation of this method on a matcher for that expression would yield
> + * the string <tt>"Pattern Matcher"</tt>. </p>
> + *
> + * <p> Invoking this method changes this matcher's state. If the matcher
> + * is to be used in further matching operations then it should first be
> + * reset. </p>
> + *
> + * @param f
> + * The function providing replacement strings
> + * @return The string constructed by replacing each matching subsequence
> + * by the replacement string provided by the given function,
> + * substituting captured subsequences as needed
> + * @since 1.8
> + */
> +    public String replaceAll(Function<MatchResult, String> f) {
> +        reset();
> +        if (find()) {
> +            StringBuffer sb = new StringBuffer();
> +            do {
> +                appendReplacement(sb, f.apply(this));
> +            } while (find());
> +            return appendTail(sb).toString();
> +        }
> +        return text.toString();
> +    }
> +
> + /**
> * Replaces the first subsequence of the input sequence that matches the
> * pattern with the given replacement string.
> *
> ==
>
>
> Juergen
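
The example from the Javadoc in the patch can be tried without patching Matcher. The sketch below assumes a hypothetical static helper, ReplaceAllSketch.replaceAll, which mirrors the find/appendReplacement/appendTail loop of the patch using only existing Matcher methods. It is not the proposed JDK method itself.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors the loop in the patch; it is not part of the JDK.
final class ReplaceAllSketch {
    static String replaceAll(Pattern p, CharSequence input,
                             Function<MatchResult, String> f) {
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Matcher implements MatchResult, so it can be passed to f directly.
            m.appendReplacement(sb, f.apply(m));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\w)(\\w*)");
        String out = replaceAll(p, "paTTern maTcher",
                m -> m.group(1).toUpperCase() + m.group(2).toLowerCase());
        System.out.println(out); // prints "Pattern Matcher"
    }
}
```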
>
>
>> On Apr 8, 2013, at 6:59 PM, jk at blackdown.de wrote:
>>
>>> Hi Paul,
>>>
>>> it would be nice if Pattern/Matcher offered a terse way to loop over all
>>> matches in a string and replace them via a callback.
>>>
>>> E.g. I'm currently using something like this:
>>>
>>> private static final PatternAndReplacement PASS2 = new PatternAndReplacement(
>>> Pattern.compile(" ( "
>>> + " \\A \\p{Punct}*" // start of title…
>>> + " |"
>>> + " [:.;?!]\\ +" // or of subsentence…
>>> + " | "
>>> + " \\ ['\"“‘(\\[] \\ *" // or of inserted subphrase…
>>> + ")"
>>> + "(" + SMALL_WORDS + ") \\b", // … followed by small word
>>> Pattern.COMMENTS | Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CHARACTER_CLASS),
>>> m -> Matcher.quoteReplacement(m.group(1) + capitalize(m.group(2))));
>>>
>>> with PatternAndReplacement being
>>>
>>> private static class PatternAndReplacement implements Function<String, String> {
>>>     private final Pattern pattern;
>>>     private final Function<MatchResult, String> function;
>>>
>>>     PatternAndReplacement(final Pattern p, final Function<MatchResult, String> f) {
>>>         pattern = p;
>>>         function = f;
>>>     }
>>>
>>>     @Override
>>>     public final String apply(final String s) {
>>>         Matcher m = pattern.matcher(s);
>>>         if (m.find()) {
>>>             StringBuffer sb = new StringBuffer(s.length());
>>>             do {
>>>                 m.appendReplacement(sb, function.apply(m));
>>>             } while (m.find());
>>>             return m.appendTail(sb).toString();
>>>         }
>>>         return s;
>>>     }
>>> }
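
The callback above wraps its result in Matcher.quoteReplacement because appendReplacement interprets dollar signs and backslashes in the replacement string. The sketch below shows the difference. QuoteSketch and its replaceEach helper are hypothetical names used only for this illustration.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of why the callback result may need quoting.
final class QuoteSketch {
    static String replaceEach(Pattern p, CharSequence input,
                              Function<MatchResult, String> f) {
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // appendReplacement treats "$n" as a group reference and "\" as an escape.
            m.appendReplacement(sb, f.apply(m));
        }
        return m.appendTail(sb).toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\d+)");
        // Unquoted: "$1" is expanded to the text of group 1.
        System.out.println(replaceEach(p, "cost 5", m -> "$1.00"));
        // prints "cost 5.00"
        // Quoted: the dollar sign is kept as a literal character.
        System.out.println(replaceEach(p, "cost 5",
                m -> Matcher.quoteReplacement("$1.00")));
        // prints "cost $1.00"
    }
}
```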
>>>
>>> Any plans for something like this?
>>>
>>>
>>> Jürgen
>>>
>>>
>>> paul.sandoz at oracle.com writes:
>>>
>>>> Changeset: 526131346981
>>>> Author: psandoz
>>>> Date: 2013-04-08 17:16 +0200
>>>> URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/526131346981
>>>>
>>>> Pattern.splitAsStream.
>>>> Contributed-by: Ben Evans <benjamin.john.evans at gmail.com>
>>>>
>>>> ! src/share/classes/java/util/regex/Pattern.java
>>>> + test-ng/tests/org/openjdk/tests/java/util/regex/PatternTest.java
>>
>
> --
> https://blackdown.de/