Regex named-group and backreference syntax

Mark Reinhold mr at sun.com
Wed Sep 2 16:43:40 UTC 2009


> Date: Wed, 02 Sep 2009 01:58:46 -0700
> From: uncle.alice at gmail.com

> On Wed, Sep 2, 2009 at 12:15 AM, Xueming Shen<Xueming.Shen at sun.com> wrote:
>> It would be an "ambiguity" (and thus confusing) only if we
>> had \k<n> and $<n> as legally supported group reference
>> syntax :-) That said, I have to admit that there is no
>> value added by allowing a group name to begin with a digit
>> character. So if we have a consensus I would be happy to
>> change the spec/implementation to disallow group names
>> that start with a digit.

That seems reasonable to me.
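
To make the restriction concrete, here is a minimal sketch. It assumes
the rule as it later shipped in java.util.regex (Java 7 and later), where
a group name must start with a letter. The class and method names are
hypothetical and used only for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class GroupNameCheck {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Group name starting with a letter: accepted.
        System.out.println(compiles("(?<year>\\d{4})"));
        // Group name starting with a digit: rejected under the assumed rule.
        System.out.println(compiles("(?<1year>\\d{4})"));
    }
}
```

With digit-first names ruled out, a reference such as \k<1> could never
be mistaken for a name that happens to look like a group number.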

> ...
> 
> Anyway, I think consistency with other flavors is more important than
> internal consistency in this case.  Every flavor uses angle brackets
> within the regex, but Perl uses $+{name} in the replacement string,
> while .NET and JRegex both use ${name}.  I think anyone who comes over
> to java.util.regex with previous regex experience is more likely to
> expect that syntax than anything else.

I think this line of reasoning has merit, though I'm not sure that JRegex
is an interesting data point any more.

Python uses yet a third replacement-string syntax, \g<name>, so that's a
data point in favor of angle brackets.  The \g prefix, though, wouldn't
fit that well with the rest of the existing Java replacement syntax.

I'd say ${name} is the best option here given that it will be familiar,
at least, to Perl and .NET developers.
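
As a rough illustration of how the two syntaxes would sit together,
here is a small sketch. It assumes angle brackets inside the regex
((?<name>...) and \k<name>) and ${name} in the replacement string, which
is the form java.util.regex supports from Java 7 onward. The class name,
method names, and sample patterns are hypothetical.

```java
import java.util.regex.Pattern;

final class NamedGroupReplace {
    // Named groups are declared with angle brackets inside the regex.
    private static final Pattern ISO_DATE =
        Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // \k<word> refers back to the named group within the regex itself.
    private static final Pattern DOUBLED_WORD =
        Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // Rewrites yyyy-mm-dd as dd/mm/yyyy using ${name} in the replacement.
    static String reorder(String input) {
        return ISO_DATE.matcher(input).replaceAll("${day}/${month}/${year}");
    }

    // Reports whether the input contains the same word twice in a row.
    static boolean hasDoubledWord(String input) {
        return DOUBLED_WORD.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(reorder("Sent on 2009-09-02"));
        System.out.println(hasDoubledWord("allow the the group name"));
    }
}
```

The existing $1-style numbered references keep working alongside
${name}, so the braces are what distinguish a name from a number.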

- Mark



More information about the core-libs-dev mailing list