RFR 8230365 : Pattern for a control-char matches non-control characters
Ivan Gerasimov
ivan.gerasimov at oracle.com
Thu Aug 29 23:39:35 UTC 2019
Hello!
In a regular expression pattern, a sequence of the form \\cx specifies
the control character that corresponds to the name character x.
The current implementation has a few issues with that:
1) It allows x to be just any character, including non-printable ones;
2) The produced regexp may match a non-control character;
3) The expression is case-sensitive, so, for example, \\cA differs from
\\ca, while both should produce ctrl-A.
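
To make issues 2) and 3) concrete, here is a small sketch. It assumes the parser derives the character by XOR-ing the code of x with 64; that detail is not stated in this message, and the class and method names are made up for illustration.

```java
// Illustration only: models a lenient \cx mapping as (x ^ 64).
// The XOR-with-64 rule is an assumption, not quoted from the actual sources.
final class LenientControlEscape {

    // Hypothetical helper: maps the name char x to the char that \cx would match.
    static int map(int x) {
        return x ^ 64;
    }

    public static void main(String[] args) {
        // 'A' is 0x41, so the result is 0x01 (ctrl-A), as intended.
        System.out.printf("\\cA -> U+%04X%n", map('A'));
        // 'a' is 0x61, so the result is 0x21, which is '!' and not a control char.
        System.out.printf("\\ca -> U+%04X%n", map('a'));
        // '1' is 0x31, so the result is 0x71, which is the letter 'q'.
        System.out.printf("\\c1 -> U+%04X%n", map('1'));
    }
}
```

Under that assumption, \\ca and \\c1 would silently match printable characters instead of being rejected or folded to a control character.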
It is proposed to make parsing stricter and reject invalid values of
x, and also to clarify the documentation to explicitly list valid values of x.
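
A stricter mapping could look roughly like the sketch below. The accepted set used here ('?', '@' through '_', and the letters 'a' through 'z' folded to upper case) is only an assumption based on the usual caret notation; the actual list is whatever the webrev and the CSR settle on. The class name and the exception message are hypothetical.

```java
// Sketch of a strict \cx mapping; not the code from the webrev.
final class StrictControlEscape {

    /**
     * Returns the control character named by x, or throws if x is not accepted.
     * Accepted set (an assumption): '?', '@'..'_', and 'a'..'z'.
     */
    static int map(int x) {
        if (x >= 'a' && x <= 'z') {
            x -= 'a' - 'A';           // fold case so that \ca equals \cA
        }
        if (x == '?') {
            return 0x7F;              // DEL, following the caret-notation convention
        }
        if (x >= '@' && x <= '_') {
            return x ^ 64;            // yields 0x00..0x1F
        }
        throw new IllegalArgumentException(
                "Illegal control escape character: U+"
                        + Integer.toHexString(x).toUpperCase());
    }

    public static void main(String[] args) {
        System.out.println(map('A') == map('a'));  // both name ctrl-A
        try {
            map('1');
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a mapping of this shape, every accepted x yields a character in the range 0x00..0x1F or 0x7F, and anything else fails at parse time rather than matching an unrelated character.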
If we agree on this proposal, then a CSR will probably need to be filed
to capture the changes in the regexp parsing.
Would you please help review the fix?
BUGURL: https://bugs.openjdk.java.net/browse/JDK-8230365
WEBREV: http://cr.openjdk.java.net/~igerasim/8230365/00/webrev/
--
With kind regards,
Ivan Gerasimov