RFR [S] 8131034: Cleanup in j.u.regex.Pattern.quote()

Ivan Gerasimov ivan.gerasimov at oracle.com
Mon Jul 13 23:51:26 UTC 2015


Thanks Martin!

On 13.07.2015 21:26, Martin Buchholz wrote:
> I agree that normalizedD looks like a typo (but it probably was not), 
> but "normalized" is too ambiguous (there are too many different kinds 
> of normalization) so a precise term like NFD should be used.  Maybe we 
> should add a reference to http://unicode.org/reports/tr15/
>
Okay, what if we have

      /**
-     * The pattern is converted to normalized form and then a pure group
-     * is constructed to match canonical equivalences of the characters.
+     * The pattern is converted to normalized form (NFD, canonical
+     * decomposition) and then a pure group is constructed to match
+     * canonical equivalences of the characters.
+     *
+     * @see java.text.Normalizer.Form.NFD
        */

We've already got a link to 
http://www.unicode.org/unicode/reports/tr15/tr15-23.html  from the doc 
for java.text.Normalizer.NFD:
http://docs.oracle.com/javase/8/docs/api/java/text/Normalizer.Form.html

I'd rather not duplicate the reference in that comment.

Sincerely yours,
Ivan


> On Mon, Jul 13, 2015 at 4:29 AM, Ivan Gerasimov 
> <ivan.gerasimov at oracle.com <mailto:ivan.gerasimov at oracle.com>> wrote:
>
>     Thank you Paul for your review!
>
>     On 13.07.2015 12:57, Paul Sandoz wrote:
>
>         On Jul 12, 2015, at 6:32 PM, Ivan Gerasimov
>         <ivan.gerasimov at oracle.com <mailto:ivan.gerasimov at oracle.com>>
>         wrote:
>
>             Hello!
>
>             There's a minor issue in the current implementation of
>             Pattern.quote() with possible numeric overflow when
>             calculating initial capacity of StringBuilder.
>             With the fix, some slight optimizations were done and a
>             few typos were fixed.
>
>             Would you please help review the fix?
>
>             BUGURL: https://bugs.openjdk.java.net/browse/JDK-8131034
>             WEBREV:
>             http://cr.openjdk.java.net/~igerasim/8131034/00/webrev/
>             <http://cr.openjdk.java.net/%7Eigerasim/8131034/00/webrev/>
>
>         Looks ok.
>
>               /**
>         -     * The pattern is converted to normalizedD form and then
>         a pure group
>         +     * The pattern is converted to normalized form and then a
>         pure group
>                * is constructed to match canonical equivalences of the
>         characters.
>                */
>
>         The "D" in "normalizedD form" may well be short hand for
>         "canonical decomposition" (Normalizer.Form.NFD) as in
>         "normalized canonical decomposition form".
>
>     Hm.  Still looks like a typo to me.
>     A few lines below normalizedPattern has no extra D.
>     Another normalization form, NFKD, also ends with D, so normalizedD
>     wouldn't uniquely identify NFD.
>     grep didn't find other occurrences of normalizedD abbreviation in
>     JDK :)
>
>     Sincerely yours,
>     Ivan
>
>         Paul.
>
>             Sincerely yours,
>             Ivan
>
>
>
>
>




More information about the core-libs-dev mailing list