RFR: 8363972: Loose matching of dash/minusSign in number parsing

Francesco Andreuzzi duke at openjdk.org
Thu Jul 31 21:00:59 UTC 2025


On Thu, 31 Jul 2025 18:41:47 GMT, Naoto Sato <naoto at openjdk.org> wrote:

> Enabling lenient minus sign matching when parsing numbers. In some locales, e.g. Finnish, the default minus sign is the Unicode "Minus Sign" (U+2212), which is not the "Hyphen-Minus" (U+002D) that users type on a keyboard. Thus the parsing of user-entered numbers may fail. This change utilizes CLDR's `parseLenient` element for minus signs and loosely matches them with the hyphen-minus so that user-entered numbers can be parsed. As this is a behavioral change, a corresponding CSR has been drafted.
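
For readers skimming the thread, the loose-matching idea in the description can be sketched as follows. This is only an illustration: `LooseMinus`, `MINUS_LIKE`, and `looselyEquals` are hypothetical names, and the two-character set is a stand-in for whatever CLDR's `parseLenient` data actually contains, not the JDK implementation.

```java
final class LooseMinus {
    // Assumption: an illustrative subset of minus-like characters
    // (U+002D HYPHEN-MINUS and U+2212 MINUS SIGN), not the real CLDR set.
    static final String MINUS_LIKE = "-\u2212";

    // Returns true when the typed character is the locale's minus sign,
    // or when both characters belong to the same lenient set.
    static boolean looselyEquals(char typed, char localeMinus) {
        if (typed == localeMinus) {
            return true;
        }
        return MINUS_LIKE.indexOf(typed) >= 0
                && MINUS_LIKE.indexOf(localeMinus) >= 0;
    }
}
```

Under these assumptions, a typed `-` would be accepted where the locale's symbol is U+2212, while unrelated characters such as `+` would still be rejected.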

make/jdk/src/classes/build/tools/cldrconverter/LDMLParseHandler.java line 851:

> 849:             {
> 850:                 String level = attributes.getValue("level");
> 851:                 if (level != null && level.equals("lenient")) {

This could be slightly simplified, since `String.equals` returns `false` for a `null` argument:
Suggestion:

                if ("lenient".equals(level)) {
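
A small standalone sketch of why the two forms behave the same. The class and method names are hypothetical and exist only for this illustration; `level` may be `null` when the attribute is absent.

```java
final class NullSafeEquals {
    // Original form: explicit null check followed by equals.
    static boolean explicitCheck(String level) {
        return level != null && level.equals("lenient");
    }

    // Suggested form: calling equals on the literal never dereferences
    // `level`, and String.equals(null) is defined to return false.
    static boolean literalFirst(String level) {
        return "lenient".equals(level);
    }
}
```

Both methods return `false` for `null` and for any value other than `"lenient"`, so the change is purely cosmetic.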

src/java.base/share/classes/java/text/DecimalFormat.java line 3526:

> 3524:         }
> 3525: 
> 3526:         if (!parseStrict) {

Possible early return here: inverting the `if` would let you `return text.regionMatches(...)` immediately in the strict case and remove one level of indentation from the large block on lines 3527-3543.
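
Roughly the shape I have in mind, as a sketch only. The method name, its parameters, and the lenient branch are assumptions made for illustration (the actual body of the block is not reproduced here); only the `parseStrict` flag and the `regionMatches` call come from the patch.

```java
final class EarlyReturnSketch {
    // Assumption: illustrative lenient set, not the real CLDR-derived data.
    private static final String LENIENT_MINUS = "-\u2212";

    static boolean matchesMinus(String text, int position,
                                String minusText, boolean parseStrict) {
        // Strict parsing: exact match only, returned immediately.
        if (parseStrict) {
            return text.regionMatches(position, minusText, 0, minusText.length());
        }

        // Lenient parsing now sits at the top level of the method
        // instead of inside an `if (!parseStrict) { ... }` block.
        if (text.regionMatches(position, minusText, 0, minusText.length())) {
            return true;
        }
        return position >= 0
                && position < text.length()
                && minusText.length() == 1
                && LENIENT_MINUS.indexOf(minusText.charAt(0)) >= 0
                && LENIENT_MINUS.indexOf(text.charAt(position)) >= 0;
    }
}
```

The behavior is unchanged by the inversion; the benefit is only that the longer lenient path reads without the extra nesting.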

src/java.base/share/classes/java/text/DecimalFormatSymbols.java line 1002:

> 1000: 
> 1001:         if (loadNumberData(locale) instanceof Object[] d &&
> 1002:             d[0] instanceof String[] numberElements) {

Should the array length be validated here before accessing `d[0]`?
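
If a guard turns out to be warranted, it could be folded into the same condition. The sketch below uses a hypothetical helper that takes the loaded data as a plain `Object`; whether `loadNumberData` can ever return an empty array is not something I have verified.

```java
final class GuardedAccess {
    // Hypothetical helper: `data` stands in for the value returned by
    // loadNumberData(locale). Returns the first number element, or null
    // when the data does not have the expected shape.
    static String firstNumberElement(Object data) {
        if (data instanceof Object[] d &&
            d.length > 0 &&                       // guard before touching d[0]
            d[0] instanceof String[] numberElements &&
            numberElements.length > 0) {
            return numberElements[0];
        }
        return null;
    }
}
```

Without the `d.length > 0` check, an empty array would throw `ArrayIndexOutOfBoundsException` rather than fall through to the non-matching path.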

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246351119
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246365822
PR Review Comment: https://git.openjdk.org/jdk/pull/26580#discussion_r2246361036

