[NEW BUG] NumberFormat.parse fails in some scenarios

Wed Aug 26 14:18:16 UTC 2020

Hi,

A colleague of mine (filipe.silvestrim at innogames.com) approached me today because his code that converts a currency String into cents had stopped working.
Apparently, the code worked with Java 8 but fails with 11+.

import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class Main {

	public static void main(String[] args) {
		// Uncomment to fall back to the legacy JRE locale data:
		// System.setProperty("java.locale.providers", "JRE");
		System.out.println(getPriceInCents(Locale.GERMANY, "9,99 €"));
	}

	static int getPriceInCents(Locale locale, String price) {
		try {
			DecimalFormat format = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
			Number number = format.parse(price);
			// Round instead of truncating: the double product can land just below the integer
			return (int) Math.round(number.doubleValue() * 100);
		} catch (ParseException e) {
			// This is thrown on JDK 9+
			System.out.println(e);
		}
		return 0;
	}

}

After some digging, I think this is caused by the changes made for JDK-8008577[1].
When I set the java.locale.providers property to "JRE", for example, it works again.

My investigation so far suggests that the CLDR number pattern for the currency differs slightly.

I set breakpoints in sun.util.locale.provider.NumberFormatProviderImpl::getInstance() to inspect a few values:

        LocaleProviderAdapter adapter = LocaleProviderAdapter.forType(type);
        String[] numberPatterns = adapter.getLocaleResources(override).getNumberPatterns();
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(override);
        int entry = (choice == INTEGERSTYLE) ? NUMBERSTYLE : choice;
        DecimalFormat format = new DecimalFormat(numberPatterns[entry], symbols);

	// CLDR type
	// #,##0.00 ¤ (numberPatterns[entry])
	// [35,44,35,35,48,46,48,48,-62,-96,-62,-92] (numberPatterns[entry] in bytes)

	// JRE type
	// #,##0.00 ¤;-#,##0.00 ¤ (numberPatterns[entry])
	// [35,44,35,35,48,46,48,48,32,-62,-92,59,45,35,44,35,35,48,46,48,48,32,-62,-92] (numberPatterns[entry] in bytes)
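
For what it's worth, the same pattern can probably be inspected without a debugger through the public API. The following is only a sketch: the class name PatternDump and both helper methods are made up for illustration, and the output depends on which locale provider is active, so I am not quoting any expected output here.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Locale;

class PatternDump {

    // Returns the non-localized currency pattern that the active locale
    // provider supplies for the given locale.
    static String currencyPattern(Locale locale) {
        DecimalFormat format = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
        return format.toPattern();
    }

    // Renders each char as U+XXXX so that U+0020 (space) and U+00A0 (nbsp)
    // can be told apart in the output.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String pattern = currencyPattern(Locale.GERMANY);
        System.out.println(pattern);
        System.out.println(codePoints(pattern));
    }
}
```

Running it once with the default providers and once with -Djava.locale.providers=JRE should make the difference in the separator character visible.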

The JRE pattern includes the negative subpattern, but the more interesting bit is that the spacing differs.
For JRE it is a normal space (the 32), but for CLDR it shows [-62, -96], which is the UTF-8 encoding of a non-breaking space (nbsp, U+00A0).
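
To double-check the byte values, here is a small stand-alone sketch (the class name SpaceBytes and the helper utf8Bytes are made up for illustration) that prints the UTF-8 bytes of the characters in question:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SpaceBytes {

    // Formats the UTF-8 encoding of a string as a list of signed bytes,
    // the same way the byte arrays above are displayed.
    static String utf8Bytes(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes(" "));      // [32]        normal space, U+0020
        System.out.println(utf8Bytes("\u00A0")); // [-62, -96]  no-break space, U+00A0
        System.out.println(utf8Bytes("\u00A4")); // [-62, -92]  currency sign, U+00A4
    }
}
```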

Ultimately, this causes a check in DecimalFormat to fail when parsing the string "9,99 €", which contains a normal space.

            if (gotPositive) {
                // the regionMatches will return false because nbsp != space
                gotPositive = text.regionMatches(position,positiveSuffix,0,
                                                 positiveSuffix.length());
            }

This in turn leads to the following in our case:

        // fail if neither or both
        if (gotPositive == gotNegative) {
            parsePosition.errorIndex = position;
            // We hit this part here which causes the parsing to fail
            return false;
        }
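
The mismatch can be reproduced in isolation. This is only an illustration of the comparison, not the DecimalFormat source; the class SuffixCheck and the method suffixMatches are made-up names, and the suffix strings are written out by hand rather than taken from a real format instance.

```java
class SuffixCheck {

    // Same shape as the suffix comparison quoted above: the text at the
    // given position must match the suffix character for character.
    static boolean suffixMatches(String text, int position, String suffix) {
        return text.regionMatches(position, suffix, 0, suffix.length());
    }

    public static void main(String[] args) {
        String input = "9,99 \u20AC"; // "9,99 €" with a normal space
        int position = 4;             // index right after the digits

        // Suffix with a normal space (as with the JRE data)
        System.out.println(suffixMatches(input, position, " \u20AC"));      // true
        // Suffix with a no-break space (as with the CLDR data)
        System.out.println(suffixMatches(input, position, "\u00A0\u20AC")); // false
    }
}
```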

There are workarounds, e.g. setting java.locale.providers as already mentioned, or calling format.setPositiveSuffix(" €"); to fix this particular case.
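
A somewhat more general variant of the second workaround might look like the sketch below. It is untested beyond this one case and rests on a few assumptions: the names CurrencyParseWorkaround and normalizeSpaces are made up, it treats U+00A0 and U+202F as the only space variants worth mapping, and it only touches the prefixes and suffixes (not grouping separators). It normalizes both the format's affixes and the input to a normal space, so the comparison no longer depends on which provider supplied the pattern.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class CurrencyParseWorkaround {

    // Hypothetical helper: maps no-break spaces to a normal space.
    static String normalizeSpaces(String s) {
        return s.replace('\u00A0', ' ').replace('\u202F', ' ');
    }

    static long getPriceInCents(Locale locale, String price) {
        DecimalFormat format = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
        // Normalize the affixes so they use a normal space regardless of provider.
        format.setPositivePrefix(normalizeSpaces(format.getPositivePrefix()));
        format.setPositiveSuffix(normalizeSpaces(format.getPositiveSuffix()));
        format.setNegativePrefix(normalizeSpaces(format.getNegativePrefix()));
        format.setNegativeSuffix(normalizeSpaces(format.getNegativeSuffix()));
        // Parse into BigDecimal to avoid floating-point rounding of the cents.
        format.setParseBigDecimal(true);
        try {
            Number parsed = format.parse(normalizeSpaces(price));
            if (!(parsed instanceof BigDecimal)) {
                throw new IllegalArgumentException("Not a finite amount: " + price);
            }
            // longValueExact() throws ArithmeticException for fractions of a cent.
            return ((BigDecimal) parsed).movePointRight(2).longValueExact();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse price: " + price, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getPriceInCents(Locale.GERMANY, "9,99 \u20AC"));
    }
}
```

With this, input containing either kind of space should be accepted under both the CLDR and the JRE provider, at the cost of being more lenient than the format itself.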

Is this a bug or a feature or are we missing something?

In case this is an actual bug, we would appreciate a "reported-by" mention in an eventual fix.

Thanks in advance. I do hope you can follow my thoughts in this email.

[1] https://bugs.openjdk.java.net/browse/JDK-8008577

Cheers,
Christoph