<i18n dev> RFR: 8331485: Odd Results when Parsing Scientific Notation with Large Exponent [v2]

Justin Lu jlu at openjdk.org
Mon May 6 17:52:07 UTC 2024


On Sun, 5 May 2024 20:51:10 GMT, Axel Hauschulte <duke at openjdk.org> wrote:

>> Justin Lu has updated the pull request incrementally with two additional commits since the last revision:
>> 
>>  - correct other test comment
>>  - reflect review
>
> Hello, I filed [JDK-8331485](https://bugs.openjdk.org/browse/JDK-8331485). Thank you for addressing this bug so quickly.
> 
> I have a thought/concern regarding the handling of exponents that exceed `Long.MAX_VALUE` in this PR:
> 
>> If the value of the exponent exceeds `Long.MAX_VALUE`, the parsed value is equal to the mantissa. Both results are confusing and incorrect.
>> 
>> For example,
>> 
>> ```
>> NumberFormat fmt = NumberFormat.getInstance(Locale.US);
>> fmt.parse(".1E2147483648"); // returns 0.0
>> fmt.parse(".1E9223372036854775808"); // returns 0.1
>> // For comparison
>> Double.parseDouble(".1E2147483648"); // returns Infinity
>> Double.parseDouble(".1E9223372036854775808"); // returns Infinity
>> ```
> 
> The method [`parse(String, ParsePosition)`](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/text/DecimalFormat.html#parse(java.lang.String,java.text.ParsePosition)) uses the [`ParsePosition`](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/text/ParsePosition.html) object as an input and output parameter to determine at what position the parsing should start as well as to communicate up to which position the input string has been consumed during the parsing. (This can be very handy if you use different [`Format`](https://docs.oracle.com/en/java/javase/22/docs/api/java.base/java/text/Format.html)s to parse through a string.)
> 
> For example, if there is a method like this
> 
> static void parseNumber(String s) {
>     NumberFormat numberFormat = NumberFormat.getInstance(Locale.US);
>     ParsePosition parsePosition = new ParsePosition(0);
>     Number parseResult = numberFormat.parse(s, parsePosition);
>     // STR templates are a preview feature in JDK 21/22 (compile and run with --enable-preview)
>     System.out.println(STR."numberFormat.parse(\"\{s}\") -> \{parseResult}; parsePosition: \{parsePosition}");
> }
> 
> `parseNumber("0.123E1XYZ")` will parse the provided string from the beginning to position 7, ignoring the letters at the end of the string. The resulting `Double` value is therefore 1.23 and `parsePosition.getIndex()` returns 7.
> 
> With an exponent that exceeds `Long.MAX_VALUE`, for instance `parseNumber("0.123E9223372036854775808")`, the current implementation of `DecimalFormat` in JDK 22 parses the provided string from the beginning to position 5 and ignores the exponent (because it is too long). The resulting `Double` is therefore 0.123 and `parsePosition.getIndex()` returns 5.
> 
> The solution implemented in this PR would produce a parsing result of `Double.POSITIVE_INFINITY`, however, `parsePosition.getIndex()` would sti...

Hi @ahauschulte, thanks for bringing up your concern, that's a good catch.

As you stated, I agree that the latter solution would be the ideal one here. The parse position index should reflect the actual exponent consumed, even if the exponent exceeds `Long.MAX_VALUE`.

With https://github.com/openjdk/jdk/pull/19075/commits/25782781394dc7a2e0c39605515ab14be41649b0, parsing such an exponent value should now behave this way.
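
For anyone who wants to check the parse position locally, here is a minimal sketch that avoids preview features. The class name `ParsePositionCheck` and the helper `parsePrefix` are hypothetical and not part of the PR; the sketch only uses `NumberFormat.parse(String, ParsePosition)` and compares the resulting index with the input length.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class ParsePositionCheck {

    // Hypothetical helper: parses a number starting at pos and leaves the
    // end of the consumed text in pos (ParsePosition is an in/out parameter).
    static Number parsePrefix(String s, ParsePosition pos) {
        NumberFormat fmt = NumberFormat.getInstance(Locale.US);
        return fmt.parse(s, pos);
    }

    public static void main(String[] args) {
        String[] inputs = {
            "0.123E1XYZ",                // trailing letters are not part of the number
            "0.123E9223372036854775808"  // exponent is Long.MAX_VALUE + 1
        };
        for (String s : inputs) {
            ParsePosition pos = new ParsePosition(0);
            Number n = parsePrefix(s, pos);
            // If the index stops short of the end, part of the input was not consumed.
            boolean consumedAll = pos.getIndex() == s.length();
            System.out.println(s + " -> " + n
                    + ", index=" + pos.getIndex()
                    + ", consumedAll=" + consumedAll);
        }
    }
}
```

For the first input, the value and index should match the ones described in the quoted comment (1.23 and 7). The result for the second input depends on whether the JDK build includes this change, so no expected output is given for it.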

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19075#issuecomment-2096592065

