RFR: JDK-8325898: ChoiceFormat returns erroneous result when formatting bad pattern
Justin Lu
jlu at openjdk.org
Thu Feb 22 17:41:53 UTC 2024
On Tue, 20 Feb 2024 23:14:26 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> Please review this PR which handles an edge case pattern bug with ChoiceFormat.
>>
>>
>> var d = new ChoiceFormat("0#foo|1#bar|baz|") // creates cFmt equivalent to "0.0#foo|1.0#bar|1.0#"
>> d.format(1) // unexpectedly returns ""
>>
>>
>> Not only does this lead to faulty formatting results, but breaks the ChoiceFormat class invariant of duplicate limits.
>>
>> It would have been preferable if either an exception was thrown when the ChoiceFormat was initially created, or the ChoiceFormat formatting 1 returned a value of "bar".
>>
>> For comparison,
>>
>> var d = new ChoiceFormat("0#foo|1#bar|baz") // creates cFmt equivalent to "0.0#foo|1.0#bar"
>> d.format(1) // returns "bar"
>>
>>
>> After this change, the first code snippet now returns "bar" when formatting 1 and discards the "baz" as the second code snippet does.
>>
>>
>> This PR includes a lot of cleanup to the applyPattern implementation, the specific fix for this bug is addressed with the following, on line 305,
>>
>> if (seg != Segment.FORMAT) {
>> // Discard incorrect portion and finish building cFmt
>> break;
>> }
>>
>> This prevents a limit/format entry from being built when the current parsing mode has not yet processed the limit and relation. The previous implementation would build an additional limit/format of 1 with an empty string. The `break` allows for the discarding of the incorrect portion and continuing. Alternatively an exception could have been thrown. For consistency with the second code snippet, the former was picked.
>>
>> https://github.com/openjdk/jdk/pull/17856 may be of interest, which is a semi-related specification fix.
>
> src/java.base/share/classes/java/text/ChoiceFormat.java line 332:
>
>> 330:
>> 331: // Used to explicitly define the segment mode while applying a pattern
>> 332: private enum Segment {
>
> Do we need a new enum? Would introducing a new enum solve something that cannot be done with the existing implementation?
No, we don't _need_ a new enum, although I prefer it for readability.
For example, I think the following is more clear:
- `Segment.LIMIT.bldr.toString()`over `segments[1].toString();`.
- `seg = Segment.LIMIT;` over `part = 0;`
Can also no longer do something such as `part = 2;` even if unlikely.
Although I can see the argument of not wanting an enum as `part` is either only `0` or `1`. Either is OK with me.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17883#discussion_r1499626656
More information about the core-libs-dev
mailing list