<i18n dev> RFR 8177552: Compact Number Formatting support

Fri Nov 23 10:25:14 UTC 2018

Hi Naoto,

 > I think DecimalFormat and CNF should behave the same, ie. 'E' should 
be treated as the exponent without a quote.

Personally I don't think that the exponential parsing should be 
supported by CompactNumberFormat, because the objective of compact 
numbers is to represent numbers in short form. So, parsing of number 
format like "1.05E4K" should not be expected from CompactNumberFormat, I 
am even doubtful that such forms ("1.05E4K") are used anywhere where 
exponential and compact form are together used. If formatting and 
parsing of exponential numbers are needed it should be done by 
DecimalFormat scientific instance *not *with the general number 
instance.So, I don't think that we should allow parsing of exponential 
numbers.Comments welcome.

Regards,
Nishit Jain
On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
> Hi Nishit,
>
> On 11/21/18 12:53 AM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> Updated the webrev based on suggestions
>>
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>
>> Changes made:
>> - Replaced List<String> with String[] to be added to the the resource 
>> bundles
>
> Good.
>
>> - refactored DecimalFormat.subparse() to be used by the CNF.parse(), 
>> to reduce code duplication.
>
> I presume CNF is calling package-private methods in DF to share the 
> same code. Some comments noting the sharing would be helpful.
>
>> - Also updated it with other changes as suggested in the comments
>
> Sorry I missed your question the last time:
> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
> >>> should avoid parsing exponential numbers? Or, should CNF.parse() be
> >>> modified to be consistent with DecimalFormat.parse() in this aspect?
>
> I think DecimalFormat and CNF should behave the same, ie. 'E' should 
> be treated as the exponent without a quote.
>
> Some more comments (all in CompactNumberFormat.java)
>
> line 807: expandAffix() seems to treat localizable special pattern 
> characters, but currently the implementation only cares for the minus 
> sign. Should other localizable pattern chars be taken care of, such as 
> percent sign?
>
> line 869, 888: Define what -1 means as a ret value.
>
> line 897: iterMultiplier be better all capitalized as it is a 
> constant. And it could be statically defined in the class to be shared 
> with other locations that use "10" for arithmetic operation.
>
> line 1531: Any possibility this could lead to divide-by-zero?
>
> Naoto
>
>>
>> Regards,
>> Nishit Jain
>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>> Hi Nishit,
>>>
>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>> Hi Naoto,
>>>>
>>>> Please check my comments inline.
>>>>
>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>> Hi Nishit,
>>>>>
>>>>> Here are my comments:
>>>>>
>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>> List<String>, can we eliminate stringListEntry/Element, and use 
>>>>> Array equivalent instead?
>>>> Since the CNF design does not put any limit on the size of compact 
>>>> pattern, so at the time of parsing the CLDR xmls using SAX parser, 
>>>> it becomes difficult to identify the size of array when the parent 
>>>> element of compact pattern is encountered, so I think it is better 
>>>> to keep the List<String> while extracting the resources.
>>>
>>> OK. However I'd not keep the List<String> format on generating the 
>>> resource bundle, as there is no reason to introduce yet another 
>>> bundle format other than the existing array of String.
>>>
>>>>>
>>>>> - CompactNumberFormat.java
>>>>>
>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>> OK
>>>>>
>>>>> line 268: The link points to 
>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>> OK. Changed it at line 165 also.
>>>>>
>>>>> line 855: no need to do toString(). length() can detect whether 
>>>>> it's empty or not.
>>>>>
>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>> specializing in the "given number" into either long or biginteger.
>>>> OK
>>>>>
>>>>> line 1500: subparseNumber() pretty much shares the same code with 
>>>>> DecimalFormat.subparse(). can they be merged?
>>>> The existing CNF.subParseNumber differs in the way parseIntegerOnly 
>>>> is handled, DecimalFormat.parse()/subparse() behaviour is 
>>>> unpredictable with parseIntegeronly = true when multipliers are 
>>>> involved (Please see JDK-8199223).
>>>>
>>>> Also, I had thought that the CNF.parse()/subparseNumber() should 
>>>> *not *parse the exponential notation e.g. while parsing "1.05E4K" 
>>>> the parsing should break at 'E' and returns 1.05, because 'E' 
>>>> should be considered as unparseable character for general number 
>>>> format pattern or compact number pattern, but this is not the case 
>>>> with DecimalFormat.parse(). The below DecimalFormat general number 
>>>> format instance
>>>>
>>>> NumberFormat nf =  NumberFormat.getNumberInstance();
>>>> nf.parse("1.05E4")
>>>>
>>>> Successfully parse the string and returns 10500. The same behaviour 
>>>> is there with other DecimalFormat instances also e.g. currency 
>>>> instance.
>>>>
>>>> Do you think this is an issue with DecimalFormat.parse() and CNF 
>>>> should avoid parsing exponential numbers? Or, should CNF.parse() be 
>>>> modified to be consistent with DecimalFormat.parse() in this aspect?
>>>
>>> No, I understand there are differences. But I see a lot of 
>>> duplicated piece of code which I would like to eliminate.
>>>
>>>>
>>>>>
>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply calls 
>>>>> super. No need to override them.
>>>> Since setters are overridden, I think that it is better to override 
>>>> getters also (even if they are just calling super and have same 
>>>> javadoc) to keep them at same level. But, if you see no point in 
>>>> keeping them in CNF, I will remove them. Does that need CSR change?
>>>
>>> I don't see any point for override. I don't think there needs a CSR, 
>>> but better ask Joe about it.
>>>
>>>>>
>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>> ClassCastException may be thrown.
>>>> The type is checked in the superclass equals method getClass() != 
>>>> obj.getClass(), so I think there is no need to check the type here.
>>>
>>> OK.
>>>
>>> Naoto
>>>
>>>>
>>>> Regards,
>>>> Nishit Jain
>>>>>
>>>>> Naoto
>>>>>
>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please review this non trivial feature addition to NumberFormat API.
>>>>>>
>>>>>> The existing NumberFormat API provides locale based support for 
>>>>>> formatting and parsing numbers which includes formatting decimal, 
>>>>>> percent, currency etc, but the support for formatting a number 
>>>>>> into a human readable or compact form is missing. This RFE adds 
>>>>>> that feature to format a decimal number in a compact format (e.g. 
>>>>>> 1000 -> 1K, 1000000 -> 1M in en_US locale) , which is useful for 
>>>>>> the environment where display space is limited, so that the 
>>>>>> formatted string can be displayed in that limited space. It is 
>>>>>> defined by LDML's specification for Compact Number Formats.
>>>>>>
>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>
>>>>>>
>>>>>>
>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>> Webrev: 
>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>
>>>>>> Request to please help review the the change.
>>>>>>
>>>>>> Regards,
>>>>>> Nishit Jain
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>