<i18n dev> RFR 8177552: Compact Number Formatting support

Wed Nov 28 11:46:42 UTC 2018

Thanks Naoto,

Updated.

http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.04/

Regards,
Nishit Jain
On 27-11-2018 20:55, naoto.sato at oracle.com wrote:
> Hi Nishit,
>
> On 11/26/18 11:11 PM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
>>> Hi Nishit,
>>>
>>> On 11/26/18 12:41 AM, Nishit Jain wrote:
>>>> Hi Naoto,
>>>>
>>>> To add to my previous mail comment, the DecimalFormat spec also 
>>>> says that
>>>>
>>>> /*"DecimalFormat can be instructed to format and parse scientific 
>>>> notation only via a pattern; there is currently no factory method 
>>>> that creates a scientific notation format. In a pattern, the 
>>>> exponent character immediately followed by one or more digit 
>>>> characters indicates scientific notation. "
>>>>
>>>> */That is, exponent formatting and parsing is instructed only via a 
>>>> scientific notation pattern and I think should not be there with 
>>>> *general number* formatting.
>>>
>>> I am not sure the quoted sentence should be interpreted that way. My 
>>> understanding is that the section means there is no public 
>>> NumberFormat.getScientificInstance() method (cf. line 601 at 
>>> NumberFormat.java), so that users will have to use 'E' in their 
>>> pattern string.
>>>
>>> Anyway, my point is that if you prefer to treat the scientific 
>>> notation differently between DecimalFormat and CompactDecimalFormat, 
>>> then it will need to be clarified in the spec. Personally I agree 
>>> that it is not practical to interpret E in the CNF.
>> OK. If it is better to specify the parsing behavior w.r.t. the 
>> exponential numbers, I have added a statement in the parse() API.
>>
>> */"CompactNumberFormat parse does not allow parsing exponential 
>> number strings. For example, parsing a string "1.05E4K" in US locale 
>> breaks at character 'E' and returns 1.05."/*
>
> That's better. I'd replace "exponential number strings" with 
> "scientific notations." You may want to check the final wording with 
> English natives.
>
>>
>> Updated the webrev
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
>>
>> Will also update the CSR and refinalize it.
>
> Thanks.
>
> Naoto
>
>>
>> Regards,
>> Nishit Jain
>>>
>>> Naoto
>>>
>>>>
>>>> Updated webrev based on the other comments
>>>>
>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>>>
>>>>  > Some more comments (all in CompactNumberFormat.java)
>>>>
>>>>  > line 807: expandAffix() seems to treat localizable special 
>>>> pattern characters, but currently the implementation only cares for 
>>>> the minus sign. Should other localizable pattern chars be taken 
>>>> care of, such as percent sign?
>>>> - Other special characters like '%' percent sign are not allowed as 
>>>> per CNF compact pattern spec
>>>>
>>>>  > line 869, 888: Define what -1 means as a ret value.
>>>> - OK.
>>>>
>>>>  > line 897: iterMultiplier be better all capitalized as it is a 
>>>> constant. And it could be statically defined in the class to be 
>>>> shared with other locations that use "10" for arithmetic operation.
>>>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>>>
>>>>  > line 1531: Any possibility this could lead to divide-by-zero?
>>>> - None which I am aware of, unless you are pointing to the issue 
>>>> like JDK-8211161, which we know is not an issue.
>>>>
>>>> Regards,
>>>> Nishit Jain
>>>> On 23-11-2018 15:55, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> > I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>>> should be treated as the exponent without a quote.
>>>>>
>>>>> Personally I don't think that the exponential parsing should be 
>>>>> supported by CompactNumberFormat, because the objective of compact 
>>>>> numbers is to represent numbers in short form. So, parsing of 
>>>>> number format like "1.05E4K" should not be expected from 
>>>>> CompactNumberFormat, I am even doubtful that such forms 
>>>>> ("1.05E4K") are used anywhere where exponential and compact form 
>>>>> are together used. If formatting and parsing of exponential 
>>>>> numbers are needed it should be done by DecimalFormat scientific 
>>>>> instance *not *with the general number instance.So, I don't think 
>>>>> that we should allow parsing of exponential numbers.Comments welcome.
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>>>> Hi Nishit,
>>>>>>
>>>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>>>> Hi Naoto,
>>>>>>>
>>>>>>> Updated the webrev based on suggestions
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>>>
>>>>>>> Changes made:
>>>>>>> - Replaced List<String> with String[] to be added to the the 
>>>>>>> resource bundles
>>>>>>
>>>>>> Good.
>>>>>>
>>>>>>> - refactored DecimalFormat.subparse() to be used by the 
>>>>>>> CNF.parse(), to reduce code duplication.
>>>>>>
>>>>>> I presume CNF is calling package-private methods in DF to share 
>>>>>> the same code. Some comments noting the sharing would be helpful.
>>>>>>
>>>>>>> - Also updated it with other changes as suggested in the comments
>>>>>>
>>>>>> Sorry I missed your question the last time:
>>>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>>> >>> should avoid parsing exponential numbers? Or, should 
>>>>>> CNF.parse() be
>>>>>> >>> modified to be consistent with DecimalFormat.parse() in this 
>>>>>> aspect?
>>>>>>
>>>>>> I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>>>> should be treated as the exponent without a quote.
>>>>>>
>>>>>> Some more comments (all in CompactNumberFormat.java)
>>>>>>
>>>>>> line 807: expandAffix() seems to treat localizable special 
>>>>>> pattern characters, but currently the implementation only cares 
>>>>>> for the minus sign. Should other localizable pattern chars be 
>>>>>> taken care of, such as percent sign?
>>>>>>
>>>>>> line 869, 888: Define what -1 means as a ret value.
>>>>>>
>>>>>> line 897: iterMultiplier be better all capitalized as it is a 
>>>>>> constant. And it could be statically defined in the class to be 
>>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>>>
>>>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>>>
>>>>>> Naoto
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nishit Jain
>>>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>>>> Hi Nishit,
>>>>>>>>
>>>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>>>> Hi Naoto,
>>>>>>>>>
>>>>>>>>> Please check my comments inline.
>>>>>>>>>
>>>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>>>> Hi Nishit,
>>>>>>>>>>
>>>>>>>>>> Here are my comments:
>>>>>>>>>>
>>>>>>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and 
>>>>>>>>>> use Array equivalent instead?
>>>>>>>>> Since the CNF design does not put any limit on the size of 
>>>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using 
>>>>>>>>> SAX parser, it becomes difficult to identify the size of array 
>>>>>>>>> when the parent element of compact pattern is encountered, so 
>>>>>>>>> I think it is better to keep the List<String> while extracting 
>>>>>>>>> the resources.
>>>>>>>>
>>>>>>>> OK. However I'd not keep the List<String> format on generating 
>>>>>>>> the resource bundle, as there is no reason to introduce yet 
>>>>>>>> another bundle format other than the existing array of String.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> - CompactNumberFormat.java
>>>>>>>>>>
>>>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>>>> OK
>>>>>>>>>>
>>>>>>>>>> line 268: The link points to 
>>>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>>>
>>>>>>>>>> line 855: no need to do toString(). length() can detect 
>>>>>>>>>> whether it's empty or not.
>>>>>>>>>>
>>>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>>>>>>> specializing in the "given number" into either long or 
>>>>>>>>>> biginteger.
>>>>>>>>> OK
>>>>>>>>>>
>>>>>>>>>> line 1500: subparseNumber() pretty much shares the same code 
>>>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>>>> The existing CNF.subParseNumber differs in the way 
>>>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse() 
>>>>>>>>> behaviour is unpredictable with parseIntegeronly = true when 
>>>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>>>
>>>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber() 
>>>>>>>>> should *not *parse the exponential notation e.g. while parsing 
>>>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05, 
>>>>>>>>> because 'E' should be considered as unparseable character for 
>>>>>>>>> general number format pattern or compact number pattern, but 
>>>>>>>>> this is not the case with DecimalFormat.parse(). The below 
>>>>>>>>> DecimalFormat general number format instance
>>>>>>>>>
>>>>>>>>> NumberFormat nf = NumberFormat.getNumberInstance();
>>>>>>>>> nf.parse("1.05E4")
>>>>>>>>>
>>>>>>>>> Successfully parse the string and returns 10500. The same 
>>>>>>>>> behaviour is there with other DecimalFormat instances also 
>>>>>>>>> e.g. currency instance.
>>>>>>>>>
>>>>>>>>> Do you think this is an issue with DecimalFormat.parse() and 
>>>>>>>>> CNF should avoid parsing exponential numbers? Or, should 
>>>>>>>>> CNF.parse() be modified to be consistent with 
>>>>>>>>> DecimalFormat.parse() in this aspect?
>>>>>>>>
>>>>>>>> No, I understand there are differences. But I see a lot of 
>>>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply 
>>>>>>>>>> calls super. No need to override them.
>>>>>>>>> Since setters are overridden, I think that it is better to 
>>>>>>>>> override getters also (even if they are just calling super and 
>>>>>>>>> have same javadoc) to keep them at same level. But, if you see 
>>>>>>>>> no point in keeping them in CNF, I will remove them. Does that 
>>>>>>>>> need CSR change?
>>>>>>>>
>>>>>>>> I don't see any point for override. I don't think there needs a 
>>>>>>>> CSR, but better ask Joe about it.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>>>>>>> ClassCastException may be thrown.
>>>>>>>>> The type is checked in the superclass equals method getClass() 
>>>>>>>>> != obj.getClass(), so I think there is no need to check the 
>>>>>>>>> type here.
>>>>>>>>
>>>>>>>> OK.
>>>>>>>>
>>>>>>>> Naoto
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nishit Jain
>>>>>>>>>>
>>>>>>>>>> Naoto
>>>>>>>>>>
>>>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Please review this non trivial feature addition to 
>>>>>>>>>>> NumberFormat API.
>>>>>>>>>>>
>>>>>>>>>>> The existing NumberFormat API provides locale based support 
>>>>>>>>>>> for formatting and parsing numbers which includes formatting 
>>>>>>>>>>> decimal, percent, currency etc, but the support for 
>>>>>>>>>>> formatting a number into a human readable or compact form is 
>>>>>>>>>>> missing. This RFE adds that feature to format a decimal 
>>>>>>>>>>> number in a compact format (e.g. 1000 -> 1K, 1000000 -> 1M 
>>>>>>>>>>> in en_US locale) , which is useful for the environment where 
>>>>>>>>>>> display space is limited, so that the formatted string can 
>>>>>>>>>>> be displayed in that limited space. It is defined by LDML's 
>>>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>>>
>>>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>>>> Webrev: 
>>>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>>>
>>>>>>>>>>> Request to please help review the the change.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Nishit Jain
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>