<i18n dev> RFR 8177552: Compact Number Formatting support

Wed Nov 28 14:02:21 UTC 2018

Looks good to me.

Naoto

On 11/28/18 3:46 AM, Nishit Jain wrote:
> Thanks Naoto,
> 
> Updated.
> 
> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.04/
> 
> Regards,
> Nishit Jain
> On 27-11-2018 20:55, naoto.sato at oracle.com wrote:
>> Hi Nishit,
>>
>> On 11/26/18 11:11 PM, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
>>>> Hi Nishit,
>>>>
>>>> On 11/26/18 12:41 AM, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> To add to my previous mail comment, the DecimalFormat spec also 
>>>>> says that
>>>>>
>>>>> /*"DecimalFormat can be instructed to format and parse scientific 
>>>>> notation only via a pattern; there is currently no factory method 
>>>>> that creates a scientific notation format. In a pattern, the 
>>>>> exponent character immediately followed by one or more digit 
>>>>> characters indicates scientific notation. "
>>>>>
>>>>> */That is, exponent formatting and parsing is instructed only via a 
>>>>> scientific notation pattern and I think should not be there with 
>>>>> *general number* formatting.
>>>>
>>>> I am not sure the quoted sentence should be interpreted that way. My 
>>>> understanding is that the section means there is no public 
>>>> NumberFormat.getScientificInstance() method (cf. line 601 at 
>>>> NumberFormat.java), so that users will have to use 'E' in their 
>>>> pattern string.
>>>>
>>>> Anyway, my point is that if you prefer to treat the scientific 
>>>> notation differently between DecimalFormat and CompactDecimalFormat, 
>>>> then it will need to be clarified in the spec. Personally I agree 
>>>> that it is not practical to interpret E in the CNF.
>>> OK. If it is better to specify the parsing behavior w.r.t. the 
>>> exponential numbers, I have added a statement in the parse() API.
>>>
>>> */"CompactNumberFormat parse does not allow parsing exponential 
>>> number strings. For example, parsing a string "1.05E4K" in US locale 
>>> breaks at character 'E' and returns 1.05."/*
>>
>> That's better. I'd replace "exponential number strings" with 
>> "scientific notations." You may want to check the final wording with 
>> English natives.
>>
>>>
>>> Updated the webrev
>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
>>>
>>> Will also update the CSR and refinalize it.
>>
>> Thanks.
>>
>> Naoto
>>
>>>
>>> Regards,
>>> Nishit Jain
>>>>
>>>> Naoto
>>>>
>>>>>
>>>>> Updated webrev based on the other comments
>>>>>
>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>>>>
>>>>>  > Some more comments (all in CompactNumberFormat.java)
>>>>>
>>>>>  > line 807: expandAffix() seems to treat localizable special 
>>>>> pattern characters, but currently the implementation only cares for 
>>>>> the minus sign. Should other localizable pattern chars be taken 
>>>>> care of, such as percent sign?
>>>>> - Other special characters like '%' percent sign are not allowed as 
>>>>> per CNF compact pattern spec
>>>>>
>>>>>  > line 869, 888: Define what -1 means as a ret value.
>>>>> - OK.
>>>>>
>>>>>  > line 897: iterMultiplier be better all capitalized as it is a 
>>>>> constant. And it could be statically defined in the class to be 
>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>>>>
>>>>>  > line 1531: Any possibility this could lead to divide-by-zero?
>>>>> - None which I am aware of, unless you are pointing to the issue 
>>>>> like JDK-8211161, which we know is not an issue.
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 23-11-2018 15:55, Nishit Jain wrote:
>>>>>> Hi Naoto,
>>>>>>
>>>>>> > I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>>>> should be treated as the exponent without a quote.
>>>>>>
>>>>>> Personally I don't think that the exponential parsing should be 
>>>>>> supported by CompactNumberFormat, because the objective of compact 
>>>>>> numbers is to represent numbers in short form. So, parsing of 
>>>>>> number format like "1.05E4K" should not be expected from 
>>>>>> CompactNumberFormat, I am even doubtful that such forms 
>>>>>> ("1.05E4K") are used anywhere where exponential and compact form 
>>>>>> are together used. If formatting and parsing of exponential 
>>>>>> numbers are needed it should be done by DecimalFormat scientific 
>>>>>> instance *not *with the general number instance.So, I don't think 
>>>>>> that we should allow parsing of exponential numbers.Comments welcome.
>>>>>>
>>>>>> Regards,
>>>>>> Nishit Jain
>>>>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>>>>> Hi Nishit,
>>>>>>>
>>>>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>>>>> Hi Naoto,
>>>>>>>>
>>>>>>>> Updated the webrev based on suggestions
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>>>>
>>>>>>>> Changes made:
>>>>>>>> - Replaced List<String> with String[] to be added to the the 
>>>>>>>> resource bundles
>>>>>>>
>>>>>>> Good.
>>>>>>>
>>>>>>>> - refactored DecimalFormat.subparse() to be used by the 
>>>>>>>> CNF.parse(), to reduce code duplication.
>>>>>>>
>>>>>>> I presume CNF is calling package-private methods in DF to share 
>>>>>>> the same code. Some comments noting the sharing would be helpful.
>>>>>>>
>>>>>>>> - Also updated it with other changes as suggested in the comments
>>>>>>>
>>>>>>> Sorry I missed your question the last time:
>>>>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>>>> >>> should avoid parsing exponential numbers? Or, should 
>>>>>>> CNF.parse() be
>>>>>>> >>> modified to be consistent with DecimalFormat.parse() in this 
>>>>>>> aspect?
>>>>>>>
>>>>>>> I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>>>>> should be treated as the exponent without a quote.
>>>>>>>
>>>>>>> Some more comments (all in CompactNumberFormat.java)
>>>>>>>
>>>>>>> line 807: expandAffix() seems to treat localizable special 
>>>>>>> pattern characters, but currently the implementation only cares 
>>>>>>> for the minus sign. Should other localizable pattern chars be 
>>>>>>> taken care of, such as percent sign?
>>>>>>>
>>>>>>> line 869, 888: Define what -1 means as a ret value.
>>>>>>>
>>>>>>> line 897: iterMultiplier be better all capitalized as it is a 
>>>>>>> constant. And it could be statically defined in the class to be 
>>>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>>>>
>>>>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>>>>
>>>>>>> Naoto
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nishit Jain
>>>>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>>>>> Hi Nishit,
>>>>>>>>>
>>>>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>>>>> Hi Naoto,
>>>>>>>>>>
>>>>>>>>>> Please check my comments inline.
>>>>>>>>>>
>>>>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>>>>> Hi Nishit,
>>>>>>>>>>>
>>>>>>>>>>> Here are my comments:
>>>>>>>>>>>
>>>>>>>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and 
>>>>>>>>>>> use Array equivalent instead?
>>>>>>>>>> Since the CNF design does not put any limit on the size of 
>>>>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using 
>>>>>>>>>> SAX parser, it becomes difficult to identify the size of array 
>>>>>>>>>> when the parent element of compact pattern is encountered, so 
>>>>>>>>>> I think it is better to keep the List<String> while extracting 
>>>>>>>>>> the resources.
>>>>>>>>>
>>>>>>>>> OK. However I'd not keep the List<String> format on generating 
>>>>>>>>> the resource bundle, as there is no reason to introduce yet 
>>>>>>>>> another bundle format other than the existing array of String.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> - CompactNumberFormat.java
>>>>>>>>>>>
>>>>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>>>>> OK
>>>>>>>>>>>
>>>>>>>>>>> line 268: The link points to 
>>>>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>>>>
>>>>>>>>>>> line 855: no need to do toString(). length() can detect 
>>>>>>>>>>> whether it's empty or not.
>>>>>>>>>>>
>>>>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>>>>>>>> specializing in the "given number" into either long or 
>>>>>>>>>>> biginteger.
>>>>>>>>>> OK
>>>>>>>>>>>
>>>>>>>>>>> line 1500: subparseNumber() pretty much shares the same code 
>>>>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>>>>> The existing CNF.subParseNumber differs in the way 
>>>>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse() 
>>>>>>>>>> behaviour is unpredictable with parseIntegeronly = true when 
>>>>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>>>>
>>>>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber() 
>>>>>>>>>> should *not *parse the exponential notation e.g. while parsing 
>>>>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05, 
>>>>>>>>>> because 'E' should be considered as unparseable character for 
>>>>>>>>>> general number format pattern or compact number pattern, but 
>>>>>>>>>> this is not the case with DecimalFormat.parse(). The below 
>>>>>>>>>> DecimalFormat general number format instance
>>>>>>>>>>
>>>>>>>>>> NumberFormat nf = NumberFormat.getNumberInstance();
>>>>>>>>>> nf.parse("1.05E4")
>>>>>>>>>>
>>>>>>>>>> Successfully parse the string and returns 10500. The same 
>>>>>>>>>> behaviour is there with other DecimalFormat instances also 
>>>>>>>>>> e.g. currency instance.
>>>>>>>>>>
>>>>>>>>>> Do you think this is an issue with DecimalFormat.parse() and 
>>>>>>>>>> CNF should avoid parsing exponential numbers? Or, should 
>>>>>>>>>> CNF.parse() be modified to be consistent with 
>>>>>>>>>> DecimalFormat.parse() in this aspect?
>>>>>>>>>
>>>>>>>>> No, I understand there are differences. But I see a lot of 
>>>>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply 
>>>>>>>>>>> calls super. No need to override them.
>>>>>>>>>> Since setters are overridden, I think that it is better to 
>>>>>>>>>> override getters also (even if they are just calling super and 
>>>>>>>>>> have same javadoc) to keep them at same level. But, if you see 
>>>>>>>>>> no point in keeping them in CNF, I will remove them. Does that 
>>>>>>>>>> need CSR change?
>>>>>>>>>
>>>>>>>>> I don't see any point for override. I don't think there needs a 
>>>>>>>>> CSR, but better ask Joe about it.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>>>>>>>> ClassCastException may be thrown.
>>>>>>>>>> The type is checked in the superclass equals method getClass() 
>>>>>>>>>> != obj.getClass(), so I think there is no need to check the 
>>>>>>>>>> type here.
>>>>>>>>>
>>>>>>>>> OK.
>>>>>>>>>
>>>>>>>>> Naoto
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nishit Jain
>>>>>>>>>>>
>>>>>>>>>>> Naoto
>>>>>>>>>>>
>>>>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this non trivial feature addition to 
>>>>>>>>>>>> NumberFormat API.
>>>>>>>>>>>>
>>>>>>>>>>>> The existing NumberFormat API provides locale based support 
>>>>>>>>>>>> for formatting and parsing numbers which includes formatting 
>>>>>>>>>>>> decimal, percent, currency etc, but the support for 
>>>>>>>>>>>> formatting a number into a human readable or compact form is 
>>>>>>>>>>>> missing. This RFE adds that feature to format a decimal 
>>>>>>>>>>>> number in a compact format (e.g. 1000 -> 1K, 1000000 -> 1M 
>>>>>>>>>>>> in en_US locale) , which is useful for the environment where 
>>>>>>>>>>>> display space is limited, so that the formatted string can 
>>>>>>>>>>>> be displayed in that limited space. It is defined by LDML's 
>>>>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>>>>
>>>>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>>>>> Webrev: 
>>>>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>>>>
>>>>>>>>>>>> Request to please help review the the change.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Nishit Jain
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>