<i18n dev> RFR 8177552: Compact Number Formatting support

Mon Nov 26 08:41:50 UTC 2018

Hi Naoto,

To add to my previous mail comment, the DecimalFormat spec also says that

/*"DecimalFormat can be instructed to format and parse scientific 
notation only via a pattern; there is currently no factory method that 
creates a scientific notation format. In a pattern, the exponent 
character immediately followed by one or more digit characters indicates 
scientific notation. "

*/That is, exponent formatting and parsing is instructed only via a 
scientific notation pattern and I think should not be there with 
*general number* formatting.

Updated webrev based on the other comments

http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/

 > Some more comments (all in CompactNumberFormat.java)

 > line 807: expandAffix() seems to treat localizable special pattern 
characters, but currently the implementation only cares for the minus 
sign. Should other localizable pattern chars be taken care of, such as 
percent sign?
- Other special characters like '%' percent sign are not allowed as per 
CNF compact pattern spec

 > line 869, 888: Define what -1 means as a ret value.
- OK.

 > line 897: iterMultiplier be better all capitalized as it is a 
constant. And it could be statically defined in the class to be shared 
with other locations that use "10" for arithmetic operation.
- OK, made it static final and renamed it as RANGE_MULTIPLIER

 > line 1531: Any possibility this could lead to divide-by-zero?
- None which I am aware of, unless you are pointing to the issue like 
JDK-8211161, which we know is not an issue.

Regards,
Nishit Jain
On 23-11-2018 15:55, Nishit Jain wrote:
> Hi Naoto,
>
> > I think DecimalFormat and CNF should behave the same, ie. 'E' should 
> be treated as the exponent without a quote.
>
> Personally I don't think that the exponential parsing should be 
> supported by CompactNumberFormat, because the objective of compact 
> numbers is to represent numbers in short form. So, parsing of number 
> format like "1.05E4K" should not be expected from CompactNumberFormat, 
> I am even doubtful that such forms ("1.05E4K") are used anywhere where 
> exponential and compact form are together used. If formatting and 
> parsing of exponential numbers are needed it should be done by 
> DecimalFormat scientific instance *not *with the general number 
> instance.So, I don't think that we should allow parsing of exponential 
> numbers.Comments welcome.
>
> Regards,
> Nishit Jain
> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>> Hi Nishit,
>>
>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> Updated the webrev based on suggestions
>>>
>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>
>>> Changes made:
>>> - Replaced List<String> with String[] to be added to the the 
>>> resource bundles
>>
>> Good.
>>
>>> - refactored DecimalFormat.subparse() to be used by the CNF.parse(), 
>>> to reduce code duplication.
>>
>> I presume CNF is calling package-private methods in DF to share the 
>> same code. Some comments noting the sharing would be helpful.
>>
>>> - Also updated it with other changes as suggested in the comments
>>
>> Sorry I missed your question the last time:
>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>> >>> should avoid parsing exponential numbers? Or, should CNF.parse() be
>> >>> modified to be consistent with DecimalFormat.parse() in this aspect?
>>
>> I think DecimalFormat and CNF should behave the same, ie. 'E' should 
>> be treated as the exponent without a quote.
>>
>> Some more comments (all in CompactNumberFormat.java)
>>
>> line 807: expandAffix() seems to treat localizable special pattern 
>> characters, but currently the implementation only cares for the minus 
>> sign. Should other localizable pattern chars be taken care of, such 
>> as percent sign?
>>
>> line 869, 888: Define what -1 means as a ret value.
>>
>> line 897: iterMultiplier be better all capitalized as it is a 
>> constant. And it could be statically defined in the class to be 
>> shared with other locations that use "10" for arithmetic operation.
>>
>> line 1531: Any possibility this could lead to divide-by-zero?
>>
>> Naoto
>>
>>>
>>> Regards,
>>> Nishit Jain
>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>> Hi Nishit,
>>>>
>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> Please check my comments inline.
>>>>>
>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>> Hi Nishit,
>>>>>>
>>>>>> Here are my comments:
>>>>>>
>>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>>> List<String>, can we eliminate stringListEntry/Element, and use 
>>>>>> Array equivalent instead?
>>>>> Since the CNF design does not put any limit on the size of compact 
>>>>> pattern, so at the time of parsing the CLDR xmls using SAX parser, 
>>>>> it becomes difficult to identify the size of array when the parent 
>>>>> element of compact pattern is encountered, so I think it is better 
>>>>> to keep the List<String> while extracting the resources.
>>>>
>>>> OK. However I'd not keep the List<String> format on generating the 
>>>> resource bundle, as there is no reason to introduce yet another 
>>>> bundle format other than the existing array of String.
>>>>
>>>>>>
>>>>>> - CompactNumberFormat.java
>>>>>>
>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>> OK
>>>>>>
>>>>>> line 268: The link points to 
>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>> OK. Changed it at line 165 also.
>>>>>>
>>>>>> line 855: no need to do toString(). length() can detect whether 
>>>>>> it's empty or not.
>>>>>>
>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>>> specializing in the "given number" into either long or biginteger.
>>>>> OK
>>>>>>
>>>>>> line 1500: subparseNumber() pretty much shares the same code with 
>>>>>> DecimalFormat.subparse(). can they be merged?
>>>>> The existing CNF.subParseNumber differs in the way 
>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse() 
>>>>> behaviour is unpredictable with parseIntegeronly = true when 
>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>
>>>>> Also, I had thought that the CNF.parse()/subparseNumber() should 
>>>>> *not *parse the exponential notation e.g. while parsing "1.05E4K" 
>>>>> the parsing should break at 'E' and returns 1.05, because 'E' 
>>>>> should be considered as unparseable character for general number 
>>>>> format pattern or compact number pattern, but this is not the case 
>>>>> with DecimalFormat.parse(). The below DecimalFormat general number 
>>>>> format instance
>>>>>
>>>>> NumberFormat nf =  NumberFormat.getNumberInstance();
>>>>> nf.parse("1.05E4")
>>>>>
>>>>> Successfully parse the string and returns 10500. The same 
>>>>> behaviour is there with other DecimalFormat instances also e.g. 
>>>>> currency instance.
>>>>>
>>>>> Do you think this is an issue with DecimalFormat.parse() and CNF 
>>>>> should avoid parsing exponential numbers? Or, should CNF.parse() 
>>>>> be modified to be consistent with DecimalFormat.parse() in this 
>>>>> aspect?
>>>>
>>>> No, I understand there are differences. But I see a lot of 
>>>> duplicated piece of code which I would like to eliminate.
>>>>
>>>>>
>>>>>>
>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply calls 
>>>>>> super. No need to override them.
>>>>> Since setters are overridden, I think that it is better to 
>>>>> override getters also (even if they are just calling super and 
>>>>> have same javadoc) to keep them at same level. But, if you see no 
>>>>> point in keeping them in CNF, I will remove them. Does that need 
>>>>> CSR change?
>>>>
>>>> I don't see any point for override. I don't think there needs a 
>>>> CSR, but better ask Joe about it.
>>>>
>>>>>>
>>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>>> ClassCastException may be thrown.
>>>>> The type is checked in the superclass equals method getClass() != 
>>>>> obj.getClass(), so I think there is no need to check the type here.
>>>>
>>>> OK.
>>>>
>>>> Naoto
>>>>
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>>>
>>>>>> Naoto
>>>>>>
>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please review this non trivial feature addition to NumberFormat 
>>>>>>> API.
>>>>>>>
>>>>>>> The existing NumberFormat API provides locale based support for 
>>>>>>> formatting and parsing numbers which includes formatting 
>>>>>>> decimal, percent, currency etc, but the support for formatting a 
>>>>>>> number into a human readable or compact form is missing. This 
>>>>>>> RFE adds that feature to format a decimal number in a compact 
>>>>>>> format (e.g. 1000 -> 1K, 1000000 -> 1M in en_US locale) , which 
>>>>>>> is useful for the environment where display space is limited, so 
>>>>>>> that the formatted string can be displayed in that limited 
>>>>>>> space. It is defined by LDML's specification for Compact Number 
>>>>>>> Formats.
>>>>>>>
>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>> Webrev: 
>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>
>>>>>>> Request to please help review the the change.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nishit Jain
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>