<i18n dev> RFR 8177552: Compact Number Formatting support

Nishit Jain nishit.jain at oracle.com
Tue Nov 27 07:11:55 UTC 2018


Hi Naoto,

On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
> Hi Nishit,
>
> On 11/26/18 12:41 AM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> To add to my previous mail comment, the DecimalFormat spec also says 
>> that
>>
>> /*"DecimalFormat can be instructed to format and parse scientific 
>> notation only via a pattern; there is currently no factory method 
>> that creates a scientific notation format. In a pattern, the exponent 
>> character immediately followed by one or more digit characters 
>> indicates scientific notation. "
>>
>> */That is, exponent formatting and parsing is instructed only via a 
>> scientific notation pattern and I think should not be there with 
>> *general number* formatting.
>
> I am not sure the quoted sentence should be interpreted that way. My 
> understanding is that the section means there is no public 
> NumberFormat.getScientificInstance() method (cf. line 601 at 
> NumberFormat.java), so that users will have to use 'E' in their 
> pattern string.
>
> Anyway, my point is that if you prefer to treat the scientific 
> notation differently between DecimalFormat and CompactDecimalFormat, 
> then it will need to be clarified in the spec. Personally I agree that 
> it is not practical to interpret E in the CNF.
OK. If it is better to specify the parsing behavior w.r.t. the 
exponential numbers, I have added a statement in the parse() API.

*/"CompactNumberFormat parse does not allow parsing exponential number 
strings. For example, parsing a string "1.05E4K" in US locale breaks at 
character 'E' and returns 1.05."/*

Updated the webrev
http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/

Will also update the CSR and refinalize it.

Regards,
Nishit Jain
>
> Naoto
>
>>
>> Updated webrev based on the other comments
>>
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>
>>  > Some more comments (all in CompactNumberFormat.java)
>>
>>  > line 807: expandAffix() seems to treat localizable special pattern 
>> characters, but currently the implementation only cares for the minus 
>> sign. Should other localizable pattern chars be taken care of, such 
>> as percent sign?
>> - Other special characters like '%' percent sign are not allowed as 
>> per CNF compact pattern spec
>>
>>  > line 869, 888: Define what -1 means as a ret value.
>> - OK.
>>
>>  > line 897: iterMultiplier be better all capitalized as it is a 
>> constant. And it could be statically defined in the class to be 
>> shared with other locations that use "10" for arithmetic operation.
>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>
>>  > line 1531: Any possibility this could lead to divide-by-zero?
>> - None which I am aware of, unless you are pointing to the issue like 
>> JDK-8211161, which we know is not an issue.
>>
>> Regards,
>> Nishit Jain
>> On 23-11-2018 15:55, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> > I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>> should be treated as the exponent without a quote.
>>>
>>> Personally I don't think that the exponential parsing should be 
>>> supported by CompactNumberFormat, because the objective of compact 
>>> numbers is to represent numbers in short form. So, parsing of number 
>>> format like "1.05E4K" should not be expected from 
>>> CompactNumberFormat, I am even doubtful that such forms ("1.05E4K") 
>>> are used anywhere where exponential and compact form are together 
>>> used. If formatting and parsing of exponential numbers are needed it 
>>> should be done by DecimalFormat scientific instance *not *with the 
>>> general number instance.So, I don't think that we should allow 
>>> parsing of exponential numbers.Comments welcome.
>>>
>>> Regards,
>>> Nishit Jain
>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>> Hi Nishit,
>>>>
>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> Updated the webrev based on suggestions
>>>>>
>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>
>>>>> Changes made:
>>>>> - Replaced List<String> with String[] to be added to the the 
>>>>> resource bundles
>>>>
>>>> Good.
>>>>
>>>>> - refactored DecimalFormat.subparse() to be used by the 
>>>>> CNF.parse(), to reduce code duplication.
>>>>
>>>> I presume CNF is calling package-private methods in DF to share the 
>>>> same code. Some comments noting the sharing would be helpful.
>>>>
>>>>> - Also updated it with other changes as suggested in the comments
>>>>
>>>> Sorry I missed your question the last time:
>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>> >>> should avoid parsing exponential numbers? Or, should 
>>>> CNF.parse() be
>>>> >>> modified to be consistent with DecimalFormat.parse() in this 
>>>> aspect?
>>>>
>>>> I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>> should be treated as the exponent without a quote.
>>>>
>>>> Some more comments (all in CompactNumberFormat.java)
>>>>
>>>> line 807: expandAffix() seems to treat localizable special pattern 
>>>> characters, but currently the implementation only cares for the 
>>>> minus sign. Should other localizable pattern chars be taken care 
>>>> of, such as percent sign?
>>>>
>>>> line 869, 888: Define what -1 means as a ret value.
>>>>
>>>> line 897: iterMultiplier be better all capitalized as it is a 
>>>> constant. And it could be statically defined in the class to be 
>>>> shared with other locations that use "10" for arithmetic operation.
>>>>
>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>
>>>> Naoto
>>>>
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>> Hi Nishit,
>>>>>>
>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>> Hi Naoto,
>>>>>>>
>>>>>>> Please check my comments inline.
>>>>>>>
>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>> Hi Nishit,
>>>>>>>>
>>>>>>>> Here are my comments:
>>>>>>>>
>>>>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and use 
>>>>>>>> Array equivalent instead?
>>>>>>> Since the CNF design does not put any limit on the size of 
>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using 
>>>>>>> SAX parser, it becomes difficult to identify the size of array 
>>>>>>> when the parent element of compact pattern is encountered, so I 
>>>>>>> think it is better to keep the List<String> while extracting the 
>>>>>>> resources.
>>>>>>
>>>>>> OK. However I'd not keep the List<String> format on generating 
>>>>>> the resource bundle, as there is no reason to introduce yet 
>>>>>> another bundle format other than the existing array of String.
>>>>>>
>>>>>>>>
>>>>>>>> - CompactNumberFormat.java
>>>>>>>>
>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>> OK
>>>>>>>>
>>>>>>>> line 268: The link points to 
>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>
>>>>>>>> line 855: no need to do toString(). length() can detect whether 
>>>>>>>> it's empty or not.
>>>>>>>>
>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>>>>> specializing in the "given number" into either long or biginteger.
>>>>>>> OK
>>>>>>>>
>>>>>>>> line 1500: subparseNumber() pretty much shares the same code 
>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>> The existing CNF.subParseNumber differs in the way 
>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse() 
>>>>>>> behaviour is unpredictable with parseIntegeronly = true when 
>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>
>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber() should 
>>>>>>> *not *parse the exponential notation e.g. while parsing 
>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05, 
>>>>>>> because 'E' should be considered as unparseable character for 
>>>>>>> general number format pattern or compact number pattern, but 
>>>>>>> this is not the case with DecimalFormat.parse(). The below 
>>>>>>> DecimalFormat general number format instance
>>>>>>>
>>>>>>> NumberFormat nf =  NumberFormat.getNumberInstance();
>>>>>>> nf.parse("1.05E4")
>>>>>>>
>>>>>>> Successfully parse the string and returns 10500. The same 
>>>>>>> behaviour is there with other DecimalFormat instances also e.g. 
>>>>>>> currency instance.
>>>>>>>
>>>>>>> Do you think this is an issue with DecimalFormat.parse() and CNF 
>>>>>>> should avoid parsing exponential numbers? Or, should CNF.parse() 
>>>>>>> be modified to be consistent with DecimalFormat.parse() in this 
>>>>>>> aspect?
>>>>>>
>>>>>> No, I understand there are differences. But I see a lot of 
>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply 
>>>>>>>> calls super. No need to override them.
>>>>>>> Since setters are overridden, I think that it is better to 
>>>>>>> override getters also (even if they are just calling super and 
>>>>>>> have same javadoc) to keep them at same level. But, if you see 
>>>>>>> no point in keeping them in CNF, I will remove them. Does that 
>>>>>>> need CSR change?
>>>>>>
>>>>>> I don't see any point for override. I don't think there needs a 
>>>>>> CSR, but better ask Joe about it.
>>>>>>
>>>>>>>>
>>>>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>>>>> ClassCastException may be thrown.
>>>>>>> The type is checked in the superclass equals method getClass() 
>>>>>>> != obj.getClass(), so I think there is no need to check the type 
>>>>>>> here.
>>>>>>
>>>>>> OK.
>>>>>>
>>>>>> Naoto
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nishit Jain
>>>>>>>>
>>>>>>>> Naoto
>>>>>>>>
>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review this non trivial feature addition to 
>>>>>>>>> NumberFormat API.
>>>>>>>>>
>>>>>>>>> The existing NumberFormat API provides locale based support 
>>>>>>>>> for formatting and parsing numbers which includes formatting 
>>>>>>>>> decimal, percent, currency etc, but the support for formatting 
>>>>>>>>> a number into a human readable or compact form is missing. 
>>>>>>>>> This RFE adds that feature to format a decimal number in a 
>>>>>>>>> compact format (e.g. 1000 -> 1K, 1000000 -> 1M in en_US 
>>>>>>>>> locale) , which is useful for the environment where display 
>>>>>>>>> space is limited, so that the formatted string can be 
>>>>>>>>> displayed in that limited space. It is defined by LDML's 
>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>
>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>> Webrev: 
>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>
>>>>>>>>> Request to please help review the the change.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nishit Jain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>



More information about the i18n-dev mailing list