<i18n dev> RFR 8177552: Compact Number Formatting support

naoto.sato at oracle.com naoto.sato at oracle.com
Tue Nov 27 15:25:19 UTC 2018


Hi Nishit,

On 11/26/18 11:11 PM, Nishit Jain wrote:
> Hi Naoto,
> 
> On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
>> Hi Nishit,
>>
>> On 11/26/18 12:41 AM, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> To add to my previous mail comment, the DecimalFormat spec also says 
>>> that
>>>
>>> /*"DecimalFormat can be instructed to format and parse scientific 
>>> notation only via a pattern; there is currently no factory method 
>>> that creates a scientific notation format. In a pattern, the exponent 
>>> character immediately followed by one or more digit characters 
>>> indicates scientific notation. "
>>>
>>> */That is, exponent formatting and parsing is instructed only via a 
>>> scientific notation pattern and I think should not be there with 
>>> *general number* formatting.
>>
>> I am not sure the quoted sentence should be interpreted that way. My 
>> understanding is that the section means there is no public 
>> NumberFormat.getScientificInstance() method (cf. line 601 at 
>> NumberFormat.java), so that users will have to use 'E' in their 
>> pattern string.
>>
>> Anyway, my point is that if you prefer to treat the scientific 
>> notation differently between DecimalFormat and CompactDecimalFormat, 
>> then it will need to be clarified in the spec. Personally I agree that 
>> it is not practical to interpret E in the CNF.
> OK. If it is better to specify the parsing behavior w.r.t. the 
> exponential numbers, I have added a statement in the parse() API.
> 
> */"CompactNumberFormat parse does not allow parsing exponential number 
> strings. For example, parsing a string "1.05E4K" in US locale breaks at 
> character 'E' and returns 1.05."/*

That's better. I'd replace "exponential number strings" with "scientific 
notations." You may want to check the final wording with English natives.

> 
> Updated the webrev
> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
> 
> Will also update the CSR and refinalize it.

Thanks.

Naoto

> 
> Regards,
> Nishit Jain
>>
>> Naoto
>>
>>>
>>> Updated webrev based on the other comments
>>>
>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>>
>>>  > Some more comments (all in CompactNumberFormat.java)
>>>
>>>  > line 807: expandAffix() seems to treat localizable special pattern 
>>> characters, but currently the implementation only cares for the minus 
>>> sign. Should other localizable pattern chars be taken care of, such 
>>> as percent sign?
>>> - Other special characters like '%' percent sign are not allowed as 
>>> per CNF compact pattern spec
>>>
>>>  > line 869, 888: Define what -1 means as a ret value.
>>> - OK.
>>>
>>>  > line 897: iterMultiplier be better all capitalized as it is a 
>>> constant. And it could be statically defined in the class to be 
>>> shared with other locations that use "10" for arithmetic operation.
>>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>>
>>>  > line 1531: Any possibility this could lead to divide-by-zero?
>>> - None which I am aware of, unless you are pointing to the issue like 
>>> JDK-8211161, which we know is not an issue.
>>>
>>> Regards,
>>> Nishit Jain
>>> On 23-11-2018 15:55, Nishit Jain wrote:
>>>> Hi Naoto,
>>>>
>>>> > I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>> should be treated as the exponent without a quote.
>>>>
>>>> Personally I don't think that the exponential parsing should be 
>>>> supported by CompactNumberFormat, because the objective of compact 
>>>> numbers is to represent numbers in short form. So, parsing of number 
>>>> format like "1.05E4K" should not be expected from 
>>>> CompactNumberFormat, I am even doubtful that such forms ("1.05E4K") 
>>>> are used anywhere where exponential and compact form are together 
>>>> used. If formatting and parsing of exponential numbers are needed it 
>>>> should be done by DecimalFormat scientific instance *not *with the 
>>>> general number instance.So, I don't think that we should allow 
>>>> parsing of exponential numbers.Comments welcome.
>>>>
>>>> Regards,
>>>> Nishit Jain
>>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>>> Hi Nishit,
>>>>>
>>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>>> Hi Naoto,
>>>>>>
>>>>>> Updated the webrev based on suggestions
>>>>>>
>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>>
>>>>>> Changes made:
>>>>>> - Replaced List<String> with String[] to be added to the the 
>>>>>> resource bundles
>>>>>
>>>>> Good.
>>>>>
>>>>>> - refactored DecimalFormat.subparse() to be used by the 
>>>>>> CNF.parse(), to reduce code duplication.
>>>>>
>>>>> I presume CNF is calling package-private methods in DF to share the 
>>>>> same code. Some comments noting the sharing would be helpful.
>>>>>
>>>>>> - Also updated it with other changes as suggested in the comments
>>>>>
>>>>> Sorry I missed your question the last time:
>>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>> >>> should avoid parsing exponential numbers? Or, should 
>>>>> CNF.parse() be
>>>>> >>> modified to be consistent with DecimalFormat.parse() in this 
>>>>> aspect?
>>>>>
>>>>> I think DecimalFormat and CNF should behave the same, ie. 'E' 
>>>>> should be treated as the exponent without a quote.
>>>>>
>>>>> Some more comments (all in CompactNumberFormat.java)
>>>>>
>>>>> line 807: expandAffix() seems to treat localizable special pattern 
>>>>> characters, but currently the implementation only cares for the 
>>>>> minus sign. Should other localizable pattern chars be taken care 
>>>>> of, such as percent sign?
>>>>>
>>>>> line 869, 888: Define what -1 means as a ret value.
>>>>>
>>>>> line 897: iterMultiplier be better all capitalized as it is a 
>>>>> constant. And it could be statically defined in the class to be 
>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>>
>>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>>
>>>>> Naoto
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Nishit Jain
>>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>>> Hi Nishit,
>>>>>>>
>>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>>> Hi Naoto,
>>>>>>>>
>>>>>>>> Please check my comments inline.
>>>>>>>>
>>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>>> Hi Nishit,
>>>>>>>>>
>>>>>>>>> Here are my comments:
>>>>>>>>>
>>>>>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and use 
>>>>>>>>> Array equivalent instead?
>>>>>>>> Since the CNF design does not put any limit on the size of 
>>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using 
>>>>>>>> SAX parser, it becomes difficult to identify the size of array 
>>>>>>>> when the parent element of compact pattern is encountered, so I 
>>>>>>>> think it is better to keep the List<String> while extracting the 
>>>>>>>> resources.
>>>>>>>
>>>>>>> OK. However I'd not keep the List<String> format on generating 
>>>>>>> the resource bundle, as there is no reason to introduce yet 
>>>>>>> another bundle format other than the existing array of String.
>>>>>>>
>>>>>>>>>
>>>>>>>>> - CompactNumberFormat.java
>>>>>>>>>
>>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>>> OK
>>>>>>>>>
>>>>>>>>> line 268: The link points to 
>>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>>
>>>>>>>>> line 855: no need to do toString(). length() can detect whether 
>>>>>>>>> it's empty or not.
>>>>>>>>>
>>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>>>>>> specializing in the "given number" into either long or biginteger.
>>>>>>>> OK
>>>>>>>>>
>>>>>>>>> line 1500: subparseNumber() pretty much shares the same code 
>>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>>> The existing CNF.subParseNumber differs in the way 
>>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse() 
>>>>>>>> behaviour is unpredictable with parseIntegeronly = true when 
>>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>>
>>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber() should 
>>>>>>>> *not *parse the exponential notation e.g. while parsing 
>>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05, 
>>>>>>>> because 'E' should be considered as unparseable character for 
>>>>>>>> general number format pattern or compact number pattern, but 
>>>>>>>> this is not the case with DecimalFormat.parse(). The below 
>>>>>>>> DecimalFormat general number format instance
>>>>>>>>
>>>>>>>> NumberFormat nf =  NumberFormat.getNumberInstance();
>>>>>>>> nf.parse("1.05E4")
>>>>>>>>
>>>>>>>> Successfully parse the string and returns 10500. The same 
>>>>>>>> behaviour is there with other DecimalFormat instances also e.g. 
>>>>>>>> currency instance.
>>>>>>>>
>>>>>>>> Do you think this is an issue with DecimalFormat.parse() and CNF 
>>>>>>>> should avoid parsing exponential numbers? Or, should CNF.parse() 
>>>>>>>> be modified to be consistent with DecimalFormat.parse() in this 
>>>>>>>> aspect?
>>>>>>>
>>>>>>> No, I understand there are differences. But I see a lot of 
>>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply 
>>>>>>>>> calls super. No need to override them.
>>>>>>>> Since setters are overridden, I think that it is better to 
>>>>>>>> override getters also (even if they are just calling super and 
>>>>>>>> have same javadoc) to keep them at same level. But, if you see 
>>>>>>>> no point in keeping them in CNF, I will remove them. Does that 
>>>>>>>> need CSR change?
>>>>>>>
>>>>>>> I don't see any point for override. I don't think there needs a 
>>>>>>> CSR, but better ask Joe about it.
>>>>>>>
>>>>>>>>>
>>>>>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>>>>>> ClassCastException may be thrown.
>>>>>>>> The type is checked in the superclass equals method getClass() 
>>>>>>>> != obj.getClass(), so I think there is no need to check the type 
>>>>>>>> here.
>>>>>>>
>>>>>>> OK.
>>>>>>>
>>>>>>> Naoto
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nishit Jain
>>>>>>>>>
>>>>>>>>> Naoto
>>>>>>>>>
>>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Please review this non trivial feature addition to 
>>>>>>>>>> NumberFormat API.
>>>>>>>>>>
>>>>>>>>>> The existing NumberFormat API provides locale based support 
>>>>>>>>>> for formatting and parsing numbers which includes formatting 
>>>>>>>>>> decimal, percent, currency etc, but the support for formatting 
>>>>>>>>>> a number into a human readable or compact form is missing. 
>>>>>>>>>> This RFE adds that feature to format a decimal number in a 
>>>>>>>>>> compact format (e.g. 1000 -> 1K, 1000000 -> 1M in en_US 
>>>>>>>>>> locale) , which is useful for the environment where display 
>>>>>>>>>> space is limited, so that the formatted string can be 
>>>>>>>>>> displayed in that limited space. It is defined by LDML's 
>>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>>
>>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>>> Webrev: 
>>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>>
>>>>>>>>>> Request to please help review the the change.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nishit Jain
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
> 


More information about the i18n-dev mailing list