<i18n dev> RFR 8177552: Compact Number Formatting support
naoto.sato at oracle.com
naoto.sato at oracle.com
Wed Nov 28 14:02:21 UTC 2018
Looks good to me.
Naoto
On 11/28/18 3:46 AM, Nishit Jain wrote:
> Thanks Naoto,
>
> Updated.
>
> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.04/
>
> Regards,
> Nishit Jain
> On 27-11-2018 20:55, naoto.sato at oracle.com wrote:
>> Hi Nishit,
>>
>> On 11/26/18 11:11 PM, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
>>>> Hi Nishit,
>>>>
>>>> On 11/26/18 12:41 AM, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> To add to my previous mail comment, the DecimalFormat spec also
>>>>> says that
>>>>>
>>>>> /*"DecimalFormat can be instructed to format and parse scientific
>>>>> notation only via a pattern; there is currently no factory method
>>>>> that creates a scientific notation format. In a pattern, the
>>>>> exponent character immediately followed by one or more digit
>>>>> characters indicates scientific notation. "
>>>>>
>>>>> */That is, exponent formatting and parsing is instructed only via a
>>>>> scientific notation pattern and I think should not be there with
>>>>> *general number* formatting.
>>>>
>>>> I am not sure the quoted sentence should be interpreted that way. My
>>>> understanding is that the section means there is no public
>>>> NumberFormat.getScientificInstance() method (cf. line 601 at
>>>> NumberFormat.java), so that users will have to use 'E' in their
>>>> pattern string.
>>>>
>>>> Anyway, my point is that if you prefer to treat the scientific
>>>> notation differently between DecimalFormat and CompactDecimalFormat,
>>>> then it will need to be clarified in the spec. Personally I agree
>>>> that it is not practical to interpret E in the CNF.
>>> OK. If it is better to specify the parsing behavior w.r.t. the
>>> exponential numbers, I have added a statement in the parse() API.
>>>
>>> */"CompactNumberFormat parse does not allow parsing exponential
>>> number strings. For example, parsing a string "1.05E4K" in US locale
>>> breaks at character 'E' and returns 1.05."/*
>>
>> That's better. I'd replace "exponential number strings" with
>> "scientific notations." You may want to check the final wording with
>> English natives.
>>
>>>
>>> Updated the webrev
>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
>>>
>>> Will also update the CSR and refinalize it.
>>
>> Thanks.
>>
>> Naoto
>>
>>>
>>> Regards,
>>> Nishit Jain
>>>>
>>>> Naoto
>>>>
>>>>>
>>>>> Updated webrev based on the other comments
>>>>>
>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>>>>
>>>>> > Some more comments (all in CompactNumberFormat.java)
>>>>>
>>>>> > line 807: expandAffix() seems to treat localizable special
>>>>> pattern characters, but currently the implementation only cares for
>>>>> the minus sign. Should other localizable pattern chars be taken
>>>>> care of, such as percent sign?
>>>>> - Other special characters like '%' percent sign are not allowed as
>>>>> per CNF compact pattern spec
>>>>>
>>>>> > line 869, 888: Define what -1 means as a ret value.
>>>>> - OK.
>>>>>
>>>>> > line 897: iterMultiplier be better all capitalized as it is a
>>>>> constant. And it could be statically defined in the class to be
>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>>>>
>>>>> > line 1531: Any possibility this could lead to divide-by-zero?
>>>>> - None which I am aware of, unless you are pointing to the issue
>>>>> like JDK-8211161, which we know is not an issue.
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 23-11-2018 15:55, Nishit Jain wrote:
>>>>>> Hi Naoto,
>>>>>>
>>>>>> > I think DecimalFormat and CNF should behave the same, ie. 'E'
>>>>>> should be treated as the exponent without a quote.
>>>>>>
>>>>>> Personally I don't think that the exponential parsing should be
>>>>>> supported by CompactNumberFormat, because the objective of compact
>>>>>> numbers is to represent numbers in short form. So, parsing of
>>>>>> number format like "1.05E4K" should not be expected from
>>>>>> CompactNumberFormat, I am even doubtful that such forms
>>>>>> ("1.05E4K") are used anywhere where exponential and compact form
>>>>>> are together used. If formatting and parsing of exponential
>>>>>> numbers are needed it should be done by DecimalFormat scientific
>>>>>> instance *not *with the general number instance.So, I don't think
>>>>>> that we should allow parsing of exponential numbers.Comments welcome.
>>>>>>
>>>>>> Regards,
>>>>>> Nishit Jain
>>>>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>>>>> Hi Nishit,
>>>>>>>
>>>>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>>>>> Hi Naoto,
>>>>>>>>
>>>>>>>> Updated the webrev based on suggestions
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>>>>
>>>>>>>> Changes made:
>>>>>>>> - Replaced List<String> with String[] to be added to the the
>>>>>>>> resource bundles
>>>>>>>
>>>>>>> Good.
>>>>>>>
>>>>>>>> - refactored DecimalFormat.subparse() to be used by the
>>>>>>>> CNF.parse(), to reduce code duplication.
>>>>>>>
>>>>>>> I presume CNF is calling package-private methods in DF to share
>>>>>>> the same code. Some comments noting the sharing would be helpful.
>>>>>>>
>>>>>>>> - Also updated it with other changes as suggested in the comments
>>>>>>>
>>>>>>> Sorry I missed your question the last time:
>>>>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>>>> >>> should avoid parsing exponential numbers? Or, should
>>>>>>> CNF.parse() be
>>>>>>> >>> modified to be consistent with DecimalFormat.parse() in this
>>>>>>> aspect?
>>>>>>>
>>>>>>> I think DecimalFormat and CNF should behave the same, ie. 'E'
>>>>>>> should be treated as the exponent without a quote.
>>>>>>>
>>>>>>> Some more comments (all in CompactNumberFormat.java)
>>>>>>>
>>>>>>> line 807: expandAffix() seems to treat localizable special
>>>>>>> pattern characters, but currently the implementation only cares
>>>>>>> for the minus sign. Should other localizable pattern chars be
>>>>>>> taken care of, such as percent sign?
>>>>>>>
>>>>>>> line 869, 888: Define what -1 means as a ret value.
>>>>>>>
>>>>>>> line 897: iterMultiplier be better all capitalized as it is a
>>>>>>> constant. And it could be statically defined in the class to be
>>>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>>>>
>>>>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>>>>
>>>>>>> Naoto
>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Nishit Jain
>>>>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>>>>> Hi Nishit,
>>>>>>>>>
>>>>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>>>>> Hi Naoto,
>>>>>>>>>>
>>>>>>>>>> Please check my comments inline.
>>>>>>>>>>
>>>>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>>>>> Hi Nishit,
>>>>>>>>>>>
>>>>>>>>>>> Here are my comments:
>>>>>>>>>>>
>>>>>>>>>>> - CLDRConverter: As the compact pattern no more employs
>>>>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and
>>>>>>>>>>> use Array equivalent instead?
>>>>>>>>>> Since the CNF design does not put any limit on the size of
>>>>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using
>>>>>>>>>> SAX parser, it becomes difficult to identify the size of array
>>>>>>>>>> when the parent element of compact pattern is encountered, so
>>>>>>>>>> I think it is better to keep the List<String> while extracting
>>>>>>>>>> the resources.
>>>>>>>>>
>>>>>>>>> OK. However I'd not keep the List<String> format on generating
>>>>>>>>> the resource bundle, as there is no reason to introduce yet
>>>>>>>>> another bundle format other than the existing array of String.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> - CompactNumberFormat.java
>>>>>>>>>>>
>>>>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>>>>> OK
>>>>>>>>>>>
>>>>>>>>>>> line 268: The link points to
>>>>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>>>>
>>>>>>>>>>> line 855: no need to do toString(). length() can detect
>>>>>>>>>>> whether it's empty or not.
>>>>>>>>>>>
>>>>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer
>>>>>>>>>>> specializing in the "given number" into either long or
>>>>>>>>>>> biginteger.
>>>>>>>>>> OK
>>>>>>>>>>>
>>>>>>>>>>> line 1500: subparseNumber() pretty much shares the same code
>>>>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>>>>> The existing CNF.subParseNumber differs in the way
>>>>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse()
>>>>>>>>>> behaviour is unpredictable with parseIntegeronly = true when
>>>>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>>>>
>>>>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber()
>>>>>>>>>> should *not *parse the exponential notation e.g. while parsing
>>>>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05,
>>>>>>>>>> because 'E' should be considered as unparseable character for
>>>>>>>>>> general number format pattern or compact number pattern, but
>>>>>>>>>> this is not the case with DecimalFormat.parse(). The below
>>>>>>>>>> DecimalFormat general number format instance
>>>>>>>>>>
>>>>>>>>>> NumberFormat nf = NumberFormat.getNumberInstance();
>>>>>>>>>> nf.parse("1.05E4")
>>>>>>>>>>
>>>>>>>>>> Successfully parse the string and returns 10500. The same
>>>>>>>>>> behaviour is there with other DecimalFormat instances also
>>>>>>>>>> e.g. currency instance.
>>>>>>>>>>
>>>>>>>>>> Do you think this is an issue with DecimalFormat.parse() and
>>>>>>>>>> CNF should avoid parsing exponential numbers? Or, should
>>>>>>>>>> CNF.parse() be modified to be consistent with
>>>>>>>>>> DecimalFormat.parse() in this aspect?
>>>>>>>>>
>>>>>>>>> No, I understand there are differences. But I see a lot of
>>>>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply
>>>>>>>>>>> calls super. No need to override them.
>>>>>>>>>> Since setters are overridden, I think that it is better to
>>>>>>>>>> override getters also (even if they are just calling super and
>>>>>>>>>> have same javadoc) to keep them at same level. But, if you see
>>>>>>>>>> no point in keeping them in CNF, I will remove them. Does that
>>>>>>>>>> need CSR change?
>>>>>>>>>
>>>>>>>>> I don't see any point for override. I don't think there needs a
>>>>>>>>> CSR, but better ask Joe about it.
>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> line 2231: You need to test the type before cast. Otherwise
>>>>>>>>>>> ClassCastException may be thrown.
>>>>>>>>>> The type is checked in the superclass equals method getClass()
>>>>>>>>>> != obj.getClass(), so I think there is no need to check the
>>>>>>>>>> type here.
>>>>>>>>>
>>>>>>>>> OK.
>>>>>>>>>
>>>>>>>>> Naoto
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Nishit Jain
>>>>>>>>>>>
>>>>>>>>>>> Naoto
>>>>>>>>>>>
>>>>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> Please review this non trivial feature addition to
>>>>>>>>>>>> NumberFormat API.
>>>>>>>>>>>>
>>>>>>>>>>>> The existing NumberFormat API provides locale based support
>>>>>>>>>>>> for formatting and parsing numbers which includes formatting
>>>>>>>>>>>> decimal, percent, currency etc, but the support for
>>>>>>>>>>>> formatting a number into a human readable or compact form is
>>>>>>>>>>>> missing. This RFE adds that feature to format a decimal
>>>>>>>>>>>> number in a compact format (e.g. 1000 -> 1K, 1000000 -> 1M
>>>>>>>>>>>> in en_US locale) , which is useful for the environment where
>>>>>>>>>>>> display space is limited, so that the formatted string can
>>>>>>>>>>>> be displayed in that limited space. It is defined by LDML's
>>>>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>>>>
>>>>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>>>>> Webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>>>>
>>>>>>>>>>>> Request to please help review the the change.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Nishit Jain
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>
More information about the core-libs-dev
mailing list