<i18n dev> RFR 8177552: Compact Number Formatting support
Nishit Jain
nishit.jain at oracle.com
Wed Nov 28 11:46:42 UTC 2018
Thanks Naoto,
Updated.
http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.04/
Regards,
Nishit Jain
On 27-11-2018 20:55, naoto.sato at oracle.com wrote:
> Hi Nishit,
>
> On 11/26/18 11:11 PM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
>>> Hi Nishit,
>>>
>>> On 11/26/18 12:41 AM, Nishit Jain wrote:
>>>> Hi Naoto,
>>>>
>>>> To add to my previous mail comment, the DecimalFormat spec also
>>>> says that
>>>>
>>>> /*"DecimalFormat can be instructed to format and parse scientific
>>>> notation only via a pattern; there is currently no factory method
>>>> that creates a scientific notation format. In a pattern, the
>>>> exponent character immediately followed by one or more digit
>>>> characters indicates scientific notation. "
>>>>
>>>> */That is, exponent formatting and parsing is instructed only via a
>>>> scientific notation pattern and I think should not be there with
>>>> *general number* formatting.
>>>
>>> I am not sure the quoted sentence should be interpreted that way. My
>>> understanding is that the section means there is no public
>>> NumberFormat.getScientificInstance() method (cf. line 601 at
>>> NumberFormat.java), so that users will have to use 'E' in their
>>> pattern string.
>>>
>>> Anyway, my point is that if you prefer to treat the scientific
>>> notation differently between DecimalFormat and CompactDecimalFormat,
>>> then it will need to be clarified in the spec. Personally I agree
>>> that it is not practical to interpret E in the CNF.
>> OK. If it is better to specify the parsing behavior w.r.t. the
>> exponential numbers, I have added a statement in the parse() API.
>>
>> */"CompactNumberFormat parse does not allow parsing exponential
>> number strings. For example, parsing a string "1.05E4K" in US locale
>> breaks at character 'E' and returns 1.05."/*
>
> That's better. I'd replace "exponential number strings" with
> "scientific notations." You may want to check the final wording with
> English natives.
>
>>
>> Updated the webrev
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
>>
>> Will also update the CSR and refinalize it.
>
> Thanks.
>
> Naoto
>
>>
>> Regards,
>> Nishit Jain
>>>
>>> Naoto
>>>
>>>>
>>>> Updated webrev based on the other comments
>>>>
>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>>>
>>>> > Some more comments (all in CompactNumberFormat.java)
>>>>
>>>> > line 807: expandAffix() seems to treat localizable special
>>>> pattern characters, but currently the implementation only cares for
>>>> the minus sign. Should other localizable pattern chars be taken
>>>> care of, such as percent sign?
>>>> - Other special characters like '%' percent sign are not allowed as
>>>> per CNF compact pattern spec
>>>>
>>>> > line 869, 888: Define what -1 means as a ret value.
>>>> - OK.
>>>>
>>>> > line 897: iterMultiplier be better all capitalized as it is a
>>>> constant. And it could be statically defined in the class to be
>>>> shared with other locations that use "10" for arithmetic operation.
>>>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>>>
>>>> > line 1531: Any possibility this could lead to divide-by-zero?
>>>> - None which I am aware of, unless you are pointing to the issue
>>>> like JDK-8211161, which we know is not an issue.
>>>>
>>>> Regards,
>>>> Nishit Jain
>>>> On 23-11-2018 15:55, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> > I think DecimalFormat and CNF should behave the same, ie. 'E'
>>>>> should be treated as the exponent without a quote.
>>>>>
>>>>> Personally I don't think that the exponential parsing should be
>>>>> supported by CompactNumberFormat, because the objective of compact
>>>>> numbers is to represent numbers in short form. So, parsing of
>>>>> number format like "1.05E4K" should not be expected from
>>>>> CompactNumberFormat, I am even doubtful that such forms
>>>>> ("1.05E4K") are used anywhere where exponential and compact form
>>>>> are together used. If formatting and parsing of exponential
>>>>> numbers are needed it should be done by DecimalFormat scientific
>>>>> instance *not *with the general number instance.So, I don't think
>>>>> that we should allow parsing of exponential numbers.Comments welcome.
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>>>> Hi Nishit,
>>>>>>
>>>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>>>> Hi Naoto,
>>>>>>>
>>>>>>> Updated the webrev based on suggestions
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>>>
>>>>>>> Changes made:
>>>>>>> - Replaced List<String> with String[] to be added to the the
>>>>>>> resource bundles
>>>>>>
>>>>>> Good.
>>>>>>
>>>>>>> - refactored DecimalFormat.subparse() to be used by the
>>>>>>> CNF.parse(), to reduce code duplication.
>>>>>>
>>>>>> I presume CNF is calling package-private methods in DF to share
>>>>>> the same code. Some comments noting the sharing would be helpful.
>>>>>>
>>>>>>> - Also updated it with other changes as suggested in the comments
>>>>>>
>>>>>> Sorry I missed your question the last time:
>>>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>>> >>> should avoid parsing exponential numbers? Or, should
>>>>>> CNF.parse() be
>>>>>> >>> modified to be consistent with DecimalFormat.parse() in this
>>>>>> aspect?
>>>>>>
>>>>>> I think DecimalFormat and CNF should behave the same, ie. 'E'
>>>>>> should be treated as the exponent without a quote.
>>>>>>
>>>>>> Some more comments (all in CompactNumberFormat.java)
>>>>>>
>>>>>> line 807: expandAffix() seems to treat localizable special
>>>>>> pattern characters, but currently the implementation only cares
>>>>>> for the minus sign. Should other localizable pattern chars be
>>>>>> taken care of, such as percent sign?
>>>>>>
>>>>>> line 869, 888: Define what -1 means as a ret value.
>>>>>>
>>>>>> line 897: iterMultiplier be better all capitalized as it is a
>>>>>> constant. And it could be statically defined in the class to be
>>>>>> shared with other locations that use "10" for arithmetic operation.
>>>>>>
>>>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>>>
>>>>>> Naoto
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nishit Jain
>>>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>>>> Hi Nishit,
>>>>>>>>
>>>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>>>> Hi Naoto,
>>>>>>>>>
>>>>>>>>> Please check my comments inline.
>>>>>>>>>
>>>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>>>> Hi Nishit,
>>>>>>>>>>
>>>>>>>>>> Here are my comments:
>>>>>>>>>>
>>>>>>>>>> - CLDRConverter: As the compact pattern no more employs
>>>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and
>>>>>>>>>> use Array equivalent instead?
>>>>>>>>> Since the CNF design does not put any limit on the size of
>>>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using
>>>>>>>>> SAX parser, it becomes difficult to identify the size of array
>>>>>>>>> when the parent element of compact pattern is encountered, so
>>>>>>>>> I think it is better to keep the List<String> while extracting
>>>>>>>>> the resources.
>>>>>>>>
>>>>>>>> OK. However I'd not keep the List<String> format on generating
>>>>>>>> the resource bundle, as there is no reason to introduce yet
>>>>>>>> another bundle format other than the existing array of String.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> - CompactNumberFormat.java
>>>>>>>>>>
>>>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>>>> OK
>>>>>>>>>>
>>>>>>>>>> line 268: The link points to
>>>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>>>
>>>>>>>>>> line 855: no need to do toString(). length() can detect
>>>>>>>>>> whether it's empty or not.
>>>>>>>>>>
>>>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer
>>>>>>>>>> specializing in the "given number" into either long or
>>>>>>>>>> biginteger.
>>>>>>>>> OK
>>>>>>>>>>
>>>>>>>>>> line 1500: subparseNumber() pretty much shares the same code
>>>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>>>> The existing CNF.subParseNumber differs in the way
>>>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse()
>>>>>>>>> behaviour is unpredictable with parseIntegeronly = true when
>>>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>>>
>>>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber()
>>>>>>>>> should *not *parse the exponential notation e.g. while parsing
>>>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05,
>>>>>>>>> because 'E' should be considered as unparseable character for
>>>>>>>>> general number format pattern or compact number pattern, but
>>>>>>>>> this is not the case with DecimalFormat.parse(). The below
>>>>>>>>> DecimalFormat general number format instance
>>>>>>>>>
>>>>>>>>> NumberFormat nf = NumberFormat.getNumberInstance();
>>>>>>>>> nf.parse("1.05E4")
>>>>>>>>>
>>>>>>>>> Successfully parse the string and returns 10500. The same
>>>>>>>>> behaviour is there with other DecimalFormat instances also
>>>>>>>>> e.g. currency instance.
>>>>>>>>>
>>>>>>>>> Do you think this is an issue with DecimalFormat.parse() and
>>>>>>>>> CNF should avoid parsing exponential numbers? Or, should
>>>>>>>>> CNF.parse() be modified to be consistent with
>>>>>>>>> DecimalFormat.parse() in this aspect?
>>>>>>>>
>>>>>>>> No, I understand there are differences. But I see a lot of
>>>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply
>>>>>>>>>> calls super. No need to override them.
>>>>>>>>> Since setters are overridden, I think that it is better to
>>>>>>>>> override getters also (even if they are just calling super and
>>>>>>>>> have same javadoc) to keep them at same level. But, if you see
>>>>>>>>> no point in keeping them in CNF, I will remove them. Does that
>>>>>>>>> need CSR change?
>>>>>>>>
>>>>>>>> I don't see any point for override. I don't think there needs a
>>>>>>>> CSR, but better ask Joe about it.
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> line 2231: You need to test the type before cast. Otherwise
>>>>>>>>>> ClassCastException may be thrown.
>>>>>>>>> The type is checked in the superclass equals method getClass()
>>>>>>>>> != obj.getClass(), so I think there is no need to check the
>>>>>>>>> type here.
>>>>>>>>
>>>>>>>> OK.
>>>>>>>>
>>>>>>>> Naoto
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nishit Jain
>>>>>>>>>>
>>>>>>>>>> Naoto
>>>>>>>>>>
>>>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Please review this non trivial feature addition to
>>>>>>>>>>> NumberFormat API.
>>>>>>>>>>>
>>>>>>>>>>> The existing NumberFormat API provides locale based support
>>>>>>>>>>> for formatting and parsing numbers which includes formatting
>>>>>>>>>>> decimal, percent, currency etc, but the support for
>>>>>>>>>>> formatting a number into a human readable or compact form is
>>>>>>>>>>> missing. This RFE adds that feature to format a decimal
>>>>>>>>>>> number in a compact format (e.g. 1000 -> 1K, 1000000 -> 1M
>>>>>>>>>>> in en_US locale) , which is useful for the environment where
>>>>>>>>>>> display space is limited, so that the formatted string can
>>>>>>>>>>> be displayed in that limited space. It is defined by LDML's
>>>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>>>
>>>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>>>> Webrev:
>>>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>>>
>>>>>>>>>>> Request to please help review the the change.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Nishit Jain
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>
More information about the core-libs-dev
mailing list