<i18n dev> RFR 8177552: Compact Number Formatting support
Nishit Jain
nishit.jain at oracle.com
Tue Nov 27 07:11:55 UTC 2018
Hi Naoto,
On 26-11-2018 21:01, naoto.sato at oracle.com wrote:
> Hi Nishit,
>
> On 11/26/18 12:41 AM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> To add to my previous mail comment, the DecimalFormat spec also says
>> that
>>
>> /*"DecimalFormat can be instructed to format and parse scientific
>> notation only via a pattern; there is currently no factory method
>> that creates a scientific notation format. In a pattern, the exponent
>> character immediately followed by one or more digit characters
>> indicates scientific notation. "
>>
>> */That is, exponent formatting and parsing is instructed only via a
>> scientific notation pattern and I think should not be there with
>> *general number* formatting.
>
> I am not sure the quoted sentence should be interpreted that way. My
> understanding is that the section means there is no public
> NumberFormat.getScientificInstance() method (cf. line 601 at
> NumberFormat.java), so that users will have to use 'E' in their
> pattern string.
>
> Anyway, my point is that if you prefer to treat the scientific
> notation differently between DecimalFormat and CompactDecimalFormat,
> then it will need to be clarified in the spec. Personally I agree that
> it is not practical to interpret E in the CNF.
OK. If it is better to specify the parsing behavior w.r.t. the
exponential numbers, I have added a statement in the parse() API.
*/"CompactNumberFormat parse does not allow parsing exponential number
strings. For example, parsing a string "1.05E4K" in US locale breaks at
character 'E' and returns 1.05."/*
Updated the webrev
http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.03/
Will also update the CSR and refinalize it.
Regards,
Nishit Jain
>
> Naoto
>
>>
>> Updated webrev based on the other comments
>>
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/
>>
>> > Some more comments (all in CompactNumberFormat.java)
>>
>> > line 807: expandAffix() seems to treat localizable special pattern
>> characters, but currently the implementation only cares for the minus
>> sign. Should other localizable pattern chars be taken care of, such
>> as percent sign?
>> - Other special characters like '%' percent sign are not allowed as
>> per CNF compact pattern spec
>>
>> > line 869, 888: Define what -1 means as a ret value.
>> - OK.
>>
>> > line 897: iterMultiplier be better all capitalized as it is a
>> constant. And it could be statically defined in the class to be
>> shared with other locations that use "10" for arithmetic operation.
>> - OK, made it static final and renamed it as RANGE_MULTIPLIER
>>
>> > line 1531: Any possibility this could lead to divide-by-zero?
>> - None which I am aware of, unless you are pointing to the issue like
>> JDK-8211161, which we know is not an issue.
>>
>> Regards,
>> Nishit Jain
>> On 23-11-2018 15:55, Nishit Jain wrote:
>>> Hi Naoto,
>>>
>>> > I think DecimalFormat and CNF should behave the same, ie. 'E'
>>> should be treated as the exponent without a quote.
>>>
>>> Personally I don't think that the exponential parsing should be
>>> supported by CompactNumberFormat, because the objective of compact
>>> numbers is to represent numbers in short form. So, parsing of number
>>> format like "1.05E4K" should not be expected from
>>> CompactNumberFormat, I am even doubtful that such forms ("1.05E4K")
>>> are used anywhere where exponential and compact form are together
>>> used. If formatting and parsing of exponential numbers are needed it
>>> should be done by DecimalFormat scientific instance *not *with the
>>> general number instance.So, I don't think that we should allow
>>> parsing of exponential numbers.Comments welcome.
>>>
>>> Regards,
>>> Nishit Jain
>>> On 22-11-2018 02:02, naoto.sato at oracle.com wrote:
>>>> Hi Nishit,
>>>>
>>>> On 11/21/18 12:53 AM, Nishit Jain wrote:
>>>>> Hi Naoto,
>>>>>
>>>>> Updated the webrev based on suggestions
>>>>>
>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>>>>
>>>>> Changes made:
>>>>> - Replaced List<String> with String[] to be added to the the
>>>>> resource bundles
>>>>
>>>> Good.
>>>>
>>>>> - refactored DecimalFormat.subparse() to be used by the
>>>>> CNF.parse(), to reduce code duplication.
>>>>
>>>> I presume CNF is calling package-private methods in DF to share the
>>>> same code. Some comments noting the sharing would be helpful.
>>>>
>>>>> - Also updated it with other changes as suggested in the comments
>>>>
>>>> Sorry I missed your question the last time:
>>>> >>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>> >>> should avoid parsing exponential numbers? Or, should
>>>> CNF.parse() be
>>>> >>> modified to be consistent with DecimalFormat.parse() in this
>>>> aspect?
>>>>
>>>> I think DecimalFormat and CNF should behave the same, ie. 'E'
>>>> should be treated as the exponent without a quote.
>>>>
>>>> Some more comments (all in CompactNumberFormat.java)
>>>>
>>>> line 807: expandAffix() seems to treat localizable special pattern
>>>> characters, but currently the implementation only cares for the
>>>> minus sign. Should other localizable pattern chars be taken care
>>>> of, such as percent sign?
>>>>
>>>> line 869, 888: Define what -1 means as a ret value.
>>>>
>>>> line 897: iterMultiplier be better all capitalized as it is a
>>>> constant. And it could be statically defined in the class to be
>>>> shared with other locations that use "10" for arithmetic operation.
>>>>
>>>> line 1531: Any possibility this could lead to divide-by-zero?
>>>>
>>>> Naoto
>>>>
>>>>>
>>>>> Regards,
>>>>> Nishit Jain
>>>>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>>>>> Hi Nishit,
>>>>>>
>>>>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>>>>> Hi Naoto,
>>>>>>>
>>>>>>> Please check my comments inline.
>>>>>>>
>>>>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>>>>> Hi Nishit,
>>>>>>>>
>>>>>>>> Here are my comments:
>>>>>>>>
>>>>>>>> - CLDRConverter: As the compact pattern no more employs
>>>>>>>> List<String>, can we eliminate stringListEntry/Element, and use
>>>>>>>> Array equivalent instead?
>>>>>>> Since the CNF design does not put any limit on the size of
>>>>>>> compact pattern, so at the time of parsing the CLDR xmls using
>>>>>>> SAX parser, it becomes difficult to identify the size of array
>>>>>>> when the parent element of compact pattern is encountered, so I
>>>>>>> think it is better to keep the List<String> while extracting the
>>>>>>> resources.
>>>>>>
>>>>>> OK. However I'd not keep the List<String> format on generating
>>>>>> the resource bundle, as there is no reason to introduce yet
>>>>>> another bundle format other than the existing array of String.
>>>>>>
>>>>>>>>
>>>>>>>> - CompactNumberFormat.java
>>>>>>>>
>>>>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>>>>> OK
>>>>>>>>
>>>>>>>> line 268: The link points to
>>>>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>>>>> OK. Changed it at line 165 also.
>>>>>>>>
>>>>>>>> line 855: no need to do toString(). length() can detect whether
>>>>>>>> it's empty or not.
>>>>>>>>
>>>>>>>> line 884: "Overloaded method" reads odd here. I'd prefer
>>>>>>>> specializing in the "given number" into either long or biginteger.
>>>>>>> OK
>>>>>>>>
>>>>>>>> line 1500: subparseNumber() pretty much shares the same code
>>>>>>>> with DecimalFormat.subparse(). can they be merged?
>>>>>>> The existing CNF.subParseNumber differs in the way
>>>>>>> parseIntegerOnly is handled, DecimalFormat.parse()/subparse()
>>>>>>> behaviour is unpredictable with parseIntegeronly = true when
>>>>>>> multipliers are involved (Please see JDK-8199223).
>>>>>>>
>>>>>>> Also, I had thought that the CNF.parse()/subparseNumber() should
>>>>>>> *not *parse the exponential notation e.g. while parsing
>>>>>>> "1.05E4K" the parsing should break at 'E' and returns 1.05,
>>>>>>> because 'E' should be considered as unparseable character for
>>>>>>> general number format pattern or compact number pattern, but
>>>>>>> this is not the case with DecimalFormat.parse(). The below
>>>>>>> DecimalFormat general number format instance
>>>>>>>
>>>>>>> NumberFormat nf = NumberFormat.getNumberInstance();
>>>>>>> nf.parse("1.05E4")
>>>>>>>
>>>>>>> Successfully parse the string and returns 10500. The same
>>>>>>> behaviour is there with other DecimalFormat instances also e.g.
>>>>>>> currency instance.
>>>>>>>
>>>>>>> Do you think this is an issue with DecimalFormat.parse() and CNF
>>>>>>> should avoid parsing exponential numbers? Or, should CNF.parse()
>>>>>>> be modified to be consistent with DecimalFormat.parse() in this
>>>>>>> aspect?
>>>>>>
>>>>>> No, I understand there are differences. But I see a lot of
>>>>>> duplicated piece of code which I would like to eliminate.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply
>>>>>>>> calls super. No need to override them.
>>>>>>> Since setters are overridden, I think that it is better to
>>>>>>> override getters also (even if they are just calling super and
>>>>>>> have same javadoc) to keep them at same level. But, if you see
>>>>>>> no point in keeping them in CNF, I will remove them. Does that
>>>>>>> need CSR change?
>>>>>>
>>>>>> I don't see any point for override. I don't think there needs a
>>>>>> CSR, but better ask Joe about it.
>>>>>>
>>>>>>>>
>>>>>>>> line 2231: You need to test the type before cast. Otherwise
>>>>>>>> ClassCastException may be thrown.
>>>>>>> The type is checked in the superclass equals method getClass()
>>>>>>> != obj.getClass(), so I think there is no need to check the type
>>>>>>> here.
>>>>>>
>>>>>> OK.
>>>>>>
>>>>>> Naoto
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nishit Jain
>>>>>>>>
>>>>>>>> Naoto
>>>>>>>>
>>>>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review this non trivial feature addition to
>>>>>>>>> NumberFormat API.
>>>>>>>>>
>>>>>>>>> The existing NumberFormat API provides locale based support
>>>>>>>>> for formatting and parsing numbers which includes formatting
>>>>>>>>> decimal, percent, currency etc, but the support for formatting
>>>>>>>>> a number into a human readable or compact form is missing.
>>>>>>>>> This RFE adds that feature to format a decimal number in a
>>>>>>>>> compact format (e.g. 1000 -> 1K, 1000000 -> 1M in en_US
>>>>>>>>> locale) , which is useful for the environment where display
>>>>>>>>> space is limited, so that the formatted string can be
>>>>>>>>> displayed in that limited space. It is defined by LDML's
>>>>>>>>> specification for Compact Number Formats.
>>>>>>>>>
>>>>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>>>>
>>>>>>>>> Request to please help review the the change.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nishit Jain
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>
More information about the core-libs-dev
mailing list