<i18n dev> RFR 8177552: Compact Number Formatting support

Mon Nov 26 07:56:36 UTC 2018

Hi Roger,

Please find my comments belowand check the updated webrev.

http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.02/

Regards,
Nishit Jain
On 22-11-2018 00:04, Roger Riggs wrote:
> Hi Nishit,
>
> Comments on the tests:
>
> - The tests looks to be quite complete.
>
> - Have the locale specific data been independently verified?
>    Or are they just assumed to be correct based on using CNF to 
> generate the formatted strings?
I have manually verified the data against the compact patterns in the 
resource bundles to the best I can. In some cases, like testing CNF 
rounding behavior "TestCNFRounding", the data is also verified against 
the output produced by DecimalFormat.
>
> - Is there any overlap between the format and parse patterns that can 
> be removed;
>    using the same dataprovider for both format and parse (and an extra 
> provider for unique cases).
Yes, there were patterns overlap in CompactPatternsValidity (now renamed 
as TestCompactPatternsValidity), patterns are taken out and shared 
between format and parse.
>
> - Using TestNG consistently would improve the test suite.
OK. Updated EqualityCheck and SerializationTest (now named as 
TestEquality and TestSerialization) to use TestNG.
>
> - In comments, Capitalize the first word
>
> - The names of the test files should be more consistent, some include 
> Test at the beginning,
>    some at the end and some not at all.  The utility classes 
> (CompactFormatAndParse) name
>    doesn't make it clear it is not a test itself.
Updated.
>
> Serialization Test: should be comparing the fields of the Format 
> instances,
> not only that it formats a value the same.
It also compares the equality (if (!fmt.equals(obj)))  so fields of the 
instances are also matched.
>
> To setup for future revisions, several serialized CNF instances should 
> be hardcoded
> in the test and deserialized to be checked against the current CNF 
> instances.
Added TestDeserializeCNF.java which deserializes the hardcoded instances 
in cnf1.ser.txt and cnf2.ser.txt. In the comments, also added the API 
used to create the hexdump of the serializable instance, please check if 
that is the correct way.
>
> Using testng dataproviders would show a more regular structure.
I do not find the use of data provider to be useful here, as we just 
have some instances which are serialized and deserialized with no 
specific data to test.
>
> CompactFormatAndParse.java:
>  - The method don't need "public" since they are used only in the test.
>  - unused import of BigInteger
OK
>
> EqualityCheck:
>  - Its good form to always have an @run line, even if for default 
> behavior.
Moved it to use TestNG and added corresponding @run line
>
>  - The CNF.equals method includes both symbols and decimal pattern;
>    are there tests for those being the different?
Thanks. Added.
>
> CompactPatternsValidity.java:
>  -60:  Indentation of continued data array values would make it more 
> readable.
OK
>
>  - Is there any overlap between the format and parse patterns that can 
> be removed?
>    Using the same dataprovider for both format and parse (and an extra 
> provider for unique cases).
Yes, modified CompactPatternsValidity.java, as mentioned in the above 
comment
>
> CNFRoundingTest.java:
>  - Can the Rounding mode test methods be consolidated and pass in the 
> desired rounding mode.
>   It would save on some boilerplate.
Yes, updated.

Regards,
Nishit Jain
>
> Thanks, Roger
>
>
> On 11/21/2018 03:53 AM, Nishit Jain wrote:
>> Hi Naoto,
>>
>> Updated the webrev based on suggestions
>>
>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.01/
>>
>> Changes made:
>> - Replaced List<String> with String[] to be added to the the resource 
>> bundles
>> - refactored DecimalFormat.subparse() to be used by the CNF.parse(), 
>> to reduce code duplication.
>> - Also updated it with other changes as suggested in the comments
>>
>> Regards,
>> Nishit Jain
>> On 20-11-2018 00:33, naoto.sato at oracle.com wrote:
>>> Hi Nishit,
>>>
>>> On 11/18/18 10:29 PM, Nishit Jain wrote:
>>>> Hi Naoto,
>>>>
>>>> Please check my comments inline.
>>>>
>>>> On 17-11-2018 04:52, naoto.sato at oracle.com wrote:
>>>>> Hi Nishit,
>>>>>
>>>>> Here are my comments:
>>>>>
>>>>> - CLDRConverter: As the compact pattern no more employs 
>>>>> List<String>, can we eliminate stringListEntry/Element, and use 
>>>>> Array equivalent instead?
>>>> Since the CNF design does not put any limit on the size of compact 
>>>> pattern, so at the time of parsing the CLDR xmls using SAX parser, 
>>>> it becomes difficult to identify the size of array when the parent 
>>>> element of compact pattern is encountered, so I think it is better 
>>>> to keep the List<String> while extracting the resources.
>>>
>>> OK. However I'd not keep the List<String> format on generating the 
>>> resource bundle, as there is no reason to introduce yet another 
>>> bundle format other than the existing array of String.
>>>
>>>>>
>>>>> - CompactNumberFormat.java
>>>>>
>>>>> Multiple locations: Use StringBuilder instead of StringBuffer.
>>>> OK
>>>>>
>>>>> line 268: The link points to 
>>>>> NumberFormat.getNumberInstance(Locale) instead of DecimalFormat
>>>> OK. Changed it at line 165 also.
>>>>>
>>>>> line 855: no need to do toString(). length() can detect whether 
>>>>> it's empty or not.
>>>>>
>>>>> line 884: "Overloaded method" reads odd here. I'd prefer 
>>>>> specializing in the "given number" into either long or biginteger.
>>>> OK
>>>>>
>>>>> line 1500: subparseNumber() pretty much shares the same code with 
>>>>> DecimalFormat.subparse(). can they be merged?
>>>> The existing CNF.subParseNumber differs in the way parseIntegerOnly 
>>>> is handled, DecimalFormat.parse()/subparse() behaviour is 
>>>> unpredictable with parseIntegeronly = true when multipliers are 
>>>> involved (Please see JDK-8199223).
>>>>
>>>> Also, I had thought that the CNF.parse()/subparseNumber() should 
>>>> *not *parse the exponential notation e.g. while parsing "1.05E4K" 
>>>> the parsing should break at 'E' and returns 1.05, because 'E' 
>>>> should be considered as unparseable character for general number 
>>>> format pattern or compact number pattern, but this is not the case 
>>>> with DecimalFormat.parse(). The below DecimalFormat general number 
>>>> format instance
>>>>
>>>> NumberFormat nf =  NumberFormat.getNumberInstance();
>>>> nf.parse("1.05E4")
>>>>
>>>> Successfully parse the string and returns 10500. The same behaviour 
>>>> is there with other DecimalFormat instances also e.g. currency 
>>>> instance.
>>>>
>>>> Do you think this is an issue with DecimalFormat.parse() and CNF 
>>>> should avoid parsing exponential numbers? Or, should CNF.parse() be 
>>>> modified to be consistent with DecimalFormat.parse() in this aspect?
>>>
>>> No, I understand there are differences. But I see a lot of 
>>> duplicated piece of code which I would like to eliminate.
>>>
>>>>
>>>>>
>>>>> line 1913-1923, 1950-1960, 1987-1997, 2024-2034: It simply calls 
>>>>> super. No need to override them.
>>>> Since setters are overridden, I think that it is better to override 
>>>> getters also (even if they are just calling super and have same 
>>>> javadoc) to keep them at same level. But, if you see no point in 
>>>> keeping them in CNF, I will remove them. Does that need CSR change?
>>>
>>> I don't see any point for override. I don't think there needs a CSR, 
>>> but better ask Joe about it.
>>>
>>>>>
>>>>> line 2231: You need to test the type before cast. Otherwise 
>>>>> ClassCastException may be thrown.
>>>> The type is checked in the superclass equals method getClass() != 
>>>> obj.getClass(), so I think there is no need to check the type here.
>>>
>>> OK.
>>>
>>> Naoto
>>>
>>>>
>>>> Regards,
>>>> Nishit Jain
>>>>>
>>>>> Naoto
>>>>>
>>>>> On 11/16/18 9:54 AM, Nishit Jain wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please review this non trivial feature addition to NumberFormat API.
>>>>>>
>>>>>> The existing NumberFormat API provides locale based support for 
>>>>>> formatting and parsing numbers which includes formatting decimal, 
>>>>>> percent, currency etc, but the support for formatting a number 
>>>>>> into a human readable or compact form is missing. This RFE adds 
>>>>>> that feature to format a decimal number in a compact format (e.g. 
>>>>>> 1000 -> 1K, 1000000 -> 1M in en_US locale) , which is useful for 
>>>>>> the environment where display space is limited, so that the 
>>>>>> formatted string can be displayed in that limited space. It is 
>>>>>> defined by LDML's specification for Compact Number Formats.
>>>>>>
>>>>>> http://unicode.org/reports/tr35/tr35-numbers.html#Compact_Number_Formats 
>>>>>>
>>>>>>
>>>>>>
>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8177552
>>>>>> Webrev: 
>>>>>> http://cr.openjdk.java.net/~nishjain/8177552/webrevs/webrev.00/
>>>>>> CSR: https://bugs.openjdk.java.net/browse/JDK-8188147
>>>>>>
>>>>>> Request to please help review the the change.
>>>>>>
>>>>>> Regards,
>>>>>> Nishit Jain
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>
>