<i18n dev> [8] Request for review: 8029239: jdk8 l10n resource file translation update - localenames

Wed Jan 8 06:47:26 PST 2014

Thanks Masayoshi for the review and comment.

-michael

On 2014-01-08 01:37 AM, Masayoshi Okutsu wrote:
> Hi Michael,
>
> This fix looks good to me.
>
>> if the users hardcode the names in their program or tests,
>
> I don't think names hardcoded by apps are a concern. My concern is 
> names produced using the public API. For example, two apps (client 
> and server) running on different Java versions may not be able to 
> perform consistent display name-based operations. IMHO, name changes 
> should be kept to a minimum and limited to fixing wrong names.
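
To illustrate the concern about names produced by the public API, here is a minimal Java sketch. The class and method names (DisplayNameDemo, countryName) are hypothetical; the only JDK calls used are Locale.Builder and Locale.getDisplayCountry(Locale). The returned string comes from the locale data bundled with the running JDK, so it can differ between Java versions, and no output is shown for that reason.

```java
import java.util.Locale;

class DisplayNameDemo {
    // Looks up the display name of a country code in the given display locale.
    // The result is read from the JDK's locale data at run time.
    static String countryName(String countryCode, Locale displayLocale) {
        Locale country = new Locale.Builder().setRegion(countryCode).build();
        return country.getDisplayCountry(displayLocale);
    }

    public static void main(String[] args) {
        // A client and a server on different JDK versions may print
        // different strings here for the same code.
        System.out.println(countryName("CN", Locale.JAPANESE));
    }
}
```

Exchanging the code ("CN") between the two sides and resolving the display name locally would avoid depending on identical translations.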
>
> Thanks,
> Masayoshi
>
> On 1/8/2014 1:02 PM, Michael Fang wrote:
>> Hi Masayoshi,
>>
>> Since the changes to the LocaleNames translations carry a slight 
>> backward compatibility risk (if users hardcode the names in their 
>> programs or tests, neither of which is recommended), I have decided 
>> to redo the work and only add the missing names to the existing 
>> files. The webrev is much cleaner now:
>>
>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.04/
>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.04.utf8/
>>
>> If there are missing script names, they need to be added to the 
>> English LocaleNames.properties file first, then I can coordinate 
>> the translation. This will happen after jdk8.
>>
>> thanks,
>>
>> -michael
>>
>> On 2013-12-04 11:23 PM, Masayoshi Okutsu wrote:
>>> Thanks for making the \uxxxx changes. It's much easier to review.
>>>
>>> I checked LocaleNames_ja.properties with JIS X 0412 (Japanese 
>>> version of ISO 639) and JIS X 0304 (Japanese version of ISO 3166). 
>>> Many changes corrected wrong names, but there are still questionable 
>>> ones, and some are inconsistent with JIS X 0412/0304, such as the 
>>> change from 中華人民共和国 to 中国 for CN; 中国 is too ambiguous [1]. 
>>> My preference is to follow the JIS standards (where applicable).
>>>
>>> I also noticed that some new script names are missing, such as Jurc 
>>> for Jurchen. Is this problem out of scope for the 8029239 fix?
>>>
>>> Thanks,
>>> Masayoshi
>>>
>>> [1] 
>>> http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%9B%BD_%28%E6%9B%96%E6%98%A7%E3%81%95%E5%9B%9E%E9%81%BF%29
>>>
>>> On 2013-12-05 8:02 AM, Michael Fang wrote:
>>>> Thanks Masayoshi for the feedback.
>>>>
>>>> I looked at the existing LocaleNames_xx.properties files and they 
>>>> use lower case \uxxxx notation in most sections, but newer 
>>>> sections such as the ISO 639-2 language codes and ISO 15924 script 
>>>> codes are in upper case since they were added by the current 
>>>> translation team.
>>>>
>>>> I converted the files to all lower case \uxxxx notation, but the 
>>>> resulting webrev didn't help (it was worse, with even more deltas):
>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.02/
>>>>
>>>> To make it easier to review, I have created an updated version of 
>>>> webrev that ignores case by using the "diff -i" option:
>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev_ignore_case.ksh
>>>> But I kept the upper case \uxxxx notation because that's what I 
>>>> will be receiving from the translation team going forward. In the 
>>>> future, the diff should be more manageable.
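
As a rough illustration of why the case of the hex digits is only a textual difference, the following Java sketch lower-cases the digits of every \uxxxx escape in a line so that two lines can be compared regardless of escape case. EscapeCaseDemo, lowerEscapes and sameIgnoringEscapeCase are hypothetical names, and the regex is a simplification: it does not treat an escaped backslash (\\u) specially.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeCaseDemo {
    // Matches a backslash, 'u', and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Returns the line with the hex digits of each \\uXXXX escape lower-cased.
    static String lowerEscapes(String line) {
        Matcher m = ESCAPE.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String lowered = "\\u" + m.group(1).toLowerCase(Locale.ROOT);
            m.appendReplacement(out, Matcher.quoteReplacement(lowered));
        }
        m.appendTail(out);
        return out.toString();
    }

    // True if the two lines differ at most in the case of their escapes.
    static boolean sameIgnoringEscapeCase(String a, String b) {
        return lowerEscapes(a).equals(lowerEscapes(b));
    }

    public static void main(String[] args) {
        String upper = "afa=\\u30A2\\u30D5\\u30ED";
        String lower = "afa=\\u30a2\\u30d5\\u30ed";
        System.out.println(lowerEscapes(upper));
        System.out.println(sameIgnoringEscapeCase(upper, lower));
    }
}
```

Unlike "diff -i", this only folds case inside the escapes, so a real capitalization change in the translated text would still show up as a difference.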
>>>>
>>>> The resulting webrev:
>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03/
>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03.utf8/
>>>>
>>>> I need to check with the translation team about the Japanese 
>>>> translation issue for "afa".
>>>>
>>>> I have added bug# to test/sun/text/resources/LocaleDataTest.java
>>>>
>>>> thanks,
>>>>
>>>> -michael
>>>>
>>>> On 2013-12-04 01:17 AM, Masayoshi Okutsu wrote:
>>>>> I still don't like the unnecessary case changes for the \uxxxx 
>>>>> notation, which make code review very difficult (even the UTF-8 ones).
>>>>>
>>>>> I noticed the following change uses '=' (the second one) as a word 
>>>>> delimiter. This looks strange to me.
>>>>>
>>>>> src/share/classes/sun/util/resources/ja/LocaleNames_ja.properties:
>>>>> -afa=\u30A2\u30D5\u30AC\u30CB\u30FC (1927-2002)
>>>>> +afa=\u30A2\u30D5\u30ED=\u30A2\u30B8\u30A2\u8A9E\u65CF
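
To see what the two lines above decode to, here is a small Java sketch that loads each line with java.util.Properties. AfaValueDemo and valueOf are hypothetical names. In the properties format the key ends at the first unescaped '=', ':' or whitespace, so the second '=' in the new line is part of the value, not a second separator. Whether '=' is an appropriate word delimiter in the Japanese name is a translation question the code cannot answer.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class AfaValueDemo {
    // Loads a single .properties line and returns the decoded value for key.
    static String valueOf(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String oldLine = "afa=\\u30A2\\u30D5\\u30AC\\u30CB\\u30FC (1927-2002)";
        String newLine = "afa=\\u30A2\\u30D5\\u30ED=\\u30A2\\u30B8\\u30A2\\u8A9E\\u65CF";
        System.out.println(valueOf(oldLine, "afa"));
        // The '=' inside the value appears at index 3 of the decoded string.
        System.out.println(valueOf(newLine, "afa"));
        System.out.println(valueOf(newLine, "afa").indexOf('='));
    }
}
```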
>>>>>
>>>>> Should the bug# be added to 
>>>>> test/sun/text/resources/LocaleDataTest.java when LocaleData has 
>>>>> been modified?
>>>>>
>>>>> Thanks,
>>>>> Masayoshi
>>>>>
>>>>> On 12/4/2013 1:02 PM, Michael Fang wrote:
>>>>>> I have updated the webrev 
>>>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.01/
>>>>>>
>>>>>> 2 failed regression tests have been addressed:
>>>>>>
>>>>>> ----------System.out:(7/192)*----------
>>>>>> Mismatch in LocaleNames/pt/wa:
>>>>>>   file = "val\\\\u00e3o"
>>>>>>    jvm = "val\\u00e3o"
>>>>>> Mismatch in LocaleNames/pt/FM:
>>>>>>   file = "Micron\\\\u00e9sia"
>>>>>>    jvm = "Micron\\u00e9sia"
>>>>>> Test failed.  2 errors.
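
The two mismatches above differ by one level of backslash escaping. The log alone does not show where the extra backslash came from, so the following Java sketch only demonstrates the general effect, assuming the values pass through java.util.Properties: a single backslash before u00e3 is decoded to the character, while a doubled backslash yields a literal backslash followed by plain text. EscapeLevelDemo and load are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class EscapeLevelDemo {
    // Loads a single .properties line and returns the decoded value for key.
    static String load(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // File text "wa=val\u00e3o": the escape is decoded to one character.
        String decoded = load("wa=val\\u00e3o", "wa");
        // File text "wa=val\\u00e3o": the doubled backslash becomes a literal
        // backslash, and "u00e3o" stays as plain characters.
        String literal = load("wa=val\\\\u00e3o", "wa");
        System.out.println(decoded.length());
        System.out.println(literal.length());
        System.out.println(decoded.equals(literal));
    }
}
```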
>>>>>>
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> -michael
>>>>>>
>>>>>> On 2013-12-03 05:10 PM, Michael Fang wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Please help to review the changes for the following CR:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8029239
>>>>>>>
>>>>>>> The webrev is available here:
>>>>>>> http://cr.openjdk.java.net/~mfang/8029239/
>>>>>>>
>>>>>>> The changes included the following:
>>>>>>>
>>>>>>> Portuguese files were updated with resources generated from CLDR 
>>>>>>> 2.1. LocaleNames_pt_BR.properties has been removed as a result. 
>>>>>>> Regression tests were run and the failed tests were updated to 
>>>>>>> be compatible with the changes to CLDR 2.1.
>>>>>>>
>>>>>>> There are also translation capitalization changes. I did some 
>>>>>>> research about capitalization rules and found that for Spanish, 
>>>>>>> French, Italian, Portuguese, and Swedish, language names are not 
>>>>>>> capitalized but country names are, and this is consistent with 
>>>>>>> CLDR as well.
>>>>>>>
>>>>>>> References about capitalization
>>>>>>> http://meta.wikimedia.org/wiki/Capitalization_of_Wiktionary_pages#Languages_without_capitalization: 
>>>>>>> es, fr, it, sv, pt, etc
>>>>>>>
>>>>>>> Spanish: 
>>>>>>> http://www.howto.gov/web-content/multilingual/spanish-guide/capitalization
>>>>>>> French: 
>>>>>>> http://french.about.com/library/writing/bl-capitalization.htm
>>>>>>> Italian: http://www.translationdirectory.com/article714.htm
>>>>>>> Swedish: http://en.wikibooks.org/wiki/Swedish/Nationalities
>>>>>>>
>>>>>>> There are also translation changes made by translators. I 
>>>>>>> reviewed each change against CLDR. I only accepted the changes 
>>>>>>> that are consistent with CLDR since I don't know the languages 
>>>>>>> and could only use CLDR as a reference.
>>>>>>>
>>>>>>> I did not add new regression tests for the translation changes 
>>>>>>> since there are no formatting changes, just translation changes. 
>>>>>>> The noreg-l10n keyword covers this case: translation changes are 
>>>>>>> exempt from the requirement for regression tests.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> -michael
>>>>>>
>>>>>
>>>>
>>>
>>
>