<i18n dev> [8] Request for review: 8029239: jdk8 l10n resource file translation update - localenames

Tue Jan 7 22:11:43 PST 2014

Hi Michael,

I think you also need to modify test/sun/text/resources/LocaleData so
that those locale name changes are checked by the regression test.

thanks,
Yong

On 2014/1/8 12:02, Michael Fang wrote:
> Hi Masayoshi,
>
> Since the changes to the LocaleNames translations carry a slight
> backward-compatibility risk (if users hardcode the names in their
> programs or tests, neither of which is recommended), I have decided to
> redo the work and only add missing names to the existing files. The
> webrev is much cleaner now:
>
> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.04/
> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.04.utf8/
>
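
To illustrate the hardcoding risk mentioned above, here is a minimal Java sketch. It is not taken from the webrev, and the class and method names are hypothetical. It contrasts a check on the translated display name, which can break when LocaleNames translations change, with a check on the ISO country code.

```java
import java.util.Locale;

final class LocaleCheckSketch {
    // Fragile: the display name comes from the locale name resources and
    // may differ between JDK releases or locale data providers.
    static boolean matchesDisplayName(Locale locale, String expectedName) {
        return expectedName.equals(locale.getDisplayCountry(Locale.JAPANESE));
    }

    // Robust: the ISO 3166 country code does not depend on any translation.
    static boolean isChinaByCode(Locale locale) {
        return "CN".equals(locale.getCountry());
    }

    public static void main(String[] args) {
        System.out.println(isChinaByCode(Locale.CHINA)); // prints "true"
        // The next line's output depends on the installed locale data.
        System.out.println(Locale.CHINA.getDisplayCountry(Locale.JAPANESE));
    }
}
```

A test that compares against the code keeps passing whichever translation ships for CN.
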
> If there are missing script names, they need to be added to the English
> LocaleNames.properties file first; then I can coordinate the
> translation. This will happen after jdk8.
>
> thanks,
>
> -michael
>
> On 2013-12-04 11:23 PM, Masayoshi Okutsu wrote:
>> Thanks for making the \uxxxx changes. It's much easier to review.
>>
>> I checked LocaleNames_ja.properties against JIS X 0412 (the Japanese
>> version of ISO 639) and JIS X 0304 (the Japanese version of ISO 3166).
>> Many changes correct wrong names, but some are still questionable, and
>> some are inconsistent with JIS X 0412/0304. One example is the change
>> from 中華人民共和国 to 中国 for CN: 中国 is too ambiguous [1]. My
>> preference is to follow the JIS standards (where applicable).
>>
>> Also, I noticed that some new script names are missing, such as Jurc
>> for Jurchen. Is this problem out of scope for the 8029239 fix?
>>
>> Thanks,
>> Masayoshi
>>
>> [1] 
>> http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%9B%BD_%28%E6%9B%96%E6%98%A7%E3%81%95%E5%9B%9E%E9%81%BF%29
>>
>> On 12/5/2013 8:02 AM, Michael Fang wrote:
>>> Thanks Masayoshi for the feedback.
>>>
>>> I looked at the existing LocaleNames_xx.properties files. They use
>>> lower case \uxxxx notation in most sections, but newer sections such
>>> as the ISO 639-2 language codes and ISO 15924 script codes are in
>>> upper case because they were added by the current translation team.
>>>
>>> I converted the files to all lower case \uxxxx notation, but that did
>>> not help the webrev (it got worse, with even more deltas):
>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.02/
>>>
>>> To make it easier to review, I have created an updated version of
>>> webrev that ignores case by using the "diff -i" option:
>>> http://cr.openjdk.java.net/~mfang/8029239/webrev_ignore_case.ksh
>>> I kept the upper case \uxxxx notation, though, because that's what I
>>> will be receiving from the translation team going forward. Future
>>> diffs should be more manageable.
>>>
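
As background for the case discussion above, here is a small sketch, assuming the files are read with java.util.Properties (the class and helper names are hypothetical). It shows that the hex digits of a Unicode escape load to the same value in either case, so the case change affects only the diff, not runtime behavior.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class EscapeCaseSketch {
    // Loads .properties text from a string and returns the value for the key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Same two escapes, written with lower case and upper case hex digits.
        String lower = load("k=\\u30a2\\u30d5", "k");
        String upper = load("k=\\u30A2\\u30D5", "k");
        System.out.println(lower.equals(upper)); // prints "true"
    }
}
```
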
>>> The resulting webrev:
>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03/
>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03.utf8/
>>>
>>> I need to check with the translation team about the Japanese
>>> translation issue for "afa".
>>>
>>> I have added the bug# to test/sun/text/resources/LocaleDataTest.java
>>>
>>> thanks,
>>>
>>> -michael
>>>
>>> On 2013-12-04 01:17 AM, Masayoshi Okutsu wrote:
>>>> I still don't like the unnecessary case changes for the \uxxxx 
>>>> notation, which makes code review very difficult (even UTF-8 ones).
>>>>
>>>> I noticed the following change uses '=' (the second one) as a word 
>>>> delimiter. This looks strange to me.
>>>>
>>>> src/share/classes/sun/util/resources/ja/LocaleNames_ja.properties:
>>>> -afa=\u30A2\u30D5\u30AC\u30CB\u30FC (1927-2002)
>>>> +afa=\u30A2\u30D5\u30ED=\u30A2\u30B8\u30A2\u8A9E\u65CF
>>>>
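
For reference, a minimal sketch of how java.util.Properties parses the new "afa" line (the class and helper names are hypothetical). Only the first unescaped '=' ends the key, so the second '=' is loaded as an ordinary character of the value and would appear in the displayed name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class SecondEqualsSketch {
    // Loads .properties text from a string and returns the value for the key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The new line from the webrev, written as a Java string literal.
        String line = "afa=\\u30A2\\u30D5\\u30ED=\\u30A2\\u30B8\\u30A2\\u8A9E\\u65CF";
        String value = load(line, "afa");
        System.out.println(value.length());     // prints "9"
        System.out.println(value.indexOf('=')); // prints "3"
    }
}
```
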
>>>> Should the bug# be added to 
>>>> test/sun/text/resources/LocaleDataTest.java when LocaleData has 
>>>> been modified?
>>>>
>>>> Thanks,
>>>> Masayoshi
>>>>
>>>> On 12/4/2013 1:02 PM, Michael Fang wrote:
>>>>> I have updated the webrev 
>>>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.01/
>>>>>
>>>>> 2 failed regression tests have been addressed:
>>>>>
>>>>> ----------System.out:(7/192)*----------
>>>>> Mismatch in LocaleNames/pt/wa:
>>>>>   file = "val\\\\u00e3o"
>>>>>    jvm = "val\\u00e3o"
>>>>> Mismatch in LocaleNames/pt/FM:
>>>>>   file = "Micron\\\\u00e9sia"
>>>>>    jvm = "Micron\\u00e9sia"
>>>>> Test failed.  2 errors.
>>>>>
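
The mismatch above looks like an extra level of backslash escaping on the file side. That reading is an assumption, since the thread does not state the cause. The sketch below (hypothetical class and helper names) shows how java.util.Properties treats a single versus a doubled backslash in front of u00e3.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class BackslashEscapeSketch {
    // Loads .properties text from a string and returns the value for the key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // One backslash in the file text: decoded as a Unicode escape.
        String single = load("wa=val\\u00e3o", "wa");
        // Two backslashes in the file text: a literal backslash, then plain text.
        String doubled = load("wa=val\\\\u00e3o", "wa");
        System.out.println(single.length());  // prints "5"
        System.out.println(doubled.length()); // prints "10"
    }
}
```

With the doubled backslash the loaded value keeps a literal backslash followed by "u00e3o" instead of the accented character.
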
>>>>>
>>>>> thanks,
>>>>>
>>>>> -michael
>>>>>
>>>>> On 2013-12-03 05:10 PM, Michael Fang wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Please help to review the changes for the following CR:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8029239
>>>>>>
>>>>>> The webrev is available here:
>>>>>> http://cr.openjdk.java.net/~mfang/8029239/
>>>>>>
>>>>>> The changes included the following:
>>>>>>
>>>>>> Portuguese files were updated with resources generated from CLDR
>>>>>> 2.1. LocaleNames_pt_BR.properties has been removed as a result.
>>>>>> Regression tests were run, and failed tests were updated to
>>>>>> be compatible with the changes to CLDR 2.1.
>>>>>>
>>>>>> There are also translation capitalization changes. I did some
>>>>>> research on capitalization rules and found that in Spanish,
>>>>>> French, Italian, Portuguese, and Swedish, language names are not
>>>>>> capitalized but country names are. This is consistent with
>>>>>> CLDR as well.
>>>>>>
>>>>>> References about capitalization
>>>>>> http://meta.wikimedia.org/wiki/Capitalization_of_Wiktionary_pages#Languages_without_capitalization: 
>>>>>> es, fr, it, sv, pt, etc
>>>>>>
>>>>>> Spanish: 
>>>>>> http://www.howto.gov/web-content/multilingual/spanish-guide/capitalization
>>>>>> French: 
>>>>>> http://french.about.com/library/writing/bl-capitalization.htm
>>>>>> Italian: http://www.translationdirectory.com/article714.htm
>>>>>> Swedish: http://en.wikibooks.org/wiki/Swedish/Nationalities
>>>>>>
>>>>>> There are also translation changes made by translators. I
>>>>>> reviewed each change against CLDR. I only accepted the changes
>>>>>> that are consistent with CLDR, since I don't know the languages
>>>>>> and could only use CLDR as a reference.
>>>>>>
>>>>>> I did not add new regression tests for the translation changes
>>>>>> since there are no formatting changes, just translation changes.
>>>>>> We have a keyword, noreg-l10n, that covers the exemption from the
>>>>>> regression-test requirement for translation changes.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> -michael
>>>>>
>>>>
>>>
>>
>