<i18n dev> [8] Request for review: 8029239: jdk8 l10n resource file translation update - localenames

Wed Dec 4 23:23:33 PST 2013

Thanks for making the \uxxxx changes. It's much easier to review.

I checked LocaleNames_ja.properties against JIS X 0412 (the Japanese 
version of ISO 639) and JIS X 0304 (the Japanese version of ISO 3166). 
Many changes correct wrong names, but some are still questionable, and 
some are inconsistent with JIS X 0412/0304. One example is the change 
from 中華人民共和国 to 中国 for CN: 中国 is too ambiguous [1]. My 
preference is to follow the JIS standards (where applicable).

Also, I noticed that some new script names are missing, such as Jurc for 
Jurchen. Is this problem out of scope for the 8029239 fix?

Thanks,
Masayoshi

[1] 
http://ja.wikipedia.org/wiki/%E4%B8%AD%E5%9B%BD_%28%E6%9B%96%E6%98%A7%E3%81%95%E5%9B%9E%E9%81%BF%29

On 12/5/2013 8:02 AM, Michael Fang wrote:
> Thanks Masayoshi for the feedback.
>
> I looked at the existing LocaleNames_xx.properties files. They use 
> lower-case \uxxxx notation in most sections, but newer sections such 
> as the ISO 639-2 language codes and ISO 15924 script codes are in 
> upper case because they were added by the current translation team.
>
> I converted the files to all lower-case \uxxxx notation, but the 
> resulting webrev didn't help (it was worse, with even more deltas):
> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.02/
>
> To make it easier to review, I have created an updated version of the 
> webrev script that ignores case by using the "diff -i" option:
> http://cr.openjdk.java.net/~mfang/8029239/webrev_ignore_case.ksh
> I kept the upper-case \uxxxx notation, though, because that's what I 
> will be receiving from the translation team going forward. In the 
> future, the diff should be more manageable.
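
The case of the hex digits in a \uxxxx escape affects only the textual diff, not the loaded resource. The following is a minimal sketch of that point, assuming the files are read with java.util.Properties; the class name EscapeCaseDemo and the sample key "ja" are made up for illustration and do not come from the webrev.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapeCaseDemo {
    // Parses .properties text and returns the resulting key/value pairs.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // The same hypothetical entry, once with upper-case and once with
        // lower-case hex digits in the Unicode escapes.
        Properties upper = parse("ja=\\u65E5\\u672C\\u8A9E\n");
        Properties lower = parse("ja=\\u65e5\\u672c\\u8a9e\n");

        // Both spellings decode to the same three characters.
        System.out.println(upper.equals(lower)); // prints "true"
    }
}
```

A case-insensitive diff therefore hides only changes that are invisible at run time, at the cost of also hiding case changes in the translated ASCII text itself.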
>
> The resulting webrev:
> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03/
> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.03.utf8/
>
> I need to check with the translation team about the Japanese 
> translation issue for "afa".
>
> I have added the bug# to test/sun/text/resources/LocaleDataTest.java.
>
> thanks,
>
> -michael
>
> On 2013-12-04 01:17 AM, Masayoshi Okutsu wrote:
>> I still don't like the unnecessary case changes for the \uxxxx 
>> notation, which make code review very difficult (even UTF-8 ones).
>>
>> I noticed the following change uses '=' (the second one) as a word 
>> delimiter. This looks strange to me.
>>
>> src/share/classes/sun/util/resources/ja/LocaleNames_ja.properties:
>> -afa=\u30A2\u30D5\u30AC\u30CB\u30FC (1927-2002)
>> +afa=\u30A2\u30D5\u30ED=\u30A2\u30B8\u30A2\u8A9E\u65CF
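
For reference, the second '=' does not break the properties syntax: only the first unescaped '=' ends the key, and everything after it is the value. A small sketch, assuming the file is loaded with java.util.Properties (the class name SecondEqualsDemo is hypothetical), shows that the new "afa" entry loads as a single value that contains a literal '='. Whether '=' is an appropriate word delimiter in the Japanese name is a separate translation question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class SecondEqualsDemo {
    // Loads one .properties line and returns the value stored under the key.
    static String valueOf(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The "+" line from the diff above, written as a Java string literal.
        String line =
            "afa=\\u30A2\\u30D5\\u30ED=\\u30A2\\u30B8\\u30A2\\u8A9E\\u65CF";
        String value = valueOf(line, "afa");

        // The value is nine characters long, with '=' as the fourth one.
        System.out.println(value.length());     // prints 9
        System.out.println(value.indexOf('=')); // prints 3
    }
}
```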
>>
>> Should the bug# be added to 
>> test/sun/text/resources/LocaleDataTest.java when LocaleData has been 
>> modified?
>>
>> Thanks,
>> Masayoshi
>>
>> On 12/4/2013 1:02 PM, Michael Fang wrote:
>>> I have updated the webrev 
>>> http://cr.openjdk.java.net/~mfang/8029239/webrev.jdk.01/
>>>
>>> 2 failed regression tests have been addressed:
>>>
>>> ----------System.out:(7/192)*----------
>>> Mismatch in LocaleNames/pt/wa:
>>>   file = "val\\\\u00e3o"
>>>    jvm = "val\\u00e3o"
>>> Mismatch in LocaleNames/pt/FM:
>>>   file = "Micron\\\\u00e9sia"
>>>    jvm = "Micron\\u00e9sia"
>>> Test failed.  2 errors.
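
One plausible reading of these mismatches is an extra level of backslash escaping on one side of the comparison; the log alone does not say which side was wrong or how it was fixed. The sketch below only illustrates how java.util.Properties treats a single versus a doubled backslash before "u00e3"; the class name BackslashDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class BackslashDemo {
    // Loads .properties text and returns the value stored under the key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // One backslash in the file: the escape is decoded to one character.
        String single = load("wa=val\\u00e3o\n", "wa");
        // Two backslashes in the file: they collapse to one literal
        // backslash, and the "u00e3" that follows stays plain text.
        String doubled = load("wa=val\\\\u00e3o\n", "wa");

        System.out.println(single.length());  // prints 5
        System.out.println(doubled.length()); // prints 10
    }
}
```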
>>>
>>>
>>> thanks,
>>>
>>> -michael
>>>
>>> On 2013-12-03 05:10 PM, Michael Fang wrote:
>>>> Hi,
>>>>
>>>> Please help to review the changes for the following CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8029239
>>>>
>>>> The webrev is available here:
>>>> http://cr.openjdk.java.net/~mfang/8029239/
>>>>
>>>> The changes included the following:
>>>>
>>>> Portuguese files were updated with resources generated from CLDR 
>>>> 2.1. LocaleNames_pt_BR.properties has been removed as a result. 
>>>> Regression tests were run, and the failed tests were updated to be 
>>>> compatible with the changes to CLDR 2.1.
>>>>
>>>> There are also changes to capitalization in the translations. I 
>>>> did some research on capitalization rules and found that in 
>>>> Spanish, French, Italian, Portuguese, and Swedish, language names 
>>>> are not capitalized but country names are. This is consistent with 
>>>> CLDR as well.
>>>>
>>>> References about capitalization
>>>> http://meta.wikimedia.org/wiki/Capitalization_of_Wiktionary_pages#Languages_without_capitalization: 
>>>> es, fr, it, sv, pt, etc
>>>>
>>>> Spanish: 
>>>> http://www.howto.gov/web-content/multilingual/spanish-guide/capitalization
>>>> French: http://french.about.com/library/writing/bl-capitalization.htm
>>>> Italian: http://www.translationdirectory.com/article714.htm
>>>> Swedish: http://en.wikibooks.org/wiki/Swedish/Nationalities
>>>>
>>>> There are also translation changes made by translators. I 
>>>> reviewed each change against CLDR. I only accepted the changes 
>>>> that are consistent with CLDR since I don't know the languages and 
>>>> could only use CLDR as a reference.
>>>>
>>>> I did not add new regression tests for the translation changes 
>>>> since there are no formatting changes, just translation changes. We 
>>>> have a keyword noreg-l10n that covers the exception for requiring 
>>>> regression tests for translation changes.
>>>>
>>>> thanks,
>>>>
>>>> -michael
>>>
>>
>