[loc-en-dev] variant field casing

Yoshito Umaoka y.umaoka at gmail.com
Mon Jul 20 10:58:03 PDT 2009


In the conf call last week, we (Doug, Steven and I) agreed on the following:

1. The Locale constructor does not normalize the variant field value, so
case is preserved. (Same as current behavior)
2. Locale#equals does a case-SENSITIVE comparison of variant fields.
(Same as current behavior)
3. Locale#Builder accepts uppercase, lowercase and mixed-case variant
values (they still need to satisfy the BCP47 variant syntax), but always
normalizes them to uppercase letters internally.  Therefore, an instance
of Locale created by Builder always uses uppercase letters for the
variant field.
4. Locale#forLanguageTag normalizes a BCP47 variant value to uppercase.
5. Locale#toLanguageTag normalizes a Locale variant field value to
lowercase.
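
For reference, items 1 and 2 can be checked against the existing
three-argument Locale constructor.  This is only a small sketch; the
class and helper names (VariantCaseDemo, variantOf, sameLocale) are made
up for illustration and are not part of any API.

```java
import java.util.Locale;

// The three-argument constructor is deprecated in recent JDKs but still works.
@SuppressWarnings("deprecation")
class VariantCaseDemo {
    // Item 1: the constructor keeps the variant exactly as it was passed in.
    static String variantOf(String language, String country, String variant) {
        return new Locale(language, country, variant).getVariant();
    }

    // Item 2: equals() compares the variant field case-sensitively.
    static boolean sameLocale(String variantA, String variantB) {
        return new Locale("th", "TH", variantA)
                .equals(new Locale("th", "TH", variantB));
    }

    public static void main(String[] args) {
        System.out.println(variantOf("th", "TH", "th")); // th
        System.out.println(sameLocale("TH", "th"));      // false
        System.out.println(sameLocale("TH", "TH"));      // true
    }
}
```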

With this implementation, there are no backward compatibility issues.
Java users who want to make Locales compatible/exchangeable with BCP47
language tags are recommended to use uppercase variant values in Java
Locale.  Otherwise, the roundtrip will fail, because forLanguageTag
always produces an uppercase variant and equals is case sensitive.
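
To make the roundtrip point concrete, here is a rough model of items 4
and 5.  It does not call the proposed Locale methods at all; variantInTag
and variantFromTag are hypothetical stand-ins that only mimic the casing
rules agreed above, so please treat it as an illustration under that
assumption, not as the implementation.

```java
import java.util.Locale;

class VariantRoundTripSketch {
    // Hypothetical stand-in for item 5: the variant is written to the
    // language tag in lowercase.
    static String variantInTag(String localeVariant) {
        return localeVariant.toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for item 4: the variant read back from the
    // language tag is normalized to uppercase.
    static String variantFromTag(String tagVariant) {
        return tagVariant.toUpperCase(Locale.ROOT);
    }

    // The roundtrip succeeds only if the variant comes back identical,
    // because the comparison is case sensitive (item 2).
    static boolean roundTrips(String localeVariant) {
        return variantFromTag(variantInTag(localeVariant)).equals(localeVariant);
    }

    public static void main(String[] args) {
        System.out.println(roundTrips("POSIX")); // true
        System.out.println(roundTrips("posix")); // false
        System.out.println(roundTrips("Posix")); // false
    }
}
```

Only a variant that is already all uppercase survives the trip through
the tag unchanged, which is why uppercase is the recommended form.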

Do you think this solution is acceptable?  If you have any objections, 
please respond.

-Yoshito

> The API doc isn't clear at all, but the documentation gives some 
> "impression" that the case is preserved for the variant argument. We 
> are likely to encounter compatibility problems if we change the behavior.
>
> Masayoshi
>
> On 7/10/2009 3:26 AM, Yoshito Umaoka wrote:
>> Hi folks,
>>
>> The variant field in Java Locale is case sensitive.  For example:
>>
>> System.out.println(new Locale("th", "TH", "TH").equals(new 
>> Locale("th", "TH", "th")));
>> System.out.println(new Locale("th", "TH", "TH").equals(new 
>> Locale("th", "th", "TH")));
>>
>> These statements print the following results:
>>
>> false
>> true
>>
>> I cannot see any description of variant field casing in the API 
>> doc.  I think this behavior is problematic if we want Locale to 
>> handle language tags (which are case-insensitive) properly.  I 
>> propose to document that all locale fields are case-insensitive and 
>> change the behavior.  I know this is not backward compatible, but we 
>> probably should correct the behavior.
>>
>> By the way, IANA language subtag registry uses lower case letters for 
>> variant subtags.  And the format used in the registry is the 
>> preferred casing.
>>
>> -Yoshito
>>
>



