[loc-en-dev] variant field casing
Yoshito Umaoka
y.umaoka at gmail.com
Tue Sep 1 09:11:26 PDT 2009
>>> Let me ask another question. If you do:
>>>
>>> Locale locale = new
>>> Locale.Builder().setLanguage("es").setRegion("ES").setVariant("Traditional_WIN").create();
>>> System.out.println(locale.getVariant());
>>>
>>> what will that produce? ("Traditional_WIN" is an example from the Locale
>>> API doc.)
>>>
>> The idea is to throw an exception for variant "Traditional_WIN" and
>> clearly document that builder only accept variant value(s) which satisfy the
>> BCP47 variant syntax.
>
> What if someone wants to use ISO Language Code "he" without making changes
> to variant names? The current code might be:
>
> Locale locale = new Locale("iw", "IL", "MyVariant");
>
> Then, the user might want to change this one to:
>
> Locale locale = new
> Locale.Builder().setLanguage("he")....setVariant("MyVariant").create();
>
> But if the new one throws an exception, the user needs to stay with "iw".
>
> I think we need to have a separate variant to allow users to migrate from
> the constructor to the Locale.Builder without changing other things.
>
> Thanks,
> Masayoshi
>
This may introduce a hole in the new Builder design. We agreed not to
support new fields through constructors. We wanted to make sure that
each field specified through a Builder setter method satisfies its syntax
requirement. The current design contract is to support BCP47 syntax,
so a Locale created by the Builder is always well-formed in terms of
BCP47.
Even if we separate the Java variant from the BCP47 variant, the Java
variant cannot be mapped to any field in BCP47.
setLanguage("he") is a different story. Both "he" and "iw" satisfy the
syntax requirement and setLanguage won't throw an exception for
setLanguage("he") / setLanguage("iw"). When Builder#create() is
invoked, "he" is mapped to "iw" internally and Locale#getLanguage()
returns "iw" always.
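As a rough illustration of the internal mapping described above, here is a minimal sketch. The class and method names are hypothetical and are not part of java.util.Locale; the "yi"/"ji" and "id"/"in" pairs are the other legacy codes Locale has historically used and are not mentioned in this thread.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not the Builder implementation.
class LegacyLanguageSketch {
    // New ISO 639 code -> legacy code kept for compatibility.
    private static final Map<String, String> LEGACY =
            Map.of("he", "iw", "yi", "ji", "id", "in");

    // Both the new and the legacy code are accepted; the legacy one is returned.
    static String toLegacy(String language) {
        String lower = language.toLowerCase(Locale.ROOT);
        return LEGACY.getOrDefault(lower, lower);
    }

    public static void main(String[] args) {
        System.out.println(toLegacy("he")); // iw
        System.out.println(toLegacy("iw")); // iw
        System.out.println(toLegacy("es")); // es
    }
}
```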
Strictly speaking, the variant field in BCP47 is used to add
semantics to a language. A Java variant such as Traditional_WIN
maps semantically to a BCP47 extension or private use subtag,
because it extends the language tag for use in applications.
If we really want to support free-form Java variants, we could remove
the syntax check in Builder#setVariant. We can still create a Locale, but
if a user wants to convert the Locale to a BCP47 language tag, such a
variant field value will be dropped. For example -
Locale locale = new Locale.Builder().setLanguage("es")
.setRegion("ES").setVariant("Traditional_WIN").create();
will create a Locale - es_ES_Traditional_WIN
However
String tag = locale.toLanguageTag();
returns "es-ES".
Also,
Locale locale1 = new Locale.Builder().setLanguageTag(
"es-ES-Traditional-WIN").create();
will throw IllformedLocaleException.
When an existing Java variant satisfies the BCP47 variant syntax,
I think toLanguageTag should still map the variant value to a BCP47
variant.
For example,
Locale locale2 = new Locale("el", "GR", "monoton");
String tag2 = locale2.toLanguageTag();
returns "el-GR-monoton", which is a valid BCP47 language tag
("monoton" is registered in the IANA registry).
For the next example,
Locale locale3 = new Locale("en", "US", "POSIX");
String tag3 = locale3.toLanguageTag();
"POSIX" satisfies the BCP47 variant syntax. The current proposal is
to lowercase the variant in a language tag. But we could keep the
casing - "en-US-POSIX", although BCP47 recommends lowercase
letters for variants. If the language tag is consumed by other apps,
"POSIX" might be normalized to "posix", but if the tag is consumed
within Java, it can round trip and you get the exact same locale.
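To make the syntax requirement concrete, here is a minimal sketch of a variant check based on the BCP47 rule variant = 5*8alphanum / (DIGIT 3alphanum). The class and method names are hypothetical; this is not the actual Builder code.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class VariantSyntaxSketch {
    // 5 to 8 alphanumerics, or a digit followed by 3 alphanumerics.
    private static final Pattern VARIANT =
            Pattern.compile("[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3}");

    // The check itself ignores casing: "POSIX" and "posix" both pass.
    static boolean isWellFormedVariant(String subtag) {
        return VARIANT.matcher(subtag).matches();
    }

    // What another consumer might do when it normalizes a tag.
    static String normalize(String subtag) {
        return subtag.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedVariant("monoton"));         // true
        System.out.println(isWellFormedVariant("POSIX"));           // true
        System.out.println(isWellFormedVariant("Traditional_WIN")); // false
        System.out.println(normalize("POSIX"));                     // posix
    }
}
```

"Traditional" fails because it is 11 characters long, and "WIN" fails because it is only 3.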
In summary -
1. Builder#setVariant(String) does not check the input - it will never
throw an exception.
2. Locale#toLanguageTag handles a variant field which does not satisfy
the BCP47 variant syntax (5*8alphanum / (DIGIT 3alphanum)) as an
error (as a result, the variant and following fields are stripped off).
3. Locale#toLanguageTag maps a Java variant to a BCP47 variant if
it satisfies the syntax requirement, but it won't change the casing.
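Points 2 and 3 could be sketched roughly as follows. This is only an illustration of the proposal, not an implementation of Locale#toLanguageTag: the class and method names are hypothetical, and treating '_' as a separator between variant subtags is my assumption.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the proposed variant handling.
class ToLanguageTagSketch {
    private static final Pattern VARIANT =
            Pattern.compile("[0-9A-Za-z]{5,8}|[0-9][0-9A-Za-z]{3}");

    static String toTag(String language, String region, String variant) {
        StringBuilder tag = new StringBuilder(language);
        if (!region.isEmpty()) {
            tag.append('-').append(region);
        }
        // Assumption: a Java variant may hold several '_'-separated parts.
        for (String subtag : variant.split("[_-]")) {
            if (!VARIANT.matcher(subtag).matches()) {
                break; // point 2: drop this part and everything after it
            }
            tag.append('-').append(subtag); // point 3: casing is kept
        }
        return tag.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTag("es", "ES", "Traditional_WIN")); // es-ES
        System.out.println(toTag("el", "GR", "monoton"));         // el-GR-monoton
        System.out.println(toTag("en", "US", "POSIX"));           // en-US-POSIX
    }
}
```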
How about this approach?
-Yoshito