[loc-en-dev] Unicode locale extension - what it meant to Java

Yoshito Umaoka y.umaoka at gmail.com
Wed Jun 30 17:50:21 PDT 2010


Sorry for the confusion.  The term "transform" was really ambiguous.

The current spec is not clear about this, but as you pointed out, the 
current API doc is accurate. Here is how the currently proposed 
implementation behaves:

    new Locale("ja", "JP", "JP")

will create a Locale with

language - ja
country - JP
variant - JP
extension - u-ca-japanese

That is, the extension is appended and the variant is left intact.  The 
same happens when a Locale serialized by an older version of Java is 
deserialized on Java 7.
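
As a rough sketch, the field values listed above can be inspected like 
this. The class name LegacyJapaneseLocale is made up for illustration; 
only the java.util.Locale calls are real API. The three-argument 
constructor is used because it is the call under discussion, although 
newer JDKs mark it as deprecated.

```java
import java.util.Locale;

// Hypothetical demo class; only the java.util.Locale calls are real API.
@SuppressWarnings("deprecation")
final class LegacyJapaneseLocale {

    // Builds the legacy locale exactly as written in the message.
    static Locale create() {
        return new Locale("ja", "JP", "JP");
    }

    public static void main(String[] args) {
        Locale loc = create();
        System.out.println("language  - " + loc.getLanguage());
        System.out.println("country   - " + loc.getCountry());
        System.out.println("variant   - " + loc.getVariant());
        // getExtension('u') reports the value without the "u-" singleton.
        System.out.println("extension - u-" + loc.getExtension('u'));
        // The calendar keyword can also be read on its own.
        System.out.println("calendar  - " + loc.getUnicodeLocaleType("ca"));
    }
}
```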

    new Locale("ja", "JP", "JP").toLanguageTag();

will return "ja-JP-u-ca-japanese-x-jvariant-JP"


    Locale.forLanguageTag("ja-JP-u-ca-japanese-x-jvariant-JP");

will return a Locale equal to new Locale("ja", "JP", "JP").
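
The round trip described above can be checked with a small sketch. It 
takes the tag from the runtime instead of hard-coding it, since the 
private use prefix spelled in this message is only the proposed form and 
may differ from what a given JDK emits. The class and method names are 
hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class for the toLanguageTag/forLanguageTag round trip.
@SuppressWarnings("deprecation")
final class LanguageTagRoundTrip {

    // The legacy locale under discussion.
    static Locale legacy() {
        return new Locale("ja", "JP", "JP");
    }

    // True when parsing the locale's own language tag gives back an equal Locale.
    static boolean roundTrips(Locale loc) {
        String tag = loc.toLanguageTag();
        return Locale.forLanguageTag(tag).equals(loc);
    }

    public static void main(String[] args) {
        Locale loc = legacy();
        // Prints the tag as this runtime spells it, private use part included.
        System.out.println(loc.toLanguageTag());
        System.out.println("round trip preserved: " + roundTrips(loc));
    }
}
```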


    Locale.forLanguageTag("ja-JP-u-ca-japanese-x-jvariant-jp");

will return a Locale with

language - ja
country - JP
variant - jp
extension - u-ca-japanese
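
A sketch of the lowercase case follows. It assumes the variant carried in 
the private use part is the last subtag of the tag, and it lowercases 
only that subtag of whatever tag the runtime produces for ja_JP_JP. The 
class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class: variant case is kept as written in the tag.
@SuppressWarnings("deprecation")
final class VariantCaseCheck {

    // Lowercases only the trailing subtag of the runtime's tag for ja_JP_JP.
    static String lowerCasedVariantTag() {
        String tag = new Locale("ja", "JP", "JP").toLanguageTag();
        int lastSep = tag.lastIndexOf('-');
        return tag.substring(0, lastSep + 1)
                + tag.substring(lastSep + 1).toLowerCase(Locale.ROOT);
    }

    // Parses the modified tag back into a Locale.
    static Locale parse() {
        return Locale.forLanguageTag(lowerCasedVariantTag());
    }

    public static void main(String[] args) {
        Locale parsed = parse();
        System.out.println("tag     - " + lowerCasedVariantTag());
        System.out.println("variant - " + parsed.getVariant());
        System.out.println("equal to ja_JP_JP: "
                + parsed.equals(new Locale("ja", "JP", "JP")));
    }
}
```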



    new Builder().setLocale(new Locale("ja", "JP", "JP")).build();

will not throw an exception.  The Locale it returns will be equal to 
new Locale("ja", "JP", "JP").  This is a special case; there are only 
three such exceptions: ja_JP_JP, th_TH_TH and no_NO_NY.
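
A minimal sketch of this special case: it confirms that setLocale accepts 
ja_JP_JP and that the calendar extension is carried over. What happens to 
the variant is the point being settled here, so the sketch prints it 
instead of assuming a value. The class name is hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class for the Builder.setLocale special case.
@SuppressWarnings("deprecation")
final class BuilderLegacyLocale {

    // Feeds the legacy locale through a Builder; expected not to throw.
    static Locale rebuild() {
        return new Locale.Builder()
                .setLocale(new Locale("ja", "JP", "JP"))
                .build();
    }

    public static void main(String[] args) {
        Locale built = rebuild();
        System.out.println("tag      - " + built.toLanguageTag());
        System.out.println("calendar - " + built.getUnicodeLocaleType("ca"));
        // Printed, not asserted: depends on how the runtime resolves this case.
        System.out.println("variant  - '" + built.getVariant() + "'");
    }
}
```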


    new Builder().setLocale(new Locale("ja", "", "JP"));

will throw IllformedLocaleException.


    new Builder().setLanguage("ja").setRegion("JP").setVariant("JP");

will also throw IllformedLocaleException.
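
Both rejections can be observed with a sketch like the one below. "JP" is 
rejected because it is too short to be a well-formed BCP 47 variant 
subtag. The class and method names are hypothetical.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical demo class: the Builder rejects the ill-formed variant "JP".
@SuppressWarnings("deprecation")
final class IllFormedVariantCheck {

    // setLocale with ja__JP, which is not one of the special cases.
    static boolean setLocaleRejects() {
        try {
            new Locale.Builder().setLocale(new Locale("ja", "", "JP"));
            return false;
        } catch (IllformedLocaleException e) {
            return true;
        }
    }

    // setVariant validates its argument on its own, whatever else is set.
    static boolean setVariantRejects() {
        try {
            new Locale.Builder().setLanguage("ja").setRegion("JP").setVariant("JP");
            return false;
        } catch (IllformedLocaleException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("setLocale rejected:  " + setLocaleRejects());
        System.out.println("setVariant rejected: " + setVariantRejects());
    }
}
```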


    new Builder().setLanguage("ja").setRegion("JP")
        .setUnicodeLocaleKeyword("ca", "japanese").build();

This is a questionable case, but I think it should not set the variant 
to "JP".
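
To see what the Builder actually does with this input, a sketch along 
these lines prints the variant and the resulting tag. The class name is 
hypothetical.

```java
import java.util.Locale;

// Hypothetical demo class: a Builder given only the ca keyword.
final class BuilderWithCalendarKeyword {

    static Locale build() {
        return new Locale.Builder()
                .setLanguage("ja")
                .setRegion("JP")
                .setUnicodeLocaleKeyword("ca", "japanese")
                .build();
    }

    public static void main(String[] args) {
        Locale built = build();
        // An empty variant here means the Builder did not add "JP".
        System.out.println("variant  - '" + built.getVariant() + "'");
        System.out.println("calendar - " + built.getUnicodeLocaleType("ca"));
        System.out.println("tag      - " + built.toLanguageTag());
    }
}
```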


-Yoshito


Naoto Sato wrote:
> I wasn't sure that Yoshito's "In the current proposal" was just for 
> the Builder. If that's the case I am fine. I want to confirm that the 
> variant that is created by the Locale constructor is intact, otherwise 
> it would cause a compatibility issue.
>
> The reason I brought this up was that the current API doc 
> ("Compatibility" section in the Locale class description) reads:
>
> "When the Locale constructor is called with the arguments "ja", "JP", 
> "JP", this extension is automatically added. "
>
> Naoto
>
> (6/30/10 4:07 PM), Doug Felt wrote:
>>
>>
>> On Wed, Jun 30, 2010 at 3:51 PM, Naoto Sato <naoto.sato at oracle.com> wrote:
>>
>>     (6/30/10 2:08 PM), Yoshito Umaoka wrote:
>>
>>         First of all, I'm not proactively trying to retract the Unicode
>>         locale extension part of the proposal. But I think we need to
>>         clarify the scope of our proposal: what Unicode locale
>>         extensions mean to Java itself.
>>
>>         We want to bring Unicode locale extensions to the Java world.
>>         Java used to define variants to specify particular behavior
>>         variations. This model does not fit BCP 47 well.
>>
>>         Unicode locale extensions give you a formal, well-structured
>>         scheme for representing a variation of a locale. The Java Locale
>>         ja_JP_JP is used for a variant of locale ja_JP that just changes
>>         the calendar type to the Japanese Imperial calendar. This is
>>         Java's proprietary definition. In the current proposal, ja_JP_JP
>>         is transformed to -u-ca-japanese.
>>
>>
>>     What do you mean by "transformed" here? I thought that
>>     "-u-ca-japanese" is just automatically added and "JP" variant is
>>     intact. Is it not?
>>
>> JP is too short to be a valid variant value in BCP47, so when converting
>> to a BCP47 identifier it is dropped.  I believe the decision is that a
>> Java locale created from a LocaleBuilder with -u-ca-japanese will not
>> return JP from getVariant, but Yoshito knows for sure, I expect :-)
>>
>> Doug
>>
>>
>>
>>         For me, adding Unicode locale extension APIs to Java indicates a
>>         certain level of commitment to supporting Unicode locale
>>         extensions in Java itself. However, we have not discussed the
>>         implementation side of Java's i18n services much so far. We only
>>         care about two exceptional cases, ja_JP_JP and th_TH_TH, at this
>>         moment. But once we expose Unicode locale extensions in Java,
>>         Java users may expect a Currency instance created with Locale
>>         de-DE-u-cu-dem to use the German Mark.
>>
>>         Of course, we need a framework first. Actual use of Unicode
>>         locale extensions in Java i18n services might come later. If we
>>         decide to add APIs dedicated to Unicode locale extensions and
>>         defer the support in i18n services, I think we should clearly
>>         state what Unicode locale extensions mean to Java i18n services:
>>         what is supported, what is not, etc. I'll put this topic on the
>>         agenda for the next project meeting.
>>
>>
>>     Let's separate implementation from the spec. Although we might add
>>     this type of explanation in the "supported locales" document, that's
>>     never been part of the spec.
>>
>>     Naoto
>>
>>
>
>


