[loc-en-dev] Comments on the locale enhancement proposal
Masayoshi Okutsu
Masayoshi.Okutsu at Sun.COM
Mon Feb 2 20:01:00 PST 2009
On 2/3/2009 12:33 PM, Yoshito Umaoka wrote:
> Masayoshi Okutsu wrote:
>
>>> Let's assume an instance of Locale is created from language tag
>>> "zh-Hans-CN". The proposal suggests that Locale#toString() return
>>> "zh_Hans_CN". Do you think this behavior is problematic? Are you
>>> suggesting adding a new method, for example Locale#getID(), to
>>> return "zh_Hans_CN", but not putting the script "Hans" and extra
>>> separator "_" in the result of #toString()?
>>
>> I think returning "zh_Hans_CN" may cause a problem. Let's think about
>> the following scenario.
>>
>> (1) Application A and B communicate through RMI (i.e., serialization).
>> (2) A is script-aware, while B may or may not be.
>> (3) B uses 3rd party class library L which isn't script-aware.
>>
>> Suppose both A and B are running on JDK 7, and that A sends a Locale
>> created from "zh-Hans-CN" to B. B passes the given Locale to L. In
>> this case, L might be confused by "zh_Hans_CN" from toString().
>>
>> We could say, "Don't do that." But if someone complains it's an
>> incompatible change in JDK 7, we will need to give up the new
>> behavior of toString(). If the complaint comes after the JDK 7
>> release, it will be a tragedy...
>
> I do not understand what you wrote above. Locale has 3 member fields
> - language, country and variant. When an instance of Locale is being
> serialized, these fields are preserved in the serialized form. Even
> if we internally add extra fields or change the internal
> representation of these fields, we have to write out these 3 separate
> fields to support serialization compatibility. In the scenario above, I
> would expect Locale("zh", "CN") at the other end (pre-JDK7).
If B were running on a pre-JDK 7 release, it's true that the deserialized
Locale would be zh_CN. But in my scenario both A and B are running on
JDK 7. (B might want to use some new JDK 7 APIs while still needing to use
library L.)
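
To make the concern concrete, here is a minimal sketch of how a library
like L might interpret the result of toString(). LegacyLocaleParser is a
hypothetical stand-in, not code from any real library; it simply assumes
the pre-JDK 7 layout language_country_variant.

```java
import java.util.Arrays;

// Hypothetical stand-in for library L: it assumes the pre-JDK 7 layout
// language_country_variant and knows nothing about scripts.
final class LegacyLocaleParser {
    /** Splits a locale string into {language, country, variant}. */
    static String[] parse(String localeString) {
        // Limit 3: everything after the second "_" is treated as the variant.
        String[] parts = localeString.split("_", 3);
        String[] fields = {"", "", ""};
        System.arraycopy(parts, 0, fields, 0, parts.length);
        return fields;
    }

    public static void main(String[] args) {
        // Familiar form: language "zh", country "CN", no variant.
        System.out.println(Arrays.toString(parse("zh_CN")));
        // Proposed form: the script lands in the country slot.
        System.out.println(Arrays.toString(parse("zh_Hans_CN")));
    }
}
```

With "zh_Hans_CN" as input, such a parser would report "Hans" as the
country and "CN" as the variant, which is the kind of confusion the
scenario describes.
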
Thanks,
Masayoshi
> Of course, it loses the script information, which is not ideal, but
> at least the problem which you mentioned above should not happen.
>
> It is true that there might be an existing application that depends
> on the String representation and makes an assumption: a locale string
> consists of up to 3 fields delimited by "_", where the 1st one is the
> language, the 2nd one is the country and the rest is the variant. If
> we need to avoid breaking this, we could:
>
> 1. For toString(), by Java convention, we still want to write out all
> of the field information, including script, extensions... If we append
> this information to the end of the variant, I would expect the impact
> to be minimal.
>
> 2. With the change above, we want another method to return the formal
> "programmatic name". We would probably need to add getID() to do so.
> If we decide to go this way, we should update the documentation to
> encourage people to use getID() instead of toString() to get a
> canonical string representation of a Locale.
>
> Although we could do such things for supporting full backward
> compatibility, I prefer not to do so.
>
> Am I missing anything?
>
> Thanks,
> Yoshito
>
>
>
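
The two options quoted above can be sketched as follows. This is only
an illustration under stated assumptions: LocaleStrings, compatString()
and id() are hypothetical names, and the exact way the script is
appended to the variant is a guess, since the thread does not fix a
format.

```java
// Hypothetical helpers illustrating the two options; not a JDK API.
final class LocaleStrings {
    // Option 1 (assumed format): keep language_country_variant and append
    // the script to the end of the variant, so that the first two fields
    // stay where a legacy parser expects them.
    static String compatString(String language, String script,
                               String country, String variant) {
        String tail;
        if (variant.isEmpty()) {
            tail = script;
        } else if (script.isEmpty()) {
            tail = variant;
        } else {
            tail = variant + "_" + script;
        }
        StringBuilder sb = new StringBuilder(language);
        if (!country.isEmpty() || !tail.isEmpty()) {
            sb.append('_').append(country);
        }
        if (!tail.isEmpty()) {
            sb.append('_').append(tail);
        }
        return sb.toString();
    }

    // Option 2: a separate "programmatic name" with the script placed
    // between language and country, as in "zh_Hans_CN".
    static String id(String language, String script,
                     String country, String variant) {
        StringBuilder sb = new StringBuilder(language);
        for (String field : new String[] {script, country, variant}) {
            if (!field.isEmpty()) {
                sb.append('_').append(field);
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, compatString("zh", "Hans", "CN", "") yields
"zh_CN_Hans", so a parser that reads only the first two fields still
sees "zh" and "CN", while id("zh", "Hans", "CN", "") yields
"zh_Hans_CN".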