[loc-en-dev] Comments on the locale enhancement proposal

Tue Jan 20 14:52:16 PST 2009

My comments again.  I've omitted part of the thread.

On Tue, Jan 20, 2009 at 1:46 PM, Yoshito Umaoka <y.umaoka at gmail.com> wrote:

> I added my comments
>
> On Tue, Jan 20, 2009 at 4:26 PM, Doug Felt <dougfelt at google.com> wrote:
>
>> Comments inline.
>>
>> On Tue, Jan 20, 2009 at 12:25 AM, Masayoshi Okutsu <Masayoshi.Okutsu at sun.com> wrote:
>>
>>>  Folks,
>>>
>>

>
>>> Should country really be added to the lang_script? (e.g., zh_Hant ->
>>> zh_Hant_TW)
>>>
>>
>> This does seem odd.  I guess the issue addressed in section 7.3 is that
>> some clients might have followed the suggestion to make data available under
>> 'zh_Hant_TW', but did not provide any under 'zh_Hant', even though 'zh_Hant'
>> was the intended semantics of the data.  I think it might be better to not
>> support this, unless there are clients who are in this situation and cannot
>> adapt.
>>
>>
>
> Obviously, many existing Java users tag Traditional Chinese language
> content as "zh_TW", whether or not it is actually for TW.  At the same time,
> some others use "zh_TW" specifically for TW.  When these two use cases are
> mixed with new locale IDs that carry a script, I think zh_Hant -> zh_Hant_TW
> would be safer.  But I agree that we should review whether we really need
> this or not.
>

I think of this in terms of relabeling the data and remapping the requests.
I know, this is not how you've spec'd it.

If there is a resource labeled 'zh_TW' then I'd relabel it as follows:
- if there is no 'zh_Hant' data, treat it as 'zh_Hant'
- else if there is no 'zh_Hant_TW' data, treat it as 'zh_Hant_TW'

I think this produces essentially the same result as what you have.  I find
it a little more palatable since it's worded in terms of how to interpret
the ('legacy') data, not how to expand the requested ('new') locale id.
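
A minimal sketch of the relabeling rule above, in Java.  The class and
method names are hypothetical and not part of the proposal.  The rule does
not say what to do when both 'zh_Hant' and 'zh_Hant_TW' data already exist;
leaving the label as 'zh_TW' in that case is an assumption.

```java
import java.util.Set;

final class LegacyRelabel {
    /**
     * Decides how a legacy 'zh_TW' resource should be treated, given the
     * set of locale ids for which data has been supplied.
     * Returns null if there is no 'zh_TW' resource to relabel.
     */
    static String relabelZhTW(Set<String> available) {
        if (!available.contains("zh_TW")) {
            return null; // nothing to relabel
        }
        if (!available.contains("zh_Hant")) {
            return "zh_Hant"; // no script-level data: treat as 'zh_Hant'
        }
        if (!available.contains("zh_Hant_TW")) {
            return "zh_Hant_TW"; // script-level data exists: treat as 'zh_Hant_TW'
        }
        // Assumption: both labels are already taken, so keep the legacy label.
        return "zh_TW";
    }

    public static void main(String[] args) {
        System.out.println(relabelZhTW(Set.of("zh", "zh_TW")));      // zh_Hant
        System.out.println(relabelZhTW(Set.of("zh_Hant", "zh_TW"))); // zh_Hant_TW
    }
}
```

The point of the sketch is that the decision depends only on which data was
supplied, not on which locale id was requested.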

It does mean that people cannot use 'zh_Hant' to get traditional Chinese
while falling back to some default (Chinese-speaking) country data (currency
and currency format, say) if there is zh_TW data.  If they expected 'zh' to
be populated with 'CN' data by default, they wouldn't get it.  The whole
notion of country-specific data independent of language doesn't fit the
simple locale id model well, though, so I'm not sure how much of a problem
this is.  Are there even consistent expectations to violate?

>
>>
>>> Should script be added to language? (e.g., pa -> pa_Guru)
>>>
>>
>> I wonder about this too.  Perhaps Yoshito can describe a case in which
>> this behavior is needed.  I didn't notice a motivating case.
>>
>>
>
> When you do not want to fall back from one script to another, this makes sense.
> For example, some users may want to supply resources - pa_Guru and pa_Arab,
> but no pa.  Anyway, I think Mark has some thoughts on this.
>
>

This sounds like a new feature to me. We haven't supported this in the
past: if users supplied resources 'zh_TW' and 'zh_CN' but no 'zh', and I
asked for 'zh', I'd get root (or default locale?) data.  I don't see an
argument from compatibility.  I think it would have to be justified as a new
feature.
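
To make the existing behavior concrete, here is a rough Java sketch of
lookup by truncation.  It is a simplified model, not the actual
ResourceBundle implementation: the class and method names are hypothetical,
and the default-locale step is left out, so anything not found goes straight
to "root".

```java
import java.util.Set;

final class TruncationFallback {
    /**
     * Resolves a requested locale id against the supplied data by
     * repeatedly dropping the last '_'-separated field.  Lookup only
     * moves toward shorter ids; it never expands 'zh' to 'zh_TW'.
     */
    static String resolve(String requested, Set<String> available) {
        String id = requested;
        while (!id.isEmpty()) {
            if (available.contains(id)) {
                return id;
            }
            int cut = id.lastIndexOf('_');
            id = (cut < 0) ? "" : id.substring(0, cut);
        }
        return "root"; // assumption: default-locale step omitted
    }

    public static void main(String[] args) {
        Set<String> data = Set.of("zh_TW", "zh_CN");
        System.out.println(resolve("zh", data));    // root
        System.out.println(resolve("zh_TW", data)); // zh_TW
    }
}
```

Under this model a request for 'pa' with only 'pa_Guru' and 'pa_Arab'
supplied also resolves to root, which is why pa -> pa_Guru reads as new
behavior rather than compatibility.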

Doug