[loc-en-dev] Duplicate Unicode Locale Extension Attributes/Keys
Yoshito Umaoka
y.umaoka at gmail.com
Wed Jul 7 15:18:57 PDT 2010
> 3) Locale.forLanguageTag("en-a-abc-a-def-x-123") and new
> Builder().setLanguageTag("en-a-abc-a-def-x-123")
>
> When two extensions with a same key are in a language tag, how should
> we interpret this? The same question also applies to duplicated
> Unicode attributes / multiple Unicode locale keyword with a same key.
>
> We should decide if we simply treat this as an error (forLanguage tag
> truncate anything second "a" / setLanguageTag to throw an exception
> and set error index at the second "a"). Alternatively, we just ignore
> the second one ("a-def") and continue parsing.
>
> We'll make the final decision for them in the next call (next
> Monday). I'll write design note explaining detailed behavior for the
> rest of area by the next call and we'll go through the document.
Mark, Addison, and I discussed the duplicate attributes/keys.
Our conclusion is:
- Still legal (we did not want to introduce an extra check for this), but
an implementation should not create such an extension.
- Ignore later occurrences of attributes/keywords when parsing.
- The canonical form removes duplicate attributes/keys (including the
associated type).
We also added the statement below to the latest IETF draft:
"Only the first occurrence of an attribute or key conveys meaning in
a language tag. When interpreting tags containing the Unicode locale
extension, duplicate attributes or keywords are ignored in the
following way: ignore any attribute that has already appeared in the
tag and ignore any keyword whose key has already occurred in the tag."
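The rule in the draft text can be sketched as a small filter over the
subtags of a Unicode locale extension. This is only an illustration, not
the proposed implementation: the class and method names are hypothetical,
the input is assumed to be the well-formed, lowercase subtags that follow
the "u" singleton, and first-occurrence order is kept rather than any
canonical ordering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
class UnicodeExtensionDedup {

    // subtags: everything after "u-", e.g. "abc-def-abc-ca-gregory-ca-islamic".
    // Assumes non-empty, well-formed, lowercase input (no validation is done).
    static String dedupe(String subtags) {
        String[] parts = subtags.split("-");
        Set<String> attributes = new LinkedHashSet<>();
        Map<String, List<String>> keywords = new LinkedHashMap<>();

        int i = 0;
        // Attributes come first; a two-character subtag starts the keywords.
        while (i < parts.length && parts[i].length() != 2) {
            attributes.add(parts[i]); // a repeated attribute is ignored by the set
            i++;
        }
        // Each keyword is a two-character key followed by its type subtags.
        while (i < parts.length) {
            String key = parts[i++];
            List<String> type = new ArrayList<>();
            while (i < parts.length && parts[i].length() != 2) {
                type.add(parts[i++]);
            }
            // Only the first occurrence of a key (with its type) is kept.
            keywords.putIfAbsent(key, type);
        }

        List<String> out = new ArrayList<>(attributes);
        for (Map.Entry<String, List<String>> e : keywords.entrySet()) {
            out.add(e.getKey());
            out.addAll(e.getValue());
        }
        return String.join("-", out);
    }

    public static void main(String[] args) {
        System.out.println(dedupe("abc-def-abc-ca-gregory-ca-islamic-cu-usd"));
    }
}
```

The later "abc" and the later "ca-islamic" are skipped, while "cu-usd" is
kept because its key has not appeared before.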
In our proposal, Builder#setUnicodeLocaleAttribute and
#setUnicodeLocaleKeyword continue to override values previously set,
which eliminates such duplication.
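A minimal sketch of the override behavior, written against
java.util.Locale.Builder as it later shipped in the JDK. Note one
difference from the names used in this mail: the released Builder exposes
addUnicodeLocaleAttribute rather than setUnicodeLocaleAttribute. The class
name BuilderOverrideDemo is made up for the example.

```java
import java.util.Locale;

// Illustration only; uses the released java.util.Locale.Builder API.
class BuilderOverrideDemo {

    static Locale build() {
        return new Locale.Builder()
                .setLanguage("en")
                .addUnicodeLocaleAttribute("abc")
                .addUnicodeLocaleAttribute("abc")         // same attribute again: no duplicate is stored
                .setUnicodeLocaleKeyword("ca", "gregory")
                .setUnicodeLocaleKeyword("ca", "islamic") // replaces the earlier type for "ca"
                .build();
    }

    public static void main(String[] args) {
        Locale loc = build();
        System.out.println(loc.toLanguageTag());             // en-u-abc-ca-islamic
        System.out.println(loc.getUnicodeLocaleType("ca"));  // islamic
    }
}
```

Because the builder keeps one value per key, a locale built this way
cannot carry a duplicate attribute or key in the first place.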
For parsing, Locale#forLanguageTag / Builder#setLanguageTag won't treat
duplication as an error and will just keep parsing the rest. Later
occurrences of duplicate attributes/keys (including the type) are simply
ignored. For example,
Locale.forLanguageTag("en-a-abc-a-def-x-123").toString() ->
"en__#a-abc-x-123"
new Builder().setLanguageTag("en-a-abc-a-def-x-123").build().toString()
-> "en__#a-abc-x-123"
Locale.forLanguageTag("en-u-abc-def-abc-ca-gregory-ca-islamic-cu-usd").toString()
-> "en__#u-abc-def-ca-gregory-cu-usd"
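The examples above can be tried against java.util.Locale as it later
shipped in the JDK. The snippet below is a small harness under that
assumption (the class name DuplicateParseDemo is made up); it prints both
the BCP 47 tag and the toString() form so the results can be compared with
the ones listed in this mail.

```java
import java.util.Locale;

// Illustration only; assumes the released java.util.Locale behaves as proposed here.
class DuplicateParseDemo {

    // Parses a tag and returns the tag the resulting Locale reports.
    static String roundTrip(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    public static void main(String[] args) {
        String[] tags = {
            "en-a-abc-a-def-x-123",
            "en-u-abc-def-abc-ca-gregory-ca-islamic-cu-usd"
        };
        for (String tag : tags) {
            Locale loc = Locale.forLanguageTag(tag);
            // Print the input, the round-tripped tag, and the toString() form.
            System.out.println(tag + " -> " + loc.toLanguageTag() + " / " + loc);
        }
    }
}
```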
-Yoshito