[loc-en-dev] Equality of base locale and LocaleServiceProvider implementation
Yoshito Umaoka
y.umaoka at gmail.com
Tue Mar 17 11:22:59 PDT 2009
I think Naoto was describing the interaction between Java and an LSP
(LocaleServiceProvider). The lookup is done by base locale, which is
under Java's control. Once Java finds an LSP that supports the base
locale, it calls the service method with the extensions first. If null
is returned, it calls the same provider with the base locale only. So
conceptually, lookup and service invocation are not a single fallback
chain.
More specifically,
1. The requested locale is xx-yy-zzzz-ext
2. Java tries to locate a service claiming to support xx-yy-zzzz
3. Java invokes the service method with xx-yy-zzzz-ext
4. If step 3 fails (null is returned), Java calls the same service
method with xx-yy-zzzz
5. If step 4 fails (still returning null), Java tries xx-yy and
repeats steps 2 to 4, then does the same for xx.
One thing I'm not sure about (and it was my original question) is
whether we really need step 4 above.
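The steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the Provider interface, the
class name LookupSketch, and the methods baseChain and lookup are
hypothetical names made up for this sketch, not the actual
java.util.spi API, and locales are modelled as plain tag strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names exist in java.util.spi.
class LookupSketch {

    /** Stand-in for a provider: advertises base locales only. */
    interface Provider {
        Set<String> supportedBaseLocales();
        String getService(String tag); // null means "cannot serve this tag"
    }

    /** Base-locale fallback: "xx-yy-zzzz" gives [xx-yy-zzzz, xx-yy, xx]. */
    static List<String> baseChain(String base) {
        List<String> chain = new ArrayList<>();
        String current = base;
        while (!current.isEmpty()) {
            chain.add(current);
            int cut = current.lastIndexOf('-');
            current = cut < 0 ? "" : current.substring(0, cut);
        }
        return chain;
    }

    /** Steps 2 to 5: locate by base locale, then invoke with and without ext. */
    static String lookup(List<Provider> providers, String base, String ext) {
        boolean hasExt = ext != null && !ext.isEmpty();
        for (String candidate : baseChain(base)) {           // step 5: xx-yy, then xx
            for (Provider p : providers) {
                if (!p.supportedBaseLocales().contains(candidate)) {
                    continue;                                // step 2: match on base only
                }
                String result = hasExt
                        ? p.getService(candidate + "-" + ext) // step 3: with extension
                        : null;
                if (result == null) {
                    result = p.getService(candidate);         // step 4: base locale only
                }
                if (result != null) {
                    return result;
                }
            }
        }
        return null; // no provider could serve any candidate
    }
}
```

Dropping step 4 would amount to deleting the second getService call,
so a provider that ignores extensions would have to accept the full
tag itself.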
-Yoshito
Doug Felt wrote:
> I don't understand this.
>
> If providers only advertise their base locales, why is the extension
> involved in lookup at all?
>
> Doug
>
> On Tue, Mar 17, 2009 at 11:07 AM, Naoto Sato <Naoto.Sato at sun.com> wrote:
>
> If handling the extension is special and it's our discretion how
> to deal with it, then I would think the following fallback is the
> most compatible (for the reason Yoshito mentioned) and still meets
> the requirement from -u extension for LDML keywords, assuming that
> the providers only advertise their base locales.
>
> xx-yy-zzzz-ext
> xx-yy-zzzz
> xx-yy-ext
> xx-yy
> xx-ext
> xx
>
> Thanks,
> Naoto
>
> Doug Felt wrote:
>
> Well, perhaps Mark can clarify this passage for us. Mark?
>
> As you cite:
>
> "However, an implementation MAY remove these [extensions and
> unrecognized private-use subtags] from ranges prior to
> performing the lookup, provided the implementation also
> removes them from the tags being compared"
>
> This seems to me to allow us to compare only the initial fields
> when doing lookup, while still using the full locale after
> lookup has been completed.
>
> Naoto, how would you propose dealing with the problems I cited?
>
> Doug
>
> On Mon, Mar 16, 2009 at 11:28 AM, Naoto Sato <Naoto.Sato at sun.com> wrote:
>
> Well, I understand the rationale for the LDML keywords. On the
> other hand, BCP 47 specifies the lookup fallback as I described
> before (RFC 4647, "3.4. Lookup"). Regarding the extensions, it
> reads:
>
> ---
>
> Extensions and unrecognized private-use subtags might be unrelated to
> a particular application of lookup. Since these subtags come at the
> end of the subtag sequence, they are removed first during the
> fallback process and usually pose no barrier to interoperability.
> However, an implementation MAY remove these from ranges prior to
> performing the lookup (provided the implementation also removes them
> from the tags being compared). Such modification is internal to the
> implementation and applications, protocols, or specifications SHOULD
> NOT remove or modify subtags in content that they return or forward,
> because this removes information that can be used elsewhere.
>
> ---
>
> So this expects the extensions to be removed first when fallback
> happens. I am not sure whether removing the subtags to the left
> (variant, region, etc.) and appending the extension would conform
> to this specification. At least, I think the currently proposed
> fallback could confuse some developers.
>
> Naoto
>
> Doug Felt wrote:
>
> The way we've been thinking of it is that the extensions are an
> adjunct to the fields of the locale. Their order isn't important,
> so we canonicalize their order (the same is true for ldml keywords
> within the ldml extension).
>
> The model I have is as follows:
>
> The user passes a Locale to a service, which (usually) looks up a
> bundle, using the base fields language, script, region, and variant
> (including subfields of variant). Once a matching bundle is found,
> it's returned to the service. The appropriate extensions (in our
> particular case, the ldml extension) are then interpreted by the
> service when accessing the bundle; which extensions are used depends
> on the service. There's no real hierarchical ordering to the ldml
> extensions, they're just different customizations that apply to
> whatever services care about them. This is different from
> language/script/region/variant where there is (generally) a useful
> hierarchical order.
>
> Part of the idea here is that by handling extensions separately,
> the service provider doesn't have to list them, and if there's no
> canonical order, it would need to list all permutations. For example,
> you might have keywords for collation, calendar, and number that
> apply to several locales. If there were no canonical order you'd
> have to support all 16 permutations (zero, one, two, or three
> keywords, in any order) for each such locale. This is rather a lot
> to list in getAvailableLocales. And this doesn't even involve the
> values of the keywords.
>
> Even if there is a canonical order, simple fallback doesn't work
> for all services. Say NumberFormat is passed a locale with the
> extension "th-th-u-ca-foobar-nu-thai" (calendar = foobar, numbers =
> thai). There's no locale matching that, so this falls back to
> "th-th-u-ca-foobar", tossing the numbers extension on the floor. The
> problem appears when there is a bundle "th-th-u-nu-thai": the
> (irrelevant, from NumberFormat's point of view) request for the
> foobar calendar preempted the (relevant) request for thai numbers,
> since it was canonicalized to a position earlier in the language
> tag. The exact opposite could happen for DateFormat with calendar =
> japanese and animal = foobar (as a hypothetical example).
>
> This leads to each service having to manipulate the extensions
> before looking up the bundle, to keep irrelevant extensions out of
> the way. This means each service is potentially seeing an entirely
> different bundle for the 'same' locale. If data used by both
> services is different between the two bundles, this might show up
> as an unwanted and unexpected side effect.
>
> Considerations like these led us to want to perform lookup using
> only the base locale, and let the services make use of the
> extension data as they saw fit based on that same bundle.
>
> As for BCP47, it does have descriptions of how one might match
> against a preferred language list, and also says that particular
> implementations can perform lookup ignoring extensions. But this
> language from the spec is generally in the context of matching a
> preferred language list with wildcards, and so it's not clear how
> or if this applies to examining a partially ordered collection of
> locale resources. I tend to think it does not directly apply, and
> that the way we propose handling lookup is conformant. Mark of
> course may have a different opinion.
>
> Doug
>
> On Fri, Mar 13, 2009 at 1:11 PM, Naoto Sato <Naoto.Sato at sun.com> wrote:
>
> So are you specifically talking about LDML extensions? In BCP 47,
> an extension is just one of the subtags, and the BCP does not give
> it any special semantics (because it does not know what it would
> be used for). So I thought the fallback would be:
>
> xx-yy-zz-ext
> xx-yy-zz
> xx-yy
> xx
>
>
> Thanks,
> Naoto
>
> Yoshito Umaoka wrote:
>
> Naoto Sato wrote:
>
> Umaoka-san,
>
> I don't think this is a compatibility issue, because the existing
> SPI implementations should still work compatibly for locales
> without extensions. A possible issue would only arise with the new
> locales.
>
> BTW, the current SPI invocation already involves fallback itself,
> i.e., if the requested locale is xx_YY_foo_bar and one SPI provider
> implements xx_YY, then that provider's service is used. So adding
> the extension fallback is not that ugly to me.
>
> Yes, I know the current fallback strategy.
> LDML extensions are designed for specifying optional behavior for
> a locale. Therefore, as we described in the very first proposal,
> extensions are carried in each level. More specifically, if a
> locale xx-yy-zzzz-u-cu-usd is requested, below is the candidate
> list.
>
> xx-yy-zzzz-u-cu-usd
> xx-yy-u-cu-usd
> xx-u-cu-usd
>
> If we need the "extensionless" versions inserted, it becomes
>
> xx-yy-zzzz-u-cu-usd
> xx-yy-zzzz
> xx-yy-u-cu-usd
> xx-yy
> xx-u-cu-usd
> xx
>
> Don't you think it's somewhat ugly?
>
> -Yoshito
>
>
>
> -- Naoto Sato