[loc-en-dev] Equality of base locale and LocaleServiceProvider implementation

Yoshito Umaoka y.umaoka at gmail.com
Tue Mar 17 11:22:59 PDT 2009


I think Naoto was describing the interaction between Java and an LSP.
The lookup is done by base locale, which is under Java's control.  Once
Java finds an LSP that supports the base locale, it calls the service
method with the extensions first.  If null is returned, it calls the
same provider with the base locale only.  So conceptually, lookup and
service invocation are not a single fallback chain.

More specifically,

1. The requested locale is xx-yy-zzzz-ext.
2. Java tries to locate a service claiming to support xx-yy-zzzz.
3. Java invokes the service method with xx-yy-zzzz-ext.
4. If step 3 fails (null is returned), Java calls the same service
method with xx-yy-zzzz.
5. If step 4 fails (null is returned again), Java tries xx-yy, repeating
steps 2 to 4, and then does the same for xx.
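
The steps above can be sketched roughly as follows.  This is only an
illustration, not the actual JDK lookup code: Provider, lookup, and the
plain-string locale tags are hypothetical stand-ins, and I am assuming
that the extension is carried along when the base locale is truncated in
step 5.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class LspLookupSketch {

    /** Hypothetical stand-in for an LSP: advertised base locales plus a service method. */
    static final class Provider {
        final Set<String> baseLocales;          // what the provider claims to support
        final Function<String, String> service; // returns null when it cannot serve a tag

        Provider(Set<String> baseLocales, Function<String, String> service) {
            this.baseLocales = baseLocales;
            this.service = service;
        }
    }

    /**
     * Walks the base locale from most to least specific.  At each level, a
     * provider is located by base locale only (step 2), then asked first with
     * the extension (step 3) and then without it (step 4).  Every tag passed
     * to a service method is appended to trace.
     */
    static String lookup(String base, String ext, List<Provider> providers, List<String> trace) {
        String current = base;
        while (current != null) {
            for (Provider p : providers) {
                if (!p.baseLocales.contains(current)) {
                    continue; // step 2: provider does not claim this base locale
                }
                if (ext != null) {
                    String withExt = current + "-" + ext;
                    trace.add(withExt);
                    String result = p.service.apply(withExt); // step 3
                    if (result != null) {
                        return result;
                    }
                }
                trace.add(current);
                String result = p.service.apply(current);     // step 4
                if (result != null) {
                    return result;
                }
            }
            // step 5: drop the last base subtag (xx-yy-zzzz -> xx-yy -> xx)
            int cut = current.lastIndexOf('-');
            current = (cut < 0) ? null : current.substring(0, cut);
        }
        return null;
    }

    public static void main(String[] args) {
        // A provider that advertises two base locales but can only serve plain xx-yy.
        Provider p = new Provider(
                new HashSet<>(Arrays.asList("xx-yy-zzzz", "xx-yy")),
                tag -> tag.equals("xx-yy") ? "served:" + tag : null);
        List<String> trace = new ArrayList<>();
        String result = lookup("xx-yy-zzzz", "u-cu-usd", Collections.singletonList(p), trace);
        System.out.println(result + " via " + trace);
    }
}
```

With this sample provider, the service method is called with
xx-yy-zzzz-u-cu-usd, xx-yy-zzzz, xx-yy-u-cu-usd, and xx-yy, in that
order.  Dropping step 4 would simply remove the two extension-free calls
from that sequence.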

One thing I'm not sure about (and it was my original question) is
whether we really need step 4 above.

-Yoshito

Doug Felt wrote:
> I don't understand this.
>
> If providers only advertise their base locales, why is the extension 
> involved in lookup at all?
>
> Doug
>
> On Tue, Mar 17, 2009 at 11:07 AM, Naoto Sato <Naoto.Sato at sun.com> wrote:
>
>     If handling the extension is special and it's our discretion how
>     to deal with it, then I would think the following fallback is the
>     most compatible (for the reason Yoshito mentioned) and still meets
>     the requirement from -u extension for LDML keywords, assuming that
>     the providers only advertise their base locales.
>
>     xx-yy-zzzz-ext
>     xx-yy-zzzz
>     xx-yy-ext
>     xx-yy
>     xx-ext
>     xx
>
>     Thanks,
>     Naoto
>
>     Doug Felt wrote:
>
>         Well, perhaps Mark can clarify this passage for us.  Mark?
>
>         As you cite:
>
>         "However, an implementation MAY remove these [extensions and
>         unrecognized private-use subtags] from ranges prior to
>         performing the lookup, provided the implementation also
>         removes them from the tags being compared"
>
>         This seems to me to allow us to compare only the initial fields
>         when doing lookup, while still using the full locale after
>         lookup has been completed.
>
>         Naoto, how would you propose dealing with the problems I cited?
>
>         Doug
>
>         On Mon, Mar 16, 2009 at 11:28 AM, Naoto Sato
>         <Naoto.Sato at sun.com> wrote:
>
>            Well, I understand the rationale for the LDML keywords.
>            However on the other hand, BCP 47 specifies the look up
>            fallback as I described before (RFC4647, "3.4. Lookup").
>            Regarding the extensions, it reads:
>
>            ---
>
>             Extensions and unrecognized private-use subtags might be
>             unrelated to a particular application of lookup.  Since
>             these subtags come at the end of the subtag sequence, they
>             are removed first during the fallback process and usually
>             pose no barrier to interoperability.  However, an
>             implementation MAY remove these from ranges prior to
>             performing the lookup (provided the implementation also
>             removes them from the tags being compared).  Such
>             modification is internal to the implementation and
>             applications, protocols, or specifications SHOULD NOT
>             remove or modify subtags in content that they return or
>             forward, because this removes information that can be used
>             elsewhere.
>
>            ---
>
>            So this expects that the extensions should first be removed
>            when fallback happens.  I am not sure whether removing the
>            left subtags (variant, region, etc.) and appending the
>            extension would conform to this specification.  At least I
>            think that the current proposed fallback could confuse some
>            of the developers.
>
>            Naoto
>
>            Doug Felt wrote:
>
>                The way we've been thinking of it is that the extensions
>                are an adjunct to the fields of the locale.  Their order
>                isn't important, so we canonicalize their order (the same
>                is true for ldml keywords within the ldml extension).
>
>                The model I have is as follows:
>
>                The user passes a Locale to a service, which (usually)
>                looks up a bundle, using the base fields language, script,
>                region, and variant (including subfields of variant).
>                Once a matching bundle is found, it's returned to the
>                service.  The appropriate extensions (in our particular
>                case, the ldml extension) are then interpreted by the
>                service when accessing the bundle; which extensions are
>                used depends on the service.  There's no real hierarchical
>                ordering to the ldml extensions; they're just different
>                customizations that apply to whatever services care about
>                them.  This is different from
>                language/script/region/variant, where there is (generally)
>                a useful hierarchical order.
>
>                Part of the idea here is that by handling extensions
>                separately, the service provider doesn't have to list
>                them; otherwise, if there's no canonical order, it would
>                need to list all permutations.  For example, you might
>                have keywords for collation, calendar, and number that
>                apply to several locales.  If there were no canonical
>                order you'd have to support all 16 permutations (zero,
>                one, two, or three keywords, in any order) for each such
>                locale.  This is rather a lot to list in
>                getAvailableLocales.  And this doesn't even involve the
>                values of the keywords.
>
>                Even if there is a canonical order, simple fallback
>                doesn't work for all services.  Say NumberFormat is passed
>                a locale with the extension "th-th-u-ca-foobar-nu-thai"
>                (calendar = foobar, numbers = thai).  There's no locale
>                matching that, so this falls back to "th-th-u-ca-foobar",
>                tossing the numbers extension on the floor.  The problem
>                appears when there is a bundle "th-th-u-nu-thai": the
>                (irrelevant, from NumberFormat's point of view) request
>                for the foobar calendar preempted the (relevant) request
>                for thai numbers, since it was canonicalized to a position
>                earlier in the language tag.  The exact opposite could
>                happen for DateFormat with calendar = japanese and
>                animal = foobar (as a hypothetical example).
>
>                This leads to each service having to manipulate the
>                extensions before looking up the bundle, to keep
>                irrelevant extensions out of the way.
>                This means each service is potentially seeing an entirely
>                different bundle for the 'same' locale.  If data used by
>                both services is different between the two bundles, this
>                might show up as an unwanted and unexpected side effect.
>
>                Considerations like these led us to want to perform lookup
>                using only the base locale, and let the services make use
>                of the extension data as they saw fit based on that same
>                bundle.
>
>                As for BCP47, it does have descriptions of how one might
>                match against a preferred language list, and also says
>                that particular implementations can perform lookup
>                ignoring extensions.  But this language from the spec is
>                generally in the context of matching a preferred language
>                list with wildcards, and so it's not clear how or if this
>                applies to examining a partially ordered collection of
>                locale resources.  I tend to think it does not directly
>                apply, and that the way we propose handling lookup is
>                conformant.  Mark of course may have a different opinion.
>
>                Doug
>
>                On Fri, Mar 13, 2009 at 1:11 PM, Naoto Sato
>                <Naoto.Sato at sun.com> wrote:
>
>                   So are you specifically talking about LDML extensions?
>                   In BCP 47, it's one of the subtags and the BCP does not
>                   give any special semantics to it (because it does not
>                   know for what it would be used).  So I thought the
>                   fallback would be:
>
>                   xx-yy-zz-ext
>                   xx-yy-zz
>                   xx-yy
>                   xx
>
>
>                   Thanks,
>                   Naoto
>
>                   Yoshito Umaoka wrote:
>
>                       Naoto Sato wrote:
>
>                           Umaoka-san,
>
>                           I don't think this is a compatibility issue,
>                           because the existing SPI implementations should
>                           still work compatibly with locales without
>                           extensions.  Possible issues would only arise
>                           with the new locales.
>
>                           BTW, current SPI implementation invocation
>                           already involves fallback itself.  That is, say
>                           the request locale is xx_YY_foo_bar, and one
>                           SPI provider implements xx_YY, then that
>                           provider's service is used.  So adding the
>                           extension fallback is not that ugly to me.
>
>                       Yes, I know the current fallback strategy.
>                       LDML extensions are designed for specifying optional
>                       behavior for a locale.  Therefore, as we described in
>                       the very first proposal, extensions are carried in
>                       each level.  More specifically, if a locale
>                       xx-yy-zzzz-u-cu-usd is requested, below is the
>                       candidate list.
>
>                       xx-yy-zzzz-u-cu-usd
>                       xx-yy-u-cu-usd
>                       xx-u-cu-usd
>
>                       If we need an "extensionless" version inserted, it
>                       becomes
>
>                       xx-yy-zzzz-u-cu-usd
>                       xx-yy-zzzz
>                       xx-yy-u-cu-usd
>                       xx-yy
>                       xx-u-cu-usd
>                       xx
>
>                       Don't you think it's somewhat ugly?
>
>                       -Yoshito
>
>
>
>                   --    Naoto Sato
>
>
>
>
>
>