RFR: 8309622: Re-examine the cache mechanism in BaseLocale [v3]

Naoto Sato naoto at openjdk.org
Fri Jun 16 17:22:04 UTC 2023


On Fri, 16 Jun 2023 12:46:01 GMT, Brett Okken <duke at openjdk.org> wrote:

>> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Addressing review comments
>
> src/java.base/share/classes/sun/util/locale/BaseLocale.java line 168:
> 
>> 166:         // BaseLocale as the key. The returned "normalized" instance
>> 167:         // can subsequently be used by the Locale instance which
>> 168:         // guarantees the locale components are properly cased/interned.
> 
> Is it truly important that the String values are interned, or is the desire simply that all refer to the same "canonical" String instance?
> my thought is that managing the canonical String instances could avoid WeakReferences and synchronized map access and just return a new BaseLocale each time.
> 
> 
> 
> 
>     public static BaseLocale getInstance(String language, String script,
>                                          String region, String variant) {
>         if (script == null || script.isEmpty()) {
>             script = "";
>         } else {
>             script = getCanonicalString(script, LocaleUtils::toTitleString);
>         }
> 
>         if (region == null || region.isEmpty()) {
>             region = "";
>         } else {
>             region = getCanonicalString(region, LocaleUtils::toUpperString);
>         }
> 
>         if (language == null || language.isEmpty()) {
>             language = "";
>         } else {
>             language = getCanonicalString(language, LocaleUtils::toLowerString);
>         }
>         
>         if (variant == null || variant.isEmpty()) {
>             variant = "";
>         } else {
>             variant = getCanonicalString(variant, Function.identity());
>         }
>         ...
>         return new BaseLocale(language, script, region, variant);
>     }
> 
> 
> 
>     private static final ConcurrentMap<String, String> CANONICALIZED_STRINGS = new ConcurrentHashMap<>();
> 
>     static {
>         // prime the old iso codes
>         CANONICALIZED_STRINGS.put("he", "he");
>         CANONICALIZED_STRINGS.put("iw", "iw");
>         CANONICALIZED_STRINGS.put("id", "id");
>         CANONICALIZED_STRINGS.put("in", "in");
>         CANONICALIZED_STRINGS.put("ji", "ji");
>         CANONICALIZED_STRINGS.put("yi", "yi");
>     }
>     
>     private static String getCanonicalString(String value, Function<String, String> conversion) {
>         String existing = CANONICALIZED_STRINGS.get(value);
>         if (existing != null) {
>             return existing;
>         }
> 
>         String converted = conversion.apply(value);
>         return CANONICALIZED_STRINGS.merge(converted, converted, (k,v) -> v);
>     }

Interning components in a Locale instance (through BaseLocale) is a long standing behavior. Changing it could break apps that might have relied on it (using `==` for comparison)

> src/java.base/share/classes/sun/util/locale/BaseLocale.java line 175:
> 
>> 173:                             LocaleUtils.toLowerString(b.getLanguage()).intern(),
>> 174:                             LocaleUtils.toTitleString(b.getScript()).intern(),
>> 175:                             LocaleUtils.toUpperString(b.getRegion()).intern(),
> 
> don't lines 147 and 148 above already guarantee the language and region have correct case?

147,148 are in `getInstance()` and the BaseLocale key here is created with a constructor, which simply copies the passed arguments. So normalization is needed here

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14404#discussion_r1232528918
PR Review Comment: https://git.openjdk.org/jdk/pull/14404#discussion_r1232528986


More information about the core-libs-dev mailing list