RFR: 8309622: Re-examine the cache mechanism in BaseLocale [v3]
Brett Okken
duke at openjdk.org
Fri Jun 16 12:49:06 UTC 2023
On Tue, 13 Jun 2023 17:56:57 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> This is stemming from the PR: https://github.com/openjdk/jdk/pull/14211 where aggressive GC can cause NPE in `BaseLocale$Key` class. I refactored the in-house cache with WeakHashMap, and removed the Key class as it is no longer needed (thus the original NPE will no longer be possible). Also with the new JMH test case, it gains some performance improvement:
>>
>> (w/o fix)
>>
>> Benchmark Mode Cnt Score Error Units
>> LocaleCache.testForLanguageTag avgt 20 5781.275 ± 569.580 ns/op
>> LocaleCache.testLocaleOf avgt 20 62564.079 ± 406.697 ns/op
>>
>> (w/ fix)
>> Benchmark Mode Cnt Score Error Units
>> LocaleCache.testForLanguageTag avgt 20 4801.175 ± 371.830 ns/op
>> LocaleCache.testLocaleOf avgt 20 60394.652 ± 352.471 ns/op
>
> Naoto Sato has updated the pull request incrementally with one additional commit since the last revision:
>
> Addressing review comments
src/java.base/share/classes/sun/util/locale/BaseLocale.java line 168:
> 166: // BaseLocale as the key. The returned "normalized" instance
> 167: // can subsequently be used by the Locale instance which
> 168: // guarantees the locale components are properly cased/interned.
Is it truly important that the String values are interned, or is the desire simply that all refer to the same "canonical" String instance?
my thought is that managing the canonical String instances could avoid WeakReferences and synchronized map access and just return a new BaseLocale each time.
public static BaseLocale getInstance(String language, String script,
String region, String variant) {
if (script == null) {
script = "";
} else {
script = getCanonicalString(script, LocaleUtils::toTitleString);
}
if (region == null) {
region = "";
} else {
region = getCanonicalString(region, LocaleUtils::toUpperString);
}
if (language == null) {
language = "";
} else {
language = getCanonicalString(language, LocaleUtils::toLowerString);
}
if (variant == null) {
variant = "";
} else {
variant = getCanonicalString(variant, Function.identity());
}
...
return new BaseLocale(language, script, region, variant);
}
private static final ConcurrentMap<String, String> CANONICALIZED_STRINGS = new ConcurrentHashMap<>();
static {
// prime the old iso codes
CANONICALIZED_STRINGS.put("he", "he");
CANONICALIZED_STRINGS.put("iw", "iw");
CANONICALIZED_STRINGS.put("id", "id");
CANONICALIZED_STRINGS.put("in", "in");
CANONICALIZED_STRINGS.put("ji", "ji");
CANONICALIZED_STRINGS.put("yi", "yi");
}
private static String getCanonicalString(String value, Function<String, String> conversion) {
String existing = CANONICALIZED_STRINGS.get(value);
if (existing != null) {
return existing;
}
String converted = conversion.apply(value);
return CANONICALIZED_STRINGS.merge(converted, converted, (k,v) -> v);
}
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14404#discussion_r1232205527
More information about the hotspot-gc-dev
mailing list