RFR: 8254598: StringDedupTable should use OopStorage

Ioi Lam iklam at openjdk.java.net
Wed Apr 28 00:53:53 UTC 2021


On Fri, 23 Apr 2021 19:48:47 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to the String Deduplication facility.  It is being
> changed to use OopStorage to hold weak references to relevant objects,
> rather than bespoke data structures requiring dedicated processing phases by
> supporting GCs.
> 
> (The Shenandoah update was contributed by Zhengyu Gu.)
> 
> This change significantly simplifies the interface between a given GC and
> the String Deduplication facility, which should make it much easier for
> other GCs to opt in.  However, this change does not alter the set of GCs
> providing support; currently only G1 and Shenandoah support string
> deduplication.  Adding support by other GCs will be in followup RFEs.
> 
> Reviewing via the diffs might not be all that useful for some parts, as
> several files have been essentially completely replaced, and a number of
> files have been added or eliminated.  The full webrev might be a better
> place to look.
> 
> The major changes are in gc/shared/stringdedup/* and in the supporting
> collectors, but there are also some smaller changes in other places, most
> notably classfile/{stringTable,javaClasses}.
> 
> This change is additionally labeled for review by core-libs, although there
> are no source changes there.  This change injects a byte field of bits into
> java.lang.String, using one of the previously unused padding bytes in that
> class.  (There were two unused bytes, now only one.)
> 
> Testing:
> mach5 tier1-2 with and without -XX:+UseStringDeduplication
> 
> Locally (linux-x64) ran all of the existing tests that use string
> deduplication with both G1 and Shenandoah.  Note that
> TestStringDeduplicationInterned.java is disabled for shenandoah, as it
> currently fails, for reasons I haven't figured out but suspect are test
> related.
> 
> - gc/stringdedup/   -- these used to be in gc/g1
> - runtime/cds/SharedStringsDedup.java
> - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java
> - runtime/cds/appcds/sharedStrings/*
> 
> shenandoah-only:
> - gc/shenandoah/jvmti/TestHeapDump.java
> - gc/shenandoah/TestStringDedup.java
> - gc/shenandoah/TestStringDedupStress.java
> 
> Performance tested baseline, baseline + stringdedup, new with stringdedup,
> finding no significant differences.

Changes requested by iklam (Reviewer).

src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp line 557:

> 555:     // non-latin1, and deduplicating if we find a match.  For deduplication we
> 556:     // only care if the arrays consist of the same sequence of bytes.
> 557:     const jchar* chars = static_cast<jchar*>(value->base(T_CHAR));

The encoding of the shared strings always match CompactStrings. Otherwise the CDS archive will fail to map. E.g.,


$ java -XX:-CompactStrings -Xshare:dump 
$ java -XX:+CompactStrings -Xlog:cds -version
...
[0.032s][info][cds] UseSharedSpaces: The shared archive file's CompactStrings 
                    setting (disabled) does not equal the current CompactStrings setting (enabled).
[0.032s][info][cds] UseSharedSpaces: Unable to map shared spaces
...


So you can avoid this `if` block when `CompactStrings == true`. The `!java_lang_String::is_latin1(found)` check below can be changed to an assert.

Also, I think we need an explicit test case where you'd return `true` at line 564. I can write a new test case and email to you. I think it will involve dumping an archive with `-XX:-CompactStrings`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3662


More information about the core-libs-dev mailing list