RFR: 8254598: StringDedupTable should use OopStorage

Coleen Phillimore coleenp at openjdk.java.net
Tue Apr 27 22:51:10 UTC 2021


On Fri, 23 Apr 2021 19:48:47 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

> Please review this change to the String Deduplication facility.  It is being
> changed to use OopStorage to hold weak references to relevant objects,
> rather than bespoke data structures requiring dedicated processing phases by
> supporting GCs.
> 
> (The Shenandoah update was contributed by Zhengyu Gu.)
> 
> This change significantly simplifies the interface between a given GC and
> the String Deduplication facility, which should make it much easier for
> other GCs to opt in.  However, this change does not alter the set of GCs
> providing support; currently only G1 and Shenandoah support string
> deduplication.  Adding support by other GCs will be in followup RFEs.
> 
> Reviewing via the diffs might not be all that useful for some parts, as
> several files have been essentially completely replaced, and a number of
> files have been added or eliminated.  The full webrev might be a better
> place to look.
> 
> The major changes are in gc/shared/stringdedup/* and in the supporting
> collectors, but there are also some smaller changes in other places, most
> notably classfile/{stringTable,javaClasses}.
> 
> This change is additionally labeled for review by core-libs, although there
> are no source changes there.  This change injects a byte field of bits into
> java.lang.String, using one of the previously unused padding bytes in that
> class.  (There were two unused bytes, now only one.)
> 
> Testing:
> mach5 tier1-2 with and without -XX:+UseStringDeduplication
> 
> Locally (linux-x64) ran all of the existing tests that use string
> deduplication with both G1 and Shenandoah.  Note that
> TestStringDeduplicationInterned.java is disabled for shenandoah, as it
> currently fails, for reasons I haven't figured out but suspect are test
> related.
> 
> - gc/stringdedup/   -- these used to be in gc/g1
> - runtime/cds/SharedStringsDedup.java
> - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java
> - runtime/cds/appcds/sharedStrings/*
> 
> shenandoah-only:
> - gc/shenandoah/jvmti/TestHeapDump.java
> - gc/shenandoah/TestStringDedup.java
> - gc/shenandoah/TestStringDedupStress.java
> 
> Performance tested baseline, baseline + stringdedup, new with stringdedup,
> finding no significant differences.

I looked at the runtime code, which looks fine. I didn't read the GC code.

src/hotspot/share/classfile/javaClasses.inline.hpp line 77:

> 75: 
> 76: uint8_t* java_lang_String::flags_addr(oop java_string) {
> 77:   assert(_initialized, "Mut be initialized");

typo: must

src/hotspot/share/classfile/stringTable.cpp line 784:

> 782:   SharedStringIterator iter(f);
> 783:   _shared_table.iterate(&iter);
> 784: }

So with this change (somewhere below) do you no longer deduplicate strings from the shared string table? ie

// The CDS archive does not include the string deduplication table. Only the string
// table is saved in the archive. The shared strings from CDS archive need to be
// added to the string deduplication table before deduplication occurs. That is
// done in the beginning of the StringDedupThread (see StringDedupThread::do_deduplication()).
void StringDedupThread::deduplicate_shared_strings(StringDedupStat* stat) {
  StringDedupSharedClosure sharedStringDedup(stat);
  StringTable::shared_oops_do(&sharedStringDedup);
}
Asking @iklam to have a look then.

src/hotspot/share/runtime/globals.hpp line 1994:

> 1992:                                                                             \
> 1993:   product(uint64_t, StringDeduplicationHashSeed, 0, DIAGNOSTIC,             \
> 1994:           "Seed for the table hashing function; 0 requests computed seed")  \

Should these two really be develop() options?

src/hotspot/share/runtime/mutexLocker.cpp line 239:

> 237:     def(StringDedupQueue_lock      , PaddedMonitor, leaf,        true,  _safepoint_check_never);
> 238:     def(StringDedupTable_lock      , PaddedMutex  , leaf + 1,    true,  _safepoint_check_never);
> 239:   }

Why weren't these duplicate definitions?  This is disturbing.

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3662


More information about the core-libs-dev mailing list