RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable

Gerard Ziemski gerard.ziemski at oracle.com
Thu Aug 2 15:26:53 UTC 2018


Thank you Ioi for taking a look, very much appreciated!

New webrev at http://cr.openjdk.java.net/~gziemski/8195100_rev3


> On Aug 1, 2018, at 4:27 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
> 
> Hi Gerard,
> 
> Have you tried running HelloWorld to get a sense of the start-up impact? E.g.:
> 
>     perf stat -r 100 ./old/bin/java -cp . HelloWorld > /dev/null
>     perf stat -r 100 ./new/bin/java -cp . HelloWorld > /dev/null

I have benchmarked it using our internal dev-submit (details in JDK-8208142), which shows up to 5% regression in simple startup benchmarks, but have not used Linux “perf”, and only briefly used Xcode’s “Instruments” to analyze it. I will be doing all of the performance work as part of JDK-8208142 (which I already started to work on)


> 
> I've looked at all the files except for the GC files. Here are my comments:
> 
> 
> symbolTable.hpp:
> 
>   87 // The symbol table holds all Symbol*s and corresponding interned strings.
>   88 // Symbol*s and literal strings should be canonicalized.
>   89 //
>   90 // The interned strings are created lazily.
>   91 //
>   92 // %note:
>   93 //  - symbolTableEntrys are allocated in blocks to reduce the space overhead.
>   94 template <class T, class N> class CompactHashtable;
>   95 class CompactSymbolTableWriter;
>   96 class SerializeClosure;
>   97
>   98 class SymbolTable;
>   99 class SymbolTableConfig;
>  100 typedef ConcurrentHashTable<Symbol*,
>  101                               SymbolTableConfig, mtSymbol> SymbolTableHash;
>  102
>  103 class SymbolTableCreateEntry;
>  104
>  105 class SymbolTable : public CHeapObj<mtSymbol> {
> 
> Line 98 can be deleted.

Done.

> 
> Lines 87~93 seem to be out of date and should be deleted:
>     - The fact that SymbolTable holds symbols should be obvious. Also, it
>       no longer holds all the symbols -- there's also the shared symbol table.
>     - Interned strings are not stored here.
>     - There's no definition of what "canonicalized" is.
>     - symbolTableEntrys no longer exists.

Done.

> 
> symbolTable.cpp:
> 
>  342   // Otherwise, add the symbol to table. Copy to a C string first.
>  343   char stack_buf[ON_STACK_BUFFER_LENGTH];
>  344   ResourceMark rm(THREAD);
>  345   if (len <= ON_STACK_BUFFER_LENGTH) {
>  346     buffer = stack_buf;
>  347   } else {
>  348     buffer = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD, char, len);
>  349   }
>  350   for (int i=0; i<len; i++) {
>  351     buffer[i] = name[i];
>  352   }
>  353   // Make sure there is no safepoint in the code above since name can't move.
>  354   // We can't include the code in NoSafepointVerifier because of the
>  355   // ResourceMark.
> 
> The above is dead code and should be removed.

That’s not dead code, but in fact you did find an issue here - we should be using the “buffer”, not “name” when we add to the table. Thanks!


> 
>  429 // Suggestion: Push unicode-based lookup all the way into the hashing
>  430 // and probing logic, so there is no need for convert_to_utf8 until
>  431 // an actual new Symbol* is created.
>  432 Symbol* SymbolTable::lookup_unicode(const jchar* name, int utf16_length, TRAPS) {
> 
> Do you have statistics of how often this function is called? I suspect it's
> pretty low for common apps.
> 
>  311 Symbol* SymbolTable::lookup(const char* name, int len, TRAPS) {
>  312   unsigned int hash = hash_symbol(name, len, SymbolTable::_alt_hash);
>  313   Symbol* sym = SymbolTable::the_table()->lookup_common(name, len, hash);
>  314   if (NULL == sym) {
>  315     sym = SymbolTable::the_table()->do_add_if_needed((char*)name, len, hash, true, CHECK_NULL);
>  316   }
>  ...
>  463 void SymbolTable::add(ClassLoaderData* loader_data, const constantPoolHandle& cp,
>  464                       int names_count, const char** names, int* lengths,
>  465                       int* cp_indices, unsigned int* hashValues, TRAPS) {
>  ...
>  471     Symbol* sym = SymbolTable::the_table()->lookup_common(name, len, hash);
>  472     if (NULL == sym) {
>  473       sym = SymbolTable::the_table()->do_add_if_needed(name, len, hash, c_heap, CHECK);
>  474     }
> 
> 
> Is it possible to combine the lookup_common and do_add_if_needed into a single operation,
> so you can avoid doing the search twice?

Very possible, and in fact I have already done this as part as JDK-8208142.

> 
>  647 // Sharing
>  648 #if INCLUDE_CDS
> 
> 647 can be deleted as it's superfluous.
> 

Done.

Thank you!


> 
> Thanks
> - Ioi
> 
> 
> 
> 
> On 7/30/18 9:19 AM, Gerard Ziemski wrote:
>> Please review this Enhancement, which uses the new concurrent hash table for SymbolTable.
>> 
>> This is an effort similar to the one behind JDK-8195097 "Make it possible to process StringTable outside safepoint” from a while ago.
>> 
>> The main expected goal here is to eliminate safepoint pauses needed to cleanup the table. This goal was achieved by using “Service Thread” to do the cleaning. Checking whether we need to clean is performed on “VM Thread”, after class unloading (we check the entire table). We also check the bucket into which we happen to be inserting a new item.
>> 
>> A few things to note:
>> 
>> - The SymbolTable implementation follows closely that of StringTable - we might be able to factor out common code later
>> 
>> - There are a few small, but statistically significant, regressions in startup benchmarks (around 2-5%) that will be addressed later, tracked by JDK-8208142
>> 
>> - There is an outstanding question about whether we can safely walk the table during a safepoint using do_scan, but without locking, tracked by JDK-8208462
>> 
>> - There is a cleanup opportunity presented now to remove rehashable hash table, tracked by JDK-8208519
>> 
>> - There is a new test that validates that we free dead entries when we insert new symbols (using short lived symbols via Class.forName() API)
>> 
>> 
>> Tested using Mach5 hs-tier1,2,3,4,5 (final test running right now…)
>> 
>> Webrev: http://cr.openjdk.java.net/~gziemski/8195100_rev1/
>> 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8195100
>> 
>> 
>> Cheers
>> 
> 



More information about the hotspot-runtime-dev mailing list