RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable

Ioi Lam ioi.lam at oracle.com
Thu Aug 2 21:20:00 UTC 2018



On 8/2/18 11:19 AM, Gerard Ziemski wrote:
>> On Aug 2, 2018, at 12:30 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>>
>> On 8/2/18 1:25 PM, Ioi Lam wrote:
>>>
>>> On 8/2/18 10:08 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> On 8/2/18 12:26 PM, Ioi Lam wrote:
>>>>>
>>>>> On 8/2/18 8:44 AM, Gerard Ziemski wrote:
>>>>>>> On Aug 2, 2018, at 10:40 AM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/2/18 8:26 AM, Gerard Ziemski wrote:
>>>>>>>>> symbolTable.cpp:
>>>>>>>>>
>>>>>>>>>    342   // Otherwise, add the symbol to table. Copy to a C string first.
>>>>>>>>>    343   char stack_buf[ON_STACK_BUFFER_LENGTH];
>>>>>>>>>    344   ResourceMark rm(THREAD);
>>>>>>>>>    345   if (len <= ON_STACK_BUFFER_LENGTH) {
>>>>>>>>>    346     buffer = stack_buf;
>>>>>>>>>    347   } else {
>>>>>>>>>    348     buffer = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD, char, len);
>>>>>>>>>    349   }
>>>>>>>>>    350   for (int i=0; i<len; i++) {
>>>>>>>>>    351     buffer[i] = name[i];
>>>>>>>>>    352   }
>>>>>>>>>    353   // Make sure there is no safepoint in the code above since name can't move.
>>>>>>>>>    354   // We can't include the code in NoSafepointVerifier because of the
>>>>>>>>>    355   // ResourceMark.
>>>>>>>>>
>>>>>>>>> The above is dead code and should be removed.
>>>>>>>> That’s not dead code, but in fact you did find an issue here - we should be using the “buffer”, not “name” when we add to the table. Thanks!
>>>>>>>>
>>>>>>>>
>>>>>>> Why do you need to make a copy here? buffer and name seems to have the exact same contents.
>>>>>> I didn’t write that code, but I assume the name comes in some temp memory that can go away at any time, so we need a solid copy.
>>>>>>
>>>>>>
>>>>>> cheers
>>>>>>
>>>>> This function is called only from two places. In both cases, the input Symbol won't go away
>>>>>
>>>>> Symbol* SignatureStream::as_symbol(TRAPS) {
>>>>>    // Create a symbol from for string _begin _end
>>>>>    int begin = _begin;
>>>>>    int end   = _end;
>>>>>
>>>>>    if (   _signature->byte_at(_begin) == 'L'
>>>>>        && _signature->byte_at(_end-1) == ';') {
>>>>>      begin++;
>>>>>      end--;
>>>>>    }
>>>>>
>>>>>    // Save names for cleaning up reference count at the end of
>>>>>    // SignatureStream scope.
>>>>>    Symbol* name = SymbolTable::new_symbol(_signature, begin, end, CHECK_NULL);
>>>>>    _names->push(name);  // save new symbol for decrementing later
>>>>>    return name;
>>>>> }
>>>>>
>>>>> Symbol* ClassVerifier::create_temporary_symbol(const Symbol *s, int begin,
>>>>>                                                 int end, TRAPS) {
>>>>>    Symbol* sym = SymbolTable::new_symbol(s, begin, end, CHECK_NULL);
>>>>>    _symbols->push(sym);
>>>>>    return sym;
>>>>> }
>>>>>
>>>>> Also, the "name" was used just a few lines above in the lookup, so if this was ever a issue, that lookup would fail as well.
>>>>>
>>>>> Mercurial history shows that this function has not had any meaningful change since it was imported from SCCS in 2007, so I think it's just a case of bit rot and the dead code can be safely removed.
>>>> I don't like the concept of one Symbol pointing into a substring of another Symbol.  If the latter goes away, then the former will point to deallocated memory.   Can we examine this outside this change to see if the lifetime of the substring Symbol is completely contained in the superstring Symbol?
>>>>
>>> The Symbol that's eventually created stores the characters inside its body. It doesn't not "point to another Symbol". The buffer is used only during creation time.
>>>
>>> There are several forms of SymbolTable::new_symbol() calls. Some of them pass in an arbitrary "char*" anyway ... so there's no need to treat SymbolTable::new_symbol(Symbol*, ...) specially.
>> Oh, if this is the case, I'm fine with removing it then.
> The function simplified very nicely into:
>
> Symbol* SymbolTable::lookup(const Symbol* sym, int begin, int end, TRAPS) {
>   Symbol* found = NULL;
>   {
>     debug_only(NoSafepointVerifier nsv;)
>
>     char* name = (char*)sym->base() + begin;
>     int len = end - begin;
>     unsigned int hash = hash_symbol(name, len, SymbolTable::_alt_hash);
>     found = SymbolTable::the_table()->do_add_if_needed(name, len, hash, true, THREAD);
>   }
>   return found;
> }
>
> I like it, thanks.
>
> New webrev at http://cr.openjdk.java.net/~gziemski/8195100_rev4
Looks good! I think the extra braces and the local variable "found" can 
also be deleted.

Also, I think the NoSafepointVerifier can be removed as well. "sym" used 
to live in the heap, so when using a pointer that points inside it, you 
want to make sure no safepoints are entered, or else the "sym" may be 
moved and "name" is no longer valid.

However, not that Symbols are metaspace objects and don't move anymore, 
NoSafepointVerifier is no longer needed here.

Coleen, could you confirm?


Thanks
- Ioi

>
> cheers
>



More information about the hotspot-runtime-dev mailing list