RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Fri Aug 3 00:02:10 UTC 2018



On 8/2/18 5:20 PM, Ioi Lam wrote:
>
>
> On 8/2/18 11:19 AM, Gerard Ziemski wrote:
>>> On Aug 2, 2018, at 12:30 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>>
>>> On 8/2/18 1:25 PM, Ioi Lam wrote:
>>>>
>>>> On 8/2/18 10:08 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> On 8/2/18 12:26 PM, Ioi Lam wrote:
>>>>>>
>>>>>> On 8/2/18 8:44 AM, Gerard Ziemski wrote:
>>>>>>>> On Aug 2, 2018, at 10:40 AM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/2/18 8:26 AM, Gerard Ziemski wrote:
>>>>>>>>>> symbolTable.cpp:
>>>>>>>>>>
>>>>>>>>>>    342   // Otherwise, add the symbol to table. Copy to a C 
>>>>>>>>>> string first.
>>>>>>>>>>    343   char stack_buf[ON_STACK_BUFFER_LENGTH];
>>>>>>>>>>    344   ResourceMark rm(THREAD);
>>>>>>>>>>    345   if (len <= ON_STACK_BUFFER_LENGTH) {
>>>>>>>>>>    346     buffer = stack_buf;
>>>>>>>>>>    347   } else {
>>>>>>>>>>    348     buffer = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD, 
>>>>>>>>>> char, len);
>>>>>>>>>>    349   }
>>>>>>>>>>    350   for (int i=0; i<len; i++) {
>>>>>>>>>>    351     buffer[i] = name[i];
>>>>>>>>>>    352   }
>>>>>>>>>>    353   // Make sure there is no safepoint in the code above 
>>>>>>>>>> since name can't move.
>>>>>>>>>>    354   // We can't include the code in NoSafepointVerifier 
>>>>>>>>>> because of the
>>>>>>>>>>    355   // ResourceMark.
>>>>>>>>>>
>>>>>>>>>> The above is dead code and should be removed.
>>>>>>>>> That’s not dead code, but in fact you did find an issue here - 
>>>>>>>>> we should be using the “buffer”, not “name” when we add to the 
>>>>>>>>> table. Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Why do you need to make a copy here? buffer and name seems to 
>>>>>>>> have the exact same contents.
>>>>>>> I didn’t write that code, but I assume the name comes in some 
>>>>>>> temp memory that can go away at any time, so we need a solid copy.
>>>>>>>
>>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>> This function is called only from two places. In both cases, the 
>>>>>> input Symbol won't go away
>>>>>>
>>>>>> Symbol* SignatureStream::as_symbol(TRAPS) {
>>>>>>    // Create a symbol from for string _begin _end
>>>>>>    int begin = _begin;
>>>>>>    int end   = _end;
>>>>>>
>>>>>>    if (   _signature->byte_at(_begin) == 'L'
>>>>>>        && _signature->byte_at(_end-1) == ';') {
>>>>>>      begin++;
>>>>>>      end--;
>>>>>>    }
>>>>>>
>>>>>>    // Save names for cleaning up reference count at the end of
>>>>>>    // SignatureStream scope.
>>>>>>    Symbol* name = SymbolTable::new_symbol(_signature, begin, end, 
>>>>>> CHECK_NULL);
>>>>>>    _names->push(name);  // save new symbol for decrementing later
>>>>>>    return name;
>>>>>> }
>>>>>>
>>>>>> Symbol* ClassVerifier::create_temporary_symbol(const Symbol *s, 
>>>>>> int begin,
>>>>>>                                                 int end, TRAPS) {
>>>>>>    Symbol* sym = SymbolTable::new_symbol(s, begin, end, CHECK_NULL);
>>>>>>    _symbols->push(sym);
>>>>>>    return sym;
>>>>>> }
>>>>>>
>>>>>> Also, the "name" was used just a few lines above in the lookup, 
>>>>>> so if this was ever a issue, that lookup would fail as well.
>>>>>>
>>>>>> Mercurial history shows that this function has not had any 
>>>>>> meaningful change since it was imported from SCCS in 2007, so I 
>>>>>> think it's just a case of bit rot and the dead code can be safely 
>>>>>> removed.
>>>>> I don't like the concept of one Symbol pointing into a substring 
>>>>> of another Symbol.  If the latter goes away, then the former will 
>>>>> point to deallocated memory.   Can we examine this outside this 
>>>>> change to see if the lifetime of the substring Symbol is 
>>>>> completely contained in the superstring Symbol?
>>>>>
>>>> The Symbol that's eventually created stores the characters inside 
>>>> its body. It doesn't not "point to another Symbol". The buffer is 
>>>> used only during creation time.
>>>>
>>>> There are several forms of SymbolTable::new_symbol() calls. Some of 
>>>> them pass in an arbitrary "char*" anyway ... so there's no need to 
>>>> treat SymbolTable::new_symbol(Symbol*, ...) specially.
>>> Oh, if this is the case, I'm fine with removing it then.
>> The function simplified very nicely into:
>>
>> Symbol* SymbolTable::lookup(const Symbol* sym, int begin, int end, 
>> TRAPS) {
>>   Symbol* found = NULL;
>>   {
>>     debug_only(NoSafepointVerifier nsv;)
>>
>>     char* name = (char*)sym->base() + begin;
>>     int len = end - begin;
>>     unsigned int hash = hash_symbol(name, len, SymbolTable::_alt_hash);
>>     found = SymbolTable::the_table()->do_add_if_needed(name, len, 
>> hash, true, THREAD);
>>   }
>>   return found;
>> }
>>
>> I like it, thanks.
>>
>> New webrev at http://cr.openjdk.java.net/~gziemski/8195100_rev4
> Looks good! I think the extra braces and the local variable "found" 
> can also be deleted.
>
> Also, I think the NoSafepointVerifier can be removed as well. "sym" 
> used to live in the heap, so when using a pointer that points inside 
> it, you want to make sure no safepoints are entered, or else the "sym" 
> may be moved and "name" is no longer valid.
>
> However, not that Symbols are metaspace objects and don't move 
> anymore, NoSafepointVerifier is no longer needed here.
>
> Coleen, could you confirm?

Yes, can confirm. It doesn't move anymore.
Coleen
>
>
> Thanks
> - Ioi
>
>>
>> cheers
>>
>



More information about the hotspot-runtime-dev mailing list