RFR (L) JDK-8195100: Use a low latency hashtable for SymbolTable
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Fri Aug 3 00:02:10 UTC 2018
On 8/2/18 5:20 PM, Ioi Lam wrote:
>
>
> On 8/2/18 11:19 AM, Gerard Ziemski wrote:
>>> On Aug 2, 2018, at 12:30 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>>
>>> On 8/2/18 1:25 PM, Ioi Lam wrote:
>>>>
>>>> On 8/2/18 10:08 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> On 8/2/18 12:26 PM, Ioi Lam wrote:
>>>>>>
>>>>>> On 8/2/18 8:44 AM, Gerard Ziemski wrote:
>>>>>>>> On Aug 2, 2018, at 10:40 AM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/2/18 8:26 AM, Gerard Ziemski wrote:
>>>>>>>>>> symbolTable.cpp:
>>>>>>>>>>
>>>>>>>>>> 342 // Otherwise, add the symbol to table. Copy to a C
>>>>>>>>>> string first.
>>>>>>>>>> 343 char stack_buf[ON_STACK_BUFFER_LENGTH];
>>>>>>>>>> 344 ResourceMark rm(THREAD);
>>>>>>>>>> 345 if (len <= ON_STACK_BUFFER_LENGTH) {
>>>>>>>>>> 346 buffer = stack_buf;
>>>>>>>>>> 347 } else {
>>>>>>>>>> 348 buffer = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD,
>>>>>>>>>> char, len);
>>>>>>>>>> 349 }
>>>>>>>>>> 350 for (int i=0; i<len; i++) {
>>>>>>>>>> 351 buffer[i] = name[i];
>>>>>>>>>> 352 }
>>>>>>>>>> 353 // Make sure there is no safepoint in the code above
>>>>>>>>>> since name can't move.
>>>>>>>>>> 354 // We can't include the code in NoSafepointVerifier
>>>>>>>>>> because of the
>>>>>>>>>> 355 // ResourceMark.
>>>>>>>>>>
>>>>>>>>>> The above is dead code and should be removed.
>>>>>>>>> That’s not dead code, but in fact you did find an issue here -
>>>>>>>>> we should be using the “buffer”, not “name” when we add to the
>>>>>>>>> table. Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Why do you need to make a copy here? buffer and name seems to
>>>>>>>> have the exact same contents.
>>>>>>> I didn’t write that code, but I assume the name comes in some
>>>>>>> temp memory that can go away at any time, so we need a solid copy.
>>>>>>>
>>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>> This function is called only from two places. In both cases, the
>>>>>> input Symbol won't go away
>>>>>>
>>>>>> Symbol* SignatureStream::as_symbol(TRAPS) {
>>>>>> // Create a symbol from for string _begin _end
>>>>>> int begin = _begin;
>>>>>> int end = _end;
>>>>>>
>>>>>> if ( _signature->byte_at(_begin) == 'L'
>>>>>> && _signature->byte_at(_end-1) == ';') {
>>>>>> begin++;
>>>>>> end--;
>>>>>> }
>>>>>>
>>>>>> // Save names for cleaning up reference count at the end of
>>>>>> // SignatureStream scope.
>>>>>> Symbol* name = SymbolTable::new_symbol(_signature, begin, end,
>>>>>> CHECK_NULL);
>>>>>> _names->push(name); // save new symbol for decrementing later
>>>>>> return name;
>>>>>> }
>>>>>>
>>>>>> Symbol* ClassVerifier::create_temporary_symbol(const Symbol *s,
>>>>>> int begin,
>>>>>> int end, TRAPS) {
>>>>>> Symbol* sym = SymbolTable::new_symbol(s, begin, end, CHECK_NULL);
>>>>>> _symbols->push(sym);
>>>>>> return sym;
>>>>>> }
>>>>>>
>>>>>> Also, the "name" was used just a few lines above in the lookup,
>>>>>> so if this was ever a issue, that lookup would fail as well.
>>>>>>
>>>>>> Mercurial history shows that this function has not had any
>>>>>> meaningful change since it was imported from SCCS in 2007, so I
>>>>>> think it's just a case of bit rot and the dead code can be safely
>>>>>> removed.
>>>>> I don't like the concept of one Symbol pointing into a substring
>>>>> of another Symbol. If the latter goes away, then the former will
>>>>> point to deallocated memory. Can we examine this outside this
>>>>> change to see if the lifetime of the substring Symbol is
>>>>> completely contained in the superstring Symbol?
>>>>>
>>>> The Symbol that's eventually created stores the characters inside
>>>> its body. It doesn't not "point to another Symbol". The buffer is
>>>> used only during creation time.
>>>>
>>>> There are several forms of SymbolTable::new_symbol() calls. Some of
>>>> them pass in an arbitrary "char*" anyway ... so there's no need to
>>>> treat SymbolTable::new_symbol(Symbol*, ...) specially.
>>> Oh, if this is the case, I'm fine with removing it then.
>> The function simplified very nicely into:
>>
>> Symbol* SymbolTable::lookup(const Symbol* sym, int begin, int end,
>> TRAPS) {
>> Symbol* found = NULL;
>> {
>> debug_only(NoSafepointVerifier nsv;)
>>
>> char* name = (char*)sym->base() + begin;
>> int len = end - begin;
>> unsigned int hash = hash_symbol(name, len, SymbolTable::_alt_hash);
>> found = SymbolTable::the_table()->do_add_if_needed(name, len,
>> hash, true, THREAD);
>> }
>> return found;
>> }
>>
>> I like it, thanks.
>>
>> New webrev at http://cr.openjdk.java.net/~gziemski/8195100_rev4
> Looks good! I think the extra braces and the local variable "found"
> can also be deleted.
>
> Also, I think the NoSafepointVerifier can be removed as well. "sym"
> used to live in the heap, so when using a pointer that points inside
> it, you want to make sure no safepoints are entered, or else the "sym"
> may be moved and "name" is no longer valid.
>
> However, not that Symbols are metaspace objects and don't move
> anymore, NoSafepointVerifier is no longer needed here.
>
> Coleen, could you confirm?
Yes, can confirm. It doesn't move anymore.
Coleen
>
>
> Thanks
> - Ioi
>
>>
>> cheers
>>
>
More information about the hotspot-runtime-dev
mailing list