RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Jul 19 02:50:17 UTC 2018



On 7/18/18 6:35 PM, Ioi Lam wrote:
>
>
> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>> Hi Coleen,
>>>
>>> The changes look good! The new operations on _length_and_refcount 
>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>
>> Yes, this makes more sense to me.
>>>
>>> symbolTable.cpp:
>>>
>>>  SymbolTable::lookup_dynamic() {
>>>  ...
>>>  214       Symbol* sym = e->literal();
>>>  215       if (sym->equals(name, len) && 
>>> sym->try_increment_refcount()) {
>>>  216         // something is referencing this symbol now.
>>>  217         return sym;
>>>  218       }
>>>
>>>
>>> symbol.cpp:
>>>
>>>  221 void Symbol::increment_refcount() {
>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>  223     if (!try_increment_refcount()) {
>>>  224 #ifdef ASSERT
>>>  225       print();
>>>  226 #endif
>>>  227       fatal("refcount has gone to zero");
>>>  228     }
>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>  230   }
>>>  231 }
>>>
>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>  247 bool Symbol::try_increment_refcount() {
>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>  249   int refc = extract_refcount(old_value);
>>>  250
>>>  251   if (refc == PERM_REFCOUNT) {
>>>  252     return true;
>>>  253   } else if (refc == 0) {
>>>  254     return false; // effectively dead, can't revive
>>>  255   }
>>>  256
>>>  257   uint32_t now;
>>>  258   while ((now = Atomic::cmpxchg(old_value + 1, 
>>> &_length_and_refcount, old_value)) != old_value) {
>>>  259     // failed to increment, check refcount again.
>>>  260     refc = extract_refcount(now);
>>>  261     if (refc == 0) {
>>>  262       return false; // just died
>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>  264       return true; // just became permanent
>>>  265     }
>>>  266     old_value = now; // refcount changed, try again
>>>  267   }
>>>  268   return true;
>>>  269 }
>>>
>>>
>>> So is it valid for Symbol::try_increment_refcount() to return false? 
>>> SymbolTable::lookup_dynamic() seems to suggest YES, but 
>>> Symbol::increment_refcount() seems to suggest NO.
>>
>> True.  If you are looking up a symbol and someone other thread has 
>> decremented the refcount to zero, this symbol should not be 
>> returned.  My test exercises this code even without the concurrent 
>> hashtable.  When the hashtable is concurrent, a zero-ed Symbol could 
>> be deallocated so we don't want to return it.
>>
> I think the following should be added as a comment in 
> increment_refcount().
>> In the case where you call increment_refcount() not during lookup, it 
>> is assumed that you have a symbol with a non-zero refcount and it 
>> can't go away while you are holding it.

Ok, added.
>
>>>
>>> If it's always an invalid condition, I think the fatal() should be 
>>> moved inside try_increment_refcount.
>>>
>>
>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>> Otherwise, I think you need to add comments in all 3 places, to say 
>>> when it's possible to get a 0 refcount, and when it's not. And, it 
>>> might be worth expanding on why "zero is bad" :-)
>>
>> How about this comment to try_increment_refcount:
>>
>> // Increment refcount while checking for zero.  If the Symbol's 
>> refcount becomes zero
>> // a thread could be concurrently removing the Symbol.  This is used 
>> during SymbolTable
>> // lookup to avoid reviving a dead Symbol.
> Sounds good.

Thanks, Ioi.
Coleen
>
> Thanks
> - Ioi
>
>>>
>>> My guess is:
>>> + if you're doing a lookup, you might be seeing Symbols that have 
>>> already been marked for deletion, which is indicated by a 0 
>>> refcount. You want to skip such Symbols.
>>>
>>> + if you're incrementing the refcount, that means you're holding a 
>>> valid Symbol, which means this Symbol should have never been marked 
>>> for deletion.
>>>
>>> Is this correct?
>>
>> Yes, both true.
>>
>> Thanks,
>> Coleen
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Gerard, thank you for the code review.
>>>>
>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>> Thank you Coleen (and Kim)!
>>>>>
>>>>> #1 Need copyright year updates:
>>>>>
>>>>> src/hotspot/share/oops/symbol.cpp
>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>
>>>> Yes, I'll update with my commit.
>>>>>
>>>>> #2 What’s the purpose of this code in 
>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>
>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>
>>>>> when we have:
>>>>>
>>>>>   117   enum {
>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>   120   };
>>>>>
>>>>> Wouldn’t that always be true?  Is it to make sure that nobody 
>>>>> changes max_symbol_length, because the implementation needs it to 
>>>>> be that? If so, should we add comment to:
>>>>>
>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>
>>>>> with a big warning of some sorts?
>>>>
>>>> Yes, it's so that we can store the length of the symbol into 16 bits.
>>>>
>>>> How I change the comment above max_symbol_length from:
>>>>
>>>>     // max_symbol_length is constrained by type of _length
>>>>
>>>> to
>>>>
>>>>     // max_symbol_length must fit into the top 16 bits of 
>>>> _length_and_refcount
>>>>
>>>>>
>>>>> #3 If we have:
>>>>>
>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>
>>>>> then why not
>>>>>
>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>>
>>>>> or
>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>
>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>>
>>>>> #4 We have:
>>>>>
>>>>>   221 void Symbol::increment_refcount() {
>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>   223     if (!try_increment_refcount()) {
>>>>>   224 #ifdef ASSERT
>>>>>   225       print();
>>>>>   226 #endif
>>>>>   227       fatal("refcount has gone to zero");
>>>>>
>>>>> but
>>>>>
>>>>>   233 void Symbol::decrement_refcount() {
>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>   235     int new_value = Atomic::sub((uint32_t)1, 
>>>>> &_length_and_refcount);
>>>>>   236 #ifdef ASSERT
>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>   239       print();
>>>>>   240       fatal("refcount underflow");
>>>>>   241     }
>>>>>   242 #endif
>>>>>
>>>>> Where the line:
>>>>>
>>>>>   240       fatal("refcount underflow”);
>>>>>
>>>>> is inside #ifdef ASSERT, but:
>>>>>
>>>>> 227       fatal("refcount has gone to zero”);
>>>>>
>>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>>
>>>>
>>>> I was thought that looked strange too.  I'll move the #endif from 
>>>> 226 to after 227.
>>>>
>>>> Thank you for reviewing the code!
>>>> Coleen
>>>>
>>>>> cheers
>>>>>
>>>>>
>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and 
>>>>>> pack refcount and length into an int.
>>>>>>
>>>>>> This is a precurser change to the concurrent SymbolTable change. 
>>>>>> Zeroed refcounted entries can be deleted at anytime so they 
>>>>>> cannot be allowed to be zero in runtime code. Thanks to Kim for 
>>>>>> writing the packing function and helping me avoid undefined 
>>>>>> behavior.
>>>>>>
>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>
>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including 
>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list