RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Jul 19 19:09:40 UTC 2018



On 7/19/18 3:07 PM, Ioi Lam wrote:
> Looks good!

Thanks, Ioi!
Coleen
>
> Thanks
>
> - Ioi
>
>
> On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote:
>> Please review the revision to this change.   Summary:
>>
>> * made decrement_refcount() use CAS loop.
>> * fixed duplicated logic in try_increment_refcount() thanks to Kim
>> * added gtest case for decrement_refcount.
>> * fixed SA code.
>> * added a bunch of comments
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev
>>
>> Retested with hs-tier1-3.
>> Thanks,
>> Coleen
>>
>> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/18/18 6:35 PM, Ioi Lam wrote:
>>>>
>>>>
>>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>>
>>>>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> The changes look good! The new operations on _length_and_refcount 
>>>>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>>>>
>>>>> Yes, this makes more sense to me.
>>>>>>
>>>>>> symbolTable.cpp:
>>>>>>
>>>>>>  SymbolTable::lookup_dynamic() {
>>>>>>  ...
>>>>>>  214       Symbol* sym = e->literal();
>>>>>>  215       if (sym->equals(name, len) && 
>>>>>> sym->try_increment_refcount()) {
>>>>>>  216         // something is referencing this symbol now.
>>>>>>  217         return sym;
>>>>>>  218       }
>>>>>>
>>>>>>
>>>>>> symbol.cpp:
>>>>>>
>>>>>>  221 void Symbol::increment_refcount() {
>>>>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>  223     if (!try_increment_refcount()) {
>>>>>>  224 #ifdef ASSERT
>>>>>>  225       print();
>>>>>>  226 #endif
>>>>>>  227       fatal("refcount has gone to zero");
>>>>>>  228     }
>>>>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>>>>  230   }
>>>>>>  231 }
>>>>>>
>>>>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>>>>  247 bool Symbol::try_increment_refcount() {
>>>>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>>>>  249   int refc = extract_refcount(old_value);
>>>>>>  250
>>>>>>  251   if (refc == PERM_REFCOUNT) {
>>>>>>  252     return true;
>>>>>>  253   } else if (refc == 0) {
>>>>>>  254     return false; // effectively dead, can't revive
>>>>>>  255   }
>>>>>>  256
>>>>>>  257   uint32_t now;
>>>>>>  258   while ((now = Atomic::cmpxchg(old_value + 1, 
>>>>>> &_length_and_refcount, old_value)) != old_value) {
>>>>>>  259     // failed to increment, check refcount again.
>>>>>>  260     refc = extract_refcount(now);
>>>>>>  261     if (refc == 0) {
>>>>>>  262       return false; // just died
>>>>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>>>>  264       return true; // just became permanent
>>>>>>  265     }
>>>>>>  266     old_value = now; // refcount changed, try again
>>>>>>  267   }
>>>>>>  268   return true;
>>>>>>  269 }
>>>>>>
>>>>>>
>>>>>> So is it valid for Symbol::try_increment_refcount() to return 
>>>>>> false? SymbolTable::lookup_dynamic() seems to suggest YES, but 
>>>>>> Symbol::increment_refcount() seems to suggest NO.
>>>>>
>>>>> True.  If you are looking up a symbol and someone other thread has 
>>>>> decremented the refcount to zero, this symbol should not be 
>>>>> returned.  My test exercises this code even without the concurrent 
>>>>> hashtable.  When the hashtable is concurrent, a zero-ed Symbol 
>>>>> could be deallocated so we don't want to return it.
>>>>>
>>>> I think the following should be added as a comment in 
>>>> increment_refcount().
>>>>> In the case where you call increment_refcount() not during lookup, 
>>>>> it is assumed that you have a symbol with a non-zero refcount and 
>>>>> it can't go away while you are holding it.
>>>
>>> Ok, added.
>>>>
>>>>>>
>>>>>> If it's always an invalid condition, I think the fatal() should 
>>>>>> be moved inside try_increment_refcount.
>>>>>>
>>>>>
>>>>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>>>>> Otherwise, I think you need to add comments in all 3 places, to 
>>>>>> say when it's possible to get a 0 refcount, and when it's not. 
>>>>>> And, it might be worth expanding on why "zero is bad" :-)
>>>>>
>>>>> How about this comment to try_increment_refcount:
>>>>>
>>>>> // Increment refcount while checking for zero.  If the Symbol's 
>>>>> refcount becomes zero
>>>>> // a thread could be concurrently removing the Symbol. This is 
>>>>> used during SymbolTable
>>>>> // lookup to avoid reviving a dead Symbol.
>>>> Sounds good.
>>>
>>> Thanks, Ioi.
>>> Coleen
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>>>
>>>>>> My guess is:
>>>>>> + if you're doing a lookup, you might be seeing Symbols that have 
>>>>>> already been marked for deletion, which is indicated by a 0 
>>>>>> refcount. You want to skip such Symbols.
>>>>>>
>>>>>> + if you're incrementing the refcount, that means you're holding 
>>>>>> a valid Symbol, which means this Symbol should have never been 
>>>>>> marked for deletion.
>>>>>>
>>>>>> Is this correct?
>>>>>
>>>>> Yes, both true.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>>
>>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Gerard, thank you for the code review.
>>>>>>>
>>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>>>>> Thank you Coleen (and Kim)!
>>>>>>>>
>>>>>>>> #1 Need copyright year updates:
>>>>>>>>
>>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>>>>
>>>>>>> Yes, I'll update with my commit.
>>>>>>>>
>>>>>>>> #2 What’s the purpose of this code in 
>>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>>
>>>>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>>>>
>>>>>>>> when we have:
>>>>>>>>
>>>>>>>>   117   enum {
>>>>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>>   120   };
>>>>>>>>
>>>>>>>> Wouldn’t that always be true?  Is it to make sure that nobody 
>>>>>>>> changes max_symbol_length, because the implementation needs it 
>>>>>>>> to be that? If so, should we add comment to:
>>>>>>>>
>>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>>
>>>>>>>> with a big warning of some sorts?
>>>>>>>
>>>>>>> Yes, it's so that we can store the length of the symbol into 16 
>>>>>>> bits.
>>>>>>>
>>>>>>> How I change the comment above max_symbol_length from:
>>>>>>>
>>>>>>>     // max_symbol_length is constrained by type of _length
>>>>>>>
>>>>>>> to
>>>>>>>
>>>>>>>     // max_symbol_length must fit into the top 16 bits of 
>>>>>>> _length_and_refcount
>>>>>>>
>>>>>>>>
>>>>>>>> #3 If we have:
>>>>>>>>
>>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>>>>
>>>>>>>> then why not
>>>>>>>>
>>>>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>>>>>
>>>>>>>> or
>>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>>>>
>>>>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>>>>>
>>>>>>>> #4 We have:
>>>>>>>>
>>>>>>>>   221 void Symbol::increment_refcount() {
>>>>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>>> symbol
>>>>>>>>   223     if (!try_increment_refcount()) {
>>>>>>>>   224 #ifdef ASSERT
>>>>>>>>   225       print();
>>>>>>>>   226 #endif
>>>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>>>
>>>>>>>> but
>>>>>>>>
>>>>>>>>   233 void Symbol::decrement_refcount() {
>>>>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>>> symbol
>>>>>>>>   235     int new_value = Atomic::sub((uint32_t)1, 
>>>>>>>> &_length_and_refcount);
>>>>>>>>   236 #ifdef ASSERT
>>>>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>>>>   239       print();
>>>>>>>>   240       fatal("refcount underflow");
>>>>>>>>   241     }
>>>>>>>>   242 #endif
>>>>>>>>
>>>>>>>> Where the line:
>>>>>>>>
>>>>>>>>   240       fatal("refcount underflow”);
>>>>>>>>
>>>>>>>> is inside #ifdef ASSERT, but:
>>>>>>>>
>>>>>>>> 227       fatal("refcount has gone to zero”);
>>>>>>>>
>>>>>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>>>>>
>>>>>>>
>>>>>>> I was thought that looked strange too.  I'll move the #endif 
>>>>>>> from 226 to after 227.
>>>>>>>
>>>>>>> Thank you for reviewing the code!
>>>>>>> Coleen
>>>>>>>
>>>>>>>> cheers
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>>>
>>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and 
>>>>>>>>> pack refcount and length into an int.
>>>>>>>>>
>>>>>>>>> This is a precurser change to the concurrent SymbolTable 
>>>>>>>>> change. Zeroed refcounted entries can be deleted at anytime so 
>>>>>>>>> they cannot be allowed to be zero in runtime code. Thanks to 
>>>>>>>>> Kim for writing the packing function and helping me avoid 
>>>>>>>>> undefined behavior.
>>>>>>>>>
>>>>>>>>> open webrev at 
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>>>>
>>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including 
>>>>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list