RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero

Ioi Lam ioi.lam at oracle.com
Thu Jul 19 19:07:31 UTC 2018


Looks good!

Thanks

- Ioi


On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote:
> Please review the revision to this change.   Summary:
>
> * made decrement_refcount() use CAS loop.
> * fixed duplicated logic in try_increment_refcount() thanks to Kim
> * added gtest case for decrement_refcount.
> * fixed SA code.
> * added a bunch of comments
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev
>
> Retested with hs-tier1-3.
> Thanks,
> Coleen
>
> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/18/18 6:35 PM, Ioi Lam wrote:
>>>
>>>
>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> The changes look good! The new operations on _length_and_refcount 
>>>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>>>
>>>> Yes, this makes more sense to me.
>>>>>
>>>>> symbolTable.cpp:
>>>>>
>>>>>  SymbolTable::lookup_dynamic() {
>>>>>  ...
>>>>>  214       Symbol* sym = e->literal();
>>>>>  215       if (sym->equals(name, len) && 
>>>>> sym->try_increment_refcount()) {
>>>>>  216         // something is referencing this symbol now.
>>>>>  217         return sym;
>>>>>  218       }
>>>>>
>>>>>
>>>>> symbol.cpp:
>>>>>
>>>>>  221 void Symbol::increment_refcount() {
>>>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>  223     if (!try_increment_refcount()) {
>>>>>  224 #ifdef ASSERT
>>>>>  225       print();
>>>>>  226 #endif
>>>>>  227       fatal("refcount has gone to zero");
>>>>>  228     }
>>>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>>>  230   }
>>>>>  231 }
>>>>>
>>>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>>>  247 bool Symbol::try_increment_refcount() {
>>>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>>>  249   int refc = extract_refcount(old_value);
>>>>>  250
>>>>>  251   if (refc == PERM_REFCOUNT) {
>>>>>  252     return true;
>>>>>  253   } else if (refc == 0) {
>>>>>  254     return false; // effectively dead, can't revive
>>>>>  255   }
>>>>>  256
>>>>>  257   uint32_t now;
>>>>>  258   while ((now = Atomic::cmpxchg(old_value + 1, 
>>>>> &_length_and_refcount, old_value)) != old_value) {
>>>>>  259     // failed to increment, check refcount again.
>>>>>  260     refc = extract_refcount(now);
>>>>>  261     if (refc == 0) {
>>>>>  262       return false; // just died
>>>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>>>  264       return true; // just became permanent
>>>>>  265     }
>>>>>  266     old_value = now; // refcount changed, try again
>>>>>  267   }
>>>>>  268   return true;
>>>>>  269 }
>>>>>
>>>>>
>>>>> So is it valid for Symbol::try_increment_refcount() to return 
>>>>> false? SymbolTable::lookup_dynamic() seems to suggest YES, but 
>>>>> Symbol::increment_refcount() seems to suggest NO.
>>>>
>>>> True.  If you are looking up a symbol and someone other thread has 
>>>> decremented the refcount to zero, this symbol should not be 
>>>> returned.  My test exercises this code even without the concurrent 
>>>> hashtable.  When the hashtable is concurrent, a zero-ed Symbol 
>>>> could be deallocated so we don't want to return it.
>>>>
>>> I think the following should be added as a comment in 
>>> increment_refcount().
>>>> In the case where you call increment_refcount() not during lookup, 
>>>> it is assumed that you have a symbol with a non-zero refcount and 
>>>> it can't go away while you are holding it.
>>
>> Ok, added.
>>>
>>>>>
>>>>> If it's always an invalid condition, I think the fatal() should be 
>>>>> moved inside try_increment_refcount.
>>>>>
>>>>
>>>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>>>> Otherwise, I think you need to add comments in all 3 places, to 
>>>>> say when it's possible to get a 0 refcount, and when it's not. 
>>>>> And, it might be worth expanding on why "zero is bad" :-)
>>>>
>>>> How about this comment to try_increment_refcount:
>>>>
>>>> // Increment refcount while checking for zero.  If the Symbol's 
>>>> refcount becomes zero
>>>> // a thread could be concurrently removing the Symbol.  This is 
>>>> used during SymbolTable
>>>> // lookup to avoid reviving a dead Symbol.
>>> Sounds good.
>>
>> Thanks, Ioi.
>> Coleen
>>>
>>> Thanks
>>> - Ioi
>>>
>>>>>
>>>>> My guess is:
>>>>> + if you're doing a lookup, you might be seeing Symbols that have 
>>>>> already been marked for deletion, which is indicated by a 0 
>>>>> refcount. You want to skip such Symbols.
>>>>>
>>>>> + if you're incrementing the refcount, that means you're holding a 
>>>>> valid Symbol, which means this Symbol should have never been 
>>>>> marked for deletion.
>>>>>
>>>>> Is this correct?
>>>>
>>>> Yes, both true.
>>>>
>>>> Thanks,
>>>> Coleen
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Gerard, thank you for the code review.
>>>>>>
>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>>>> Thank you Coleen (and Kim)!
>>>>>>>
>>>>>>> #1 Need copyright year updates:
>>>>>>>
>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>>>
>>>>>> Yes, I'll update with my commit.
>>>>>>>
>>>>>>> #2 What’s the purpose of this code in 
>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>
>>>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>>>
>>>>>>> when we have:
>>>>>>>
>>>>>>>   117   enum {
>>>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>   120   };
>>>>>>>
>>>>>>> Wouldn’t that always be true?  Is it to make sure that nobody 
>>>>>>> changes max_symbol_length, because the implementation needs it 
>>>>>>> to be that? If so, should we add comment to:
>>>>>>>
>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>
>>>>>>> with a big warning of some sorts?
>>>>>>
>>>>>> Yes, it's so that we can store the length of the symbol into 16 
>>>>>> bits.
>>>>>>
>>>>>> How I change the comment above max_symbol_length from:
>>>>>>
>>>>>>     // max_symbol_length is constrained by type of _length
>>>>>>
>>>>>> to
>>>>>>
>>>>>>     // max_symbol_length must fit into the top 16 bits of 
>>>>>> _length_and_refcount
>>>>>>
>>>>>>>
>>>>>>> #3 If we have:
>>>>>>>
>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>>>
>>>>>>> then why not
>>>>>>>
>>>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>>>>
>>>>>>> or
>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>>>
>>>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>>>>
>>>>>>> #4 We have:
>>>>>>>
>>>>>>>   221 void Symbol::increment_refcount() {
>>>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>> symbol
>>>>>>>   223     if (!try_increment_refcount()) {
>>>>>>>   224 #ifdef ASSERT
>>>>>>>   225       print();
>>>>>>>   226 #endif
>>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>>
>>>>>>> but
>>>>>>>
>>>>>>>   233 void Symbol::decrement_refcount() {
>>>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>> symbol
>>>>>>>   235     int new_value = Atomic::sub((uint32_t)1, 
>>>>>>> &_length_and_refcount);
>>>>>>>   236 #ifdef ASSERT
>>>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>>>   239       print();
>>>>>>>   240       fatal("refcount underflow");
>>>>>>>   241     }
>>>>>>>   242 #endif
>>>>>>>
>>>>>>> Where the line:
>>>>>>>
>>>>>>>   240       fatal("refcount underflow”);
>>>>>>>
>>>>>>> is inside #ifdef ASSERT, but:
>>>>>>>
>>>>>>> 227       fatal("refcount has gone to zero”);
>>>>>>>
>>>>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>>>>
>>>>>>
>>>>>> I was thought that looked strange too.  I'll move the #endif from 
>>>>>> 226 to after 227.
>>>>>>
>>>>>> Thank you for reviewing the code!
>>>>>> Coleen
>>>>>>
>>>>>>> cheers
>>>>>>>
>>>>>>>
>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and 
>>>>>>>> pack refcount and length into an int.
>>>>>>>>
>>>>>>>> This is a precurser change to the concurrent SymbolTable 
>>>>>>>> change. Zeroed refcounted entries can be deleted at anytime so 
>>>>>>>> they cannot be allowed to be zero in runtime code. Thanks to 
>>>>>>>> Kim for writing the packing function and helping me avoid 
>>>>>>>> undefined behavior.
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>>>
>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including 
>>>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list