RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Jul 19 12:34:56 UTC 2018


Please review the revision to this change.   Summary:

* made decrement_refcount() use CAS loop.
* fixed duplicated logic in try_increment_refcount() thanks to Kim
* added gtest case for decrement_refcount.
* fixed SA code.
* added a bunch of comments

open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev

Retested with hs-tier1-3.
Thanks,
Coleen

On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/18/18 6:35 PM, Ioi Lam wrote:
>>
>>
>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>>> Hi Coleen,
>>>>
>>>> The changes look good! The new operations on _length_and_refcount 
>>>> are much cleaner than my old ATOMIC_SHORT_PAIR hack.
>>>
>>> Yes, this makes more sense to me.
>>>>
>>>> symbolTable.cpp:
>>>>
>>>>  SymbolTable::lookup_dynamic() {
>>>>  ...
>>>>  214       Symbol* sym = e->literal();
>>>>  215       if (sym->equals(name, len) && 
>>>> sym->try_increment_refcount()) {
>>>>  216         // something is referencing this symbol now.
>>>>  217         return sym;
>>>>  218       }
>>>>
>>>>
>>>> symbol.cpp:
>>>>
>>>>  221 void Symbol::increment_refcount() {
>>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>  223     if (!try_increment_refcount()) {
>>>>  224 #ifdef ASSERT
>>>>  225       print();
>>>>  226 #endif
>>>>  227       fatal("refcount has gone to zero");
>>>>  228     }
>>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>>  230   }
>>>>  231 }
>>>>
>>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>>  247 bool Symbol::try_increment_refcount() {
>>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>>  249   int refc = extract_refcount(old_value);
>>>>  250
>>>>  251   if (refc == PERM_REFCOUNT) {
>>>>  252     return true;
>>>>  253   } else if (refc == 0) {
>>>>  254     return false; // effectively dead, can't revive
>>>>  255   }
>>>>  256
>>>>  257   uint32_t now;
>>>>  258   while ((now = Atomic::cmpxchg(old_value + 1, 
>>>> &_length_and_refcount, old_value)) != old_value) {
>>>>  259     // failed to increment, check refcount again.
>>>>  260     refc = extract_refcount(now);
>>>>  261     if (refc == 0) {
>>>>  262       return false; // just died
>>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>>  264       return true; // just became permanent
>>>>  265     }
>>>>  266     old_value = now; // refcount changed, try again
>>>>  267   }
>>>>  268   return true;
>>>>  269 }
>>>>
>>>>
>>>> So is it valid for Symbol::try_increment_refcount() to return 
>>>> false? SymbolTable::lookup_dynamic() seems to suggest YES, but 
>>>> Symbol::increment_refcount() seems to suggest NO.
>>>
>>> True.  If you are looking up a symbol and someone other thread has 
>>> decremented the refcount to zero, this symbol should not be 
>>> returned.  My test exercises this code even without the concurrent 
>>> hashtable.  When the hashtable is concurrent, a zero-ed Symbol could 
>>> be deallocated so we don't want to return it.
>>>
>> I think the following should be added as a comment in 
>> increment_refcount().
>>> In the case where you call increment_refcount() not during lookup, 
>>> it is assumed that you have a symbol with a non-zero refcount and it 
>>> can't go away while you are holding it.
>
> Ok, added.
>>
>>>>
>>>> If it's always an invalid condition, I think the fatal() should be 
>>>> moved inside try_increment_refcount.
>>>>
>>>
>>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>>> Otherwise, I think you need to add comments in all 3 places, to say 
>>>> when it's possible to get a 0 refcount, and when it's not. And, it 
>>>> might be worth expanding on why "zero is bad" :-)
>>>
>>> How about this comment to try_increment_refcount:
>>>
>>> // Increment refcount while checking for zero.  If the Symbol's 
>>> refcount becomes zero
>>> // a thread could be concurrently removing the Symbol.  This is used 
>>> during SymbolTable
>>> // lookup to avoid reviving a dead Symbol.
>> Sounds good.
>
> Thanks, Ioi.
> Coleen
>>
>> Thanks
>> - Ioi
>>
>>>>
>>>> My guess is:
>>>> + if you're doing a lookup, you might be seeing Symbols that have 
>>>> already been marked for deletion, which is indicated by a 0 
>>>> refcount. You want to skip such Symbols.
>>>>
>>>> + if you're incrementing the refcount, that means you're holding a 
>>>> valid Symbol, which means this Symbol should have never been marked 
>>>> for deletion.
>>>>
>>>> Is this correct?
>>>
>>> Yes, both true.
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Gerard, thank you for the code review.
>>>>>
>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>>> Thank you Coleen (and Kim)!
>>>>>>
>>>>>> #1 Need copyright year updates:
>>>>>>
>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>>
>>>>> Yes, I'll update with my commit.
>>>>>>
>>>>>> #2 What’s the purpose of this code in 
>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>
>>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>>
>>>>>> when we have:
>>>>>>
>>>>>>   117   enum {
>>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>   120   };
>>>>>>
>>>>>> Wouldn’t that always be true?  Is it to make sure that nobody 
>>>>>> changes max_symbol_length, because the implementation needs it to 
>>>>>> be that? If so, should we add comment to:
>>>>>>
>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>
>>>>>> with a big warning of some sorts?
>>>>>
>>>>> Yes, it's so that we can store the length of the symbol into 16 bits.
>>>>>
>>>>> How I change the comment above max_symbol_length from:
>>>>>
>>>>>     // max_symbol_length is constrained by type of _length
>>>>>
>>>>> to
>>>>>
>>>>>     // max_symbol_length must fit into the top 16 bits of 
>>>>> _length_and_refcount
>>>>>
>>>>>>
>>>>>> #3 If we have:
>>>>>>
>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>>
>>>>>> then why not
>>>>>>
>>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>>>
>>>>>> or
>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>>
>>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>>>
>>>>>> #4 We have:
>>>>>>
>>>>>>   221 void Symbol::increment_refcount() {
>>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>   223     if (!try_increment_refcount()) {
>>>>>>   224 #ifdef ASSERT
>>>>>>   225       print();
>>>>>>   226 #endif
>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>
>>>>>> but
>>>>>>
>>>>>>   233 void Symbol::decrement_refcount() {
>>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>   235     int new_value = Atomic::sub((uint32_t)1, 
>>>>>> &_length_and_refcount);
>>>>>>   236 #ifdef ASSERT
>>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>>   239       print();
>>>>>>   240       fatal("refcount underflow");
>>>>>>   241     }
>>>>>>   242 #endif
>>>>>>
>>>>>> Where the line:
>>>>>>
>>>>>>   240       fatal("refcount underflow”);
>>>>>>
>>>>>> is inside #ifdef ASSERT, but:
>>>>>>
>>>>>> 227       fatal("refcount has gone to zero”);
>>>>>>
>>>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>>>
>>>>>
>>>>> I was thought that looked strange too.  I'll move the #endif from 
>>>>> 226 to after 227.
>>>>>
>>>>> Thank you for reviewing the code!
>>>>> Coleen
>>>>>
>>>>>> cheers
>>>>>>
>>>>>>
>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>>>
>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and 
>>>>>>> pack refcount and length into an int.
>>>>>>>
>>>>>>> This is a precurser change to the concurrent SymbolTable change. 
>>>>>>> Zeroed refcounted entries can be deleted at anytime so they 
>>>>>>> cannot be allowed to be zero in runtime code. Thanks to Kim for 
>>>>>>> writing the packing function and helping me avoid undefined 
>>>>>>> behavior.
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>>
>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including 
>>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list