RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Jul 19 22:14:58 UTC 2018


Hi, There is a closed test that does 100,000 lookups on a class that 
fails resolution, so creates 100,000 Symbols with TempNewSymbol. This 
results in many zeroed refcounted Symbols in the table which increases 
lookup time with the current SymbolTable.  With the new concurrent 
symbol table, which this change is intended to support, the zero 
refcount symbols are cleaned up on insert and concurrently.

I have a workaround so that this test doesn't time out.   These are the 
times for this test on my machine.

old hashtable no patch: 7.32 seconds
without workaround: 367 seconds (which can time out on a slow machine)
with workaround:  61.075 seconds
with new hashtable: 9.135 seconds

There are several ways to fix the old hashtable so that it cleans more 
frequently for this situation but it's not worth doing with the new 
concurrent hashtable coming.

open webrev at http://cr.openjdk.java.net/~coleenp/03.incr/webrev

Thanks,
Coleen

On 7/19/18 3:09 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/19/18 3:07 PM, Ioi Lam wrote:
>> Looks good!
>
> Thanks, Ioi!
> Coleen
>>
>> Thanks
>>
>> - Ioi
>>
>>
>> On 7/19/18 5:34 AM, coleen.phillimore at oracle.com wrote:
>>> Please review the revision to this change.   Summary:
>>>
>>> * made decrement_refcount() use CAS loop.
>>> * fixed duplicated logic in try_increment_refcount() thanks to Kim
>>> * added gtest case for decrement_refcount.
>>> * fixed SA code.
>>> * added a bunch of comments
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.02/webrev
>>>
>>> Retested with hs-tier1-3.
>>> Thanks,
>>> Coleen
>>>
>>> On 7/18/18 10:50 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 7/18/18 6:35 PM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>>
>>>>>> On 7/18/18 5:14 PM, Ioi Lam wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> The changes look good! The new operations on 
>>>>>>> _length_and_refcount are much cleaner than my old 
>>>>>>> ATOMIC_SHORT_PAIR hack.
>>>>>>
>>>>>> Yes, this makes more sense to me.
>>>>>>>
>>>>>>> symbolTable.cpp:
>>>>>>>
>>>>>>>  SymbolTable::lookup_dynamic() {
>>>>>>>  ...
>>>>>>>  214       Symbol* sym = e->literal();
>>>>>>>  215       if (sym->equals(name, len) && 
>>>>>>> sym->try_increment_refcount()) {
>>>>>>>  216         // something is referencing this symbol now.
>>>>>>>  217         return sym;
>>>>>>>  218       }
>>>>>>>
>>>>>>>
>>>>>>> symbol.cpp:
>>>>>>>
>>>>>>>  221 void Symbol::increment_refcount() {
>>>>>>>  222   if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>>>>>  223     if (!try_increment_refcount()) {
>>>>>>>  224 #ifdef ASSERT
>>>>>>>  225       print();
>>>>>>>  226 #endif
>>>>>>>  227       fatal("refcount has gone to zero");
>>>>>>>  228     }
>>>>>>>  229     NOT_PRODUCT(Atomic::inc(&_total_count);)
>>>>>>>  230   }
>>>>>>>  231 }
>>>>>>>
>>>>>>>  246 // Atomically increment while checking for zero, zero is bad.
>>>>>>>  247 bool Symbol::try_increment_refcount() {
>>>>>>>  248   uint32_t old_value = _length_and_refcount;  // fetch once
>>>>>>>  249   int refc = extract_refcount(old_value);
>>>>>>>  250
>>>>>>>  251   if (refc == PERM_REFCOUNT) {
>>>>>>>  252     return true;
>>>>>>>  253   } else if (refc == 0) {
>>>>>>>  254     return false; // effectively dead, can't revive
>>>>>>>  255   }
>>>>>>>  256
>>>>>>>  257   uint32_t now;
>>>>>>>  258   while ((now = Atomic::cmpxchg(old_value + 1, 
>>>>>>> &_length_and_refcount, old_value)) != old_value) {
>>>>>>>  259     // failed to increment, check refcount again.
>>>>>>>  260     refc = extract_refcount(now);
>>>>>>>  261     if (refc == 0) {
>>>>>>>  262       return false; // just died
>>>>>>>  263     } else if (refc == PERM_REFCOUNT) {
>>>>>>>  264       return true; // just became permanent
>>>>>>>  265     }
>>>>>>>  266     old_value = now; // refcount changed, try again
>>>>>>>  267   }
>>>>>>>  268   return true;
>>>>>>>  269 }
>>>>>>>
>>>>>>>
>>>>>>> So is it valid for Symbol::try_increment_refcount() to return 
>>>>>>> false? SymbolTable::lookup_dynamic() seems to suggest YES, but 
>>>>>>> Symbol::increment_refcount() seems to suggest NO.
>>>>>>
>>>>>> True.  If you are looking up a symbol and someone other thread 
>>>>>> has decremented the refcount to zero, this symbol should not be 
>>>>>> returned.  My test exercises this code even without the 
>>>>>> concurrent hashtable.  When the hashtable is concurrent, a 
>>>>>> zero-ed Symbol could be deallocated so we don't want to return it.
>>>>>>
>>>>> I think the following should be added as a comment in 
>>>>> increment_refcount().
>>>>>> In the case where you call increment_refcount() not during 
>>>>>> lookup, it is assumed that you have a symbol with a non-zero 
>>>>>> refcount and it can't go away while you are holding it.
>>>>
>>>> Ok, added.
>>>>>
>>>>>>>
>>>>>>> If it's always an invalid condition, I think the fatal() should 
>>>>>>> be moved inside try_increment_refcount.
>>>>>>>
>>>>>>
>>>>>> It isn't fatal at lookup.  The lookup must skip a zero-ed entry.
>>>>>>> Otherwise, I think you need to add comments in all 3 places, to 
>>>>>>> say when it's possible to get a 0 refcount, and when it's not. 
>>>>>>> And, it might be worth expanding on why "zero is bad" :-)
>>>>>>
>>>>>> How about this comment to try_increment_refcount:
>>>>>>
>>>>>> // Increment refcount while checking for zero.  If the Symbol's 
>>>>>> refcount becomes zero
>>>>>> // a thread could be concurrently removing the Symbol. This is 
>>>>>> used during SymbolTable
>>>>>> // lookup to avoid reviving a dead Symbol.
>>>>> Sounds good.
>>>>
>>>> Thanks, Ioi.
>>>> Coleen
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>>>
>>>>>>> My guess is:
>>>>>>> + if you're doing a lookup, you might be seeing Symbols that 
>>>>>>> have already been marked for deletion, which is indicated by a 0 
>>>>>>> refcount. You want to skip such Symbols.
>>>>>>>
>>>>>>> + if you're incrementing the refcount, that means you're holding 
>>>>>>> a valid Symbol, which means this Symbol should have never been 
>>>>>>> marked for deletion.
>>>>>>>
>>>>>>> Is this correct?
>>>>>>
>>>>>> Yes, both true.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>
>>>>>>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>>
>>>>>>>> Gerard, thank you for the code review.
>>>>>>>>
>>>>>>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>>>>>>> Thank you Coleen (and Kim)!
>>>>>>>>>
>>>>>>>>> #1 Need copyright year updates:
>>>>>>>>>
>>>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>>>>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>>>>>>
>>>>>>>> Yes, I'll update with my commit.
>>>>>>>>>
>>>>>>>>> #2 What’s the purpose of this code in 
>>>>>>>>> src/hotspot/share/oops/symbol.cpp
>>>>>>>>>
>>>>>>>>>    38   STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>>>>>>
>>>>>>>>> when we have:
>>>>>>>>>
>>>>>>>>>   117   enum {
>>>>>>>>>   118     // max_symbol_length is constrained by type of _length
>>>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>>>   120   };
>>>>>>>>>
>>>>>>>>> Wouldn’t that always be true?  Is it to make sure that nobody 
>>>>>>>>> changes max_symbol_length, because the implementation needs it 
>>>>>>>>> to be that? If so, should we add comment to:
>>>>>>>>>
>>>>>>>>>   119     max_symbol_length = (1 << 16) -1
>>>>>>>>>
>>>>>>>>> with a big warning of some sorts?
>>>>>>>>
>>>>>>>> Yes, it's so that we can store the length of the symbol into 16 
>>>>>>>> bits.
>>>>>>>>
>>>>>>>> How I change the comment above max_symbol_length from:
>>>>>>>>
>>>>>>>>     // max_symbol_length is constrained by type of _length
>>>>>>>>
>>>>>>>> to
>>>>>>>>
>>>>>>>>     // max_symbol_length must fit into the top 16 bits of 
>>>>>>>> _length_and_refcount
>>>>>>>>
>>>>>>>>>
>>>>>>>>> #3 If we have:
>>>>>>>>>
>>>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>>>>>>
>>>>>>>>> then why not
>>>>>>>>>
>>>>>>>>>   101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>>>>>>
>>>>>>>>> or
>>>>>>>>>    39   STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>>>>>>>   101 #define PERM_REFCOUNT 0xffff
>>>>>>>>>
>>>>>>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>>>>>>
>>>>>>>>> #4 We have:
>>>>>>>>>
>>>>>>>>>   221 void Symbol::increment_refcount() {
>>>>>>>>>   222   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>>>> symbol
>>>>>>>>>   223     if (!try_increment_refcount()) {
>>>>>>>>>   224 #ifdef ASSERT
>>>>>>>>>   225       print();
>>>>>>>>>   226 #endif
>>>>>>>>>   227       fatal("refcount has gone to zero");
>>>>>>>>>
>>>>>>>>> but
>>>>>>>>>
>>>>>>>>>   233 void Symbol::decrement_refcount() {
>>>>>>>>>   234   if (refcount() != PERM_REFCOUNT) { // not a permanent 
>>>>>>>>> symbol
>>>>>>>>>   235     int new_value = Atomic::sub((uint32_t)1, 
>>>>>>>>> &_length_and_refcount);
>>>>>>>>>   236 #ifdef ASSERT
>>>>>>>>>   237     // Check if we have transitioned to 0xffff
>>>>>>>>>   238     if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>>>>>>>   239       print();
>>>>>>>>>   240       fatal("refcount underflow");
>>>>>>>>>   241     }
>>>>>>>>>   242 #endif
>>>>>>>>>
>>>>>>>>> Where the line:
>>>>>>>>>
>>>>>>>>>   240       fatal("refcount underflow”);
>>>>>>>>>
>>>>>>>>> is inside #ifdef ASSERT, but:
>>>>>>>>>
>>>>>>>>> 227       fatal("refcount has gone to zero”);
>>>>>>>>>
>>>>>>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I was thought that looked strange too.  I'll move the #endif 
>>>>>>>> from 226 to after 227.
>>>>>>>>
>>>>>>>> Thank you for reviewing the code!
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>> cheers
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com 
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, 
>>>>>>>>>> and pack refcount and length into an int.
>>>>>>>>>>
>>>>>>>>>> This is a precurser change to the concurrent SymbolTable 
>>>>>>>>>> change. Zeroed refcounted entries can be deleted at anytime 
>>>>>>>>>> so they cannot be allowed to be zero in runtime code. Thanks 
>>>>>>>>>> to Kim for writing the packing function and helping me avoid 
>>>>>>>>>> undefined behavior.
>>>>>>>>>>
>>>>>>>>>> open webrev at 
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>>>>>>
>>>>>>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including 
>>>>>>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list