RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
Ioi Lam
ioi.lam at oracle.com
Wed Jul 18 22:35:50 UTC 2018
On 7/18/18 2:45 PM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/18/18 5:14 PM, Ioi Lam wrote:
>> Hi Coleen,
>>
>> The changes look good! The new operations on _length_and_refcount are
>> much cleaner than my old ATOMIC_SHORT_PAIR hack.
>
> Yes, this makes more sense to me.
>>
>> symbolTable.cpp:
>>
>> SymbolTable::lookup_dynamic() {
>> ...
>> 214 Symbol* sym = e->literal();
>> 215 if (sym->equals(name, len) &&
>> sym->try_increment_refcount()) {
>> 216 // something is referencing this symbol now.
>> 217 return sym;
>> 218 }
>>
>>
>> symbol.cpp:
>>
>> 221 void Symbol::increment_refcount() {
>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>> 223 if (!try_increment_refcount()) {
>> 224 #ifdef ASSERT
>> 225 print();
>> 226 #endif
>> 227 fatal("refcount has gone to zero");
>> 228 }
>> 229 NOT_PRODUCT(Atomic::inc(&_total_count);)
>> 230 }
>> 231 }
>>
>> 246 // Atomically increment while checking for zero, zero is bad.
>> 247 bool Symbol::try_increment_refcount() {
>> 248 uint32_t old_value = _length_and_refcount; // fetch once
>> 249 int refc = extract_refcount(old_value);
>> 250
>> 251 if (refc == PERM_REFCOUNT) {
>> 252 return true;
>> 253 } else if (refc == 0) {
>> 254 return false; // effectively dead, can't revive
>> 255 }
>> 256
>> 257 uint32_t now;
>> 258 while ((now = Atomic::cmpxchg(old_value + 1,
>> &_length_and_refcount, old_value)) != old_value) {
>> 259 // failed to increment, check refcount again.
>> 260 refc = extract_refcount(now);
>> 261 if (refc == 0) {
>> 262 return false; // just died
>> 263 } else if (refc == PERM_REFCOUNT) {
>> 264 return true; // just became permanent
>> 265 }
>> 266 old_value = now; // refcount changed, try again
>> 267 }
>> 268 return true;
>> 269 }
>>
>>
>> So is it valid for Symbol::try_increment_refcount() to return false?
>> SymbolTable::lookup_dynamic() seems to suggest YES, but
>> Symbol::increment_refcount() seems to suggest NO.
>
> True. If you are looking up a symbol and someone other thread has
> decremented the refcount to zero, this symbol should not be returned.
> My test exercises this code even without the concurrent hashtable.
> When the hashtable is concurrent, a zero-ed Symbol could be
> deallocated so we don't want to return it.
>
I think the following should be added as a comment in increment_refcount().
> In the case where you call increment_refcount() not during lookup, it
> is assumed that you have a symbol with a non-zero refcount and it
> can't go away while you are holding it.
>>
>> If it's always an invalid condition, I think the fatal() should be
>> moved inside try_increment_refcount.
>>
>
> It isn't fatal at lookup. The lookup must skip a zero-ed entry.
>> Otherwise, I think you need to add comments in all 3 places, to say
>> when it's possible to get a 0 refcount, and when it's not. And, it
>> might be worth expanding on why "zero is bad" :-)
>
> How about this comment to try_increment_refcount:
>
> // Increment refcount while checking for zero. If the Symbol's
> refcount becomes zero
> // a thread could be concurrently removing the Symbol. This is used
> during SymbolTable
> // lookup to avoid reviving a dead Symbol.
Sounds good.
Thanks
- Ioi
>>
>> My guess is:
>> + if you're doing a lookup, you might be seeing Symbols that have
>> already been marked for deletion, which is indicated by a 0 refcount.
>> You want to skip such Symbols.
>>
>> + if you're incrementing the refcount, that means you're holding a
>> valid Symbol, which means this Symbol should have never been marked
>> for deletion.
>>
>> Is this correct?
>
> Yes, both true.
>
> Thanks,
> Coleen
>>
>> Thanks
>> - Ioi
>>
>>
>> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Gerard, thank you for the code review.
>>>
>>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>>> Thank you Coleen (and Kim)!
>>>>
>>>> #1 Need copyright year updates:
>>>>
>>>> src/hotspot/share/oops/symbol.cpp
>>>> src/hotspot/share/classfile/symbolTable.cpp
>>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>>
>>> Yes, I'll update with my commit.
>>>>
>>>> #2 What’s the purpose of this code in
>>>> src/hotspot/share/oops/symbol.cpp
>>>>
>>>> 38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>>
>>>> when we have:
>>>>
>>>> 117 enum {
>>>> 118 // max_symbol_length is constrained by type of _length
>>>> 119 max_symbol_length = (1 << 16) -1
>>>> 120 };
>>>>
>>>> Wouldn’t that always be true? Is it to make sure that nobody
>>>> changes max_symbol_length, because the implementation needs it to
>>>> be that? If so, should we add comment to:
>>>>
>>>> 119 max_symbol_length = (1 << 16) -1
>>>>
>>>> with a big warning of some sorts?
>>>
>>> Yes, it's so that we can store the length of the symbol into 16 bits.
>>>
>>> How I change the comment above max_symbol_length from:
>>>
>>> // max_symbol_length is constrained by type of _length
>>>
>>> to
>>>
>>> // max_symbol_length must fit into the top 16 bits of
>>> _length_and_refcount
>>>
>>>>
>>>> #3 If we have:
>>>>
>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>>
>>>> then why not
>>>>
>>>> 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>>
>>>> or
>>>> 39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>>> 101 #define PERM_REFCOUNT 0xffff
>>>>
>>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>>
>>>> #4 We have:
>>>>
>>>> 221 void Symbol::increment_refcount() {
>>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>> 223 if (!try_increment_refcount()) {
>>>> 224 #ifdef ASSERT
>>>> 225 print();
>>>> 226 #endif
>>>> 227 fatal("refcount has gone to zero");
>>>>
>>>> but
>>>>
>>>> 233 void Symbol::decrement_refcount() {
>>>> 234 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>>> 235 int new_value = Atomic::sub((uint32_t)1,
>>>> &_length_and_refcount);
>>>> 236 #ifdef ASSERT
>>>> 237 // Check if we have transitioned to 0xffff
>>>> 238 if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>>> 239 print();
>>>> 240 fatal("refcount underflow");
>>>> 241 }
>>>> 242 #endif
>>>>
>>>> Where the line:
>>>>
>>>> 240 fatal("refcount underflow”);
>>>>
>>>> is inside #ifdef ASSERT, but:
>>>>
>>>> 227 fatal("refcount has gone to zero”);
>>>>
>>>> is outside. Shouldn't “fatal" be consistent in both?
>>>>
>>>
>>> I was thought that looked strange too. I'll move the #endif from
>>> 226 to after 227.
>>>
>>> Thank you for reviewing the code!
>>> Coleen
>>>
>>>> cheers
>>>>
>>>>
>>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and
>>>>> pack refcount and length into an int.
>>>>>
>>>>> This is a precurser change to the concurrent SymbolTable change.
>>>>> Zeroed refcounted entries can be deleted at anytime so they cannot
>>>>> be allowed to be zero in runtime code. Thanks to Kim for writing
>>>>> the packing function and helping me avoid undefined behavior.
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>>
>>>>> Tested with solaris ptrace helper, mach5 tier1-5 including
>>>>> solaris. Added multithreaded gtest which exercises the code.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list