RFR (M) 8207359: Make SymbolTable increment_refcount disallow zero
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Wed Jul 18 21:45:35 UTC 2018
On 7/18/18 5:14 PM, Ioi Lam wrote:
> Hi Coleen,
>
> The changes look good! The new operations on _length_and_refcount are
> much cleaner than my old ATOMIC_SHORT_PAIR hack.
Yes, this makes more sense to me.
>
> symbolTable.cpp:
>
> SymbolTable::lookup_dynamic() {
> ...
> 214 Symbol* sym = e->literal();
> 215 if (sym->equals(name, len) && sym->try_increment_refcount()) {
> 216 // something is referencing this symbol now.
> 217 return sym;
> 218 }
>
>
> symbol.cpp:
>
> 221 void Symbol::increment_refcount() {
> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
> 223 if (!try_increment_refcount()) {
> 224 #ifdef ASSERT
> 225 print();
> 226 #endif
> 227 fatal("refcount has gone to zero");
> 228 }
> 229 NOT_PRODUCT(Atomic::inc(&_total_count);)
> 230 }
> 231 }
>
> 246 // Atomically increment while checking for zero, zero is bad.
> 247 bool Symbol::try_increment_refcount() {
> 248 uint32_t old_value = _length_and_refcount; // fetch once
> 249 int refc = extract_refcount(old_value);
> 250
> 251 if (refc == PERM_REFCOUNT) {
> 252 return true;
> 253 } else if (refc == 0) {
> 254 return false; // effectively dead, can't revive
> 255 }
> 256
> 257 uint32_t now;
> 258 while ((now = Atomic::cmpxchg(old_value + 1,
> &_length_and_refcount, old_value)) != old_value) {
> 259 // failed to increment, check refcount again.
> 260 refc = extract_refcount(now);
> 261 if (refc == 0) {
> 262 return false; // just died
> 263 } else if (refc == PERM_REFCOUNT) {
> 264 return true; // just became permanent
> 265 }
> 266 old_value = now; // refcount changed, try again
> 267 }
> 268 return true;
> 269 }
>
>
> So is it valid for Symbol::try_increment_refcount() to return false?
> SymbolTable::lookup_dynamic() seems to suggest YES, but
> Symbol::increment_refcount() seems to suggest NO.
True. If you are looking up a symbol and someone other thread has
decremented the refcount to zero, this symbol should not be returned.
My test exercises this code even without the concurrent hashtable. When
the hashtable is concurrent, a zero-ed Symbol could be deallocated so we
don't want to return it.
In the case where you call increment_refcount() not during lookup, it is
assumed that you have a symbol with a non-zero refcount and it can't go
away while you are holding it.
>
> If it's always an invalid condition, I think the fatal() should be
> moved inside try_increment_refcount.
>
It isn't fatal at lookup. The lookup must skip a zero-ed entry.
> Otherwise, I think you need to add comments in all 3 places, to say
> when it's possible to get a 0 refcount, and when it's not. And, it
> might be worth expanding on why "zero is bad" :-)
How about this comment to try_increment_refcount:
// Increment refcount while checking for zero. If the Symbol's refcount
becomes zero
// a thread could be concurrently removing the Symbol. This is used
during SymbolTable
// lookup to avoid reviving a dead Symbol.
>
> My guess is:
> + if you're doing a lookup, you might be seeing Symbols that have
> already been marked for deletion, which is indicated by a 0 refcount.
> You want to skip such Symbols.
>
> + if you're incrementing the refcount, that means you're holding a
> valid Symbol, which means this Symbol should have never been marked
> for deletion.
>
> Is this correct?
Yes, both true.
Thanks,
Coleen
>
> Thanks
> - Ioi
>
>
> On 7/17/18 2:08 PM, coleen.phillimore at oracle.com wrote:
>>
>> Gerard, thank you for the code review.
>>
>> On 7/17/18 4:13 PM, Gerard Ziemski wrote:
>>> Thank you Coleen (and Kim)!
>>>
>>> #1 Need copyright year updates:
>>>
>>> src/hotspot/share/oops/symbol.cpp
>>> src/hotspot/share/classfile/symbolTable.cpp
>>> src/hotspot/share/classfile/compactHashtable.inline.hpp
>>
>> Yes, I'll update with my commit.
>>>
>>> #2 What’s the purpose of this code in src/hotspot/share/oops/symbol.cpp
>>>
>>> 38 STATIC_ASSERT(max_symbol_length == ((1 << 16) - 1));
>>>
>>> when we have:
>>>
>>> 117 enum {
>>> 118 // max_symbol_length is constrained by type of _length
>>> 119 max_symbol_length = (1 << 16) -1
>>> 120 };
>>>
>>> Wouldn’t that always be true? Is it to make sure that nobody
>>> changes max_symbol_length, because the implementation needs it to be
>>> that? If so, should we add comment to:
>>>
>>> 119 max_symbol_length = (1 << 16) -1
>>>
>>> with a big warning of some sorts?
>>
>> Yes, it's so that we can store the length of the symbol into 16 bits.
>>
>> How I change the comment above max_symbol_length from:
>>
>> // max_symbol_length is constrained by type of _length
>>
>> to
>>
>> // max_symbol_length must fit into the top 16 bits of
>> _length_and_refcount
>>
>>>
>>> #3 If we have:
>>>
>>> 39 STATIC_ASSERT(PERM_REFCOUNT == ((1 << 16) - 1));
>>>
>>> then why not
>>>
>>> 101 #define PERM_REFCOUNT ((1 << 16) - 1)) // 0xffff
>>>
>>> or
>>> 39 STATIC_ASSERT(PERM_REFCOUNT == 0xffff;
>>> 101 #define PERM_REFCOUNT 0xffff
>>>
>> I can change PERM_REFCOUNT to ((1 << 16)) -1) to be consistent.
>>
>>> #4 We have:
>>>
>>> 221 void Symbol::increment_refcount() {
>>> 222 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>> 223 if (!try_increment_refcount()) {
>>> 224 #ifdef ASSERT
>>> 225 print();
>>> 226 #endif
>>> 227 fatal("refcount has gone to zero");
>>>
>>> but
>>>
>>> 233 void Symbol::decrement_refcount() {
>>> 234 if (refcount() != PERM_REFCOUNT) { // not a permanent symbol
>>> 235 int new_value = Atomic::sub((uint32_t)1,
>>> &_length_and_refcount);
>>> 236 #ifdef ASSERT
>>> 237 // Check if we have transitioned to 0xffff
>>> 238 if (extract_refcount(new_value) == PERM_REFCOUNT) {
>>> 239 print();
>>> 240 fatal("refcount underflow");
>>> 241 }
>>> 242 #endif
>>>
>>> Where the line:
>>>
>>> 240 fatal("refcount underflow”);
>>>
>>> is inside #ifdef ASSERT, but:
>>>
>>> 227 fatal("refcount has gone to zero”);
>>>
>>> is outside. Shouldn't “fatal" be consistent in both?
>>>
>>
>> I was thought that looked strange too. I'll move the #endif from 226
>> to after 227.
>>
>> Thank you for reviewing the code!
>> Coleen
>>
>>> cheers
>>>
>>>
>>>> On Jul 17, 2018, at 10:51 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Summary: Use cmpxchg for non permanent symbol refcounting, and pack
>>>> refcount and length into an int.
>>>>
>>>> This is a precurser change to the concurrent SymbolTable change.
>>>> Zeroed refcounted entries can be deleted at anytime so they cannot
>>>> be allowed to be zero in runtime code. Thanks to Kim for writing
>>>> the packing function and helping me avoid undefined behavior.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8207359.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8207359
>>>>
>>>> Tested with solaris ptrace helper, mach5 tier1-5 including solaris.
>>>> Added multithreaded gtest which exercises the code.
>>>>
>>>> Thanks,
>>>> Coleen
>>
>
More information about the hotspot-runtime-dev
mailing list