RFR (XS) 8161280 - assert failed: reference count underflow for symbol

Ioi Lam ioi.lam at oracle.com
Wed Aug 24 01:03:07 UTC 2016


Hi Coleen, thanks for suggesting the simplification:

void Symbol::decrement_refcount() {
#ifdef ASSERT
   if (_refcount == 0) {
       print();
       assert(false, "reference count underflow for symbol");
   }
#endif
   Atomic::dec(&_refcount);
}

This has a race condition that can let an underflow go undetected. E.g.,
refcount is 1. Two threads come in and decrement at the same time: both
see a non-zero refcount, so neither asserts, and we end up with -1.
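
To make the interleaving concrete, here is a minimal single-threaded
replay of it. This is only a sketch: FakeSymbol and
replay_double_decrement are made-up names, and std::atomic stands in for
HotSpot's Atomic class.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for Symbol, reduced to its refcount.
struct FakeSymbol {
    std::atomic<short> refcount{0};

    // Step 1 of the simplified decrement_refcount(): the debug-only check.
    bool check_sees_underflow() const { return refcount.load() == 0; }

    // Step 2: the decrement itself.
    void do_decrement() { refcount.fetch_sub(1); }
};

// Replays the two-thread interleaving on one thread, in a fixed order.
// Returns the final refcount; 'detected' reports whether either check fired.
short replay_double_decrement(bool& detected) {
    FakeSymbol s;
    s.refcount = 1;

    bool a_fired = s.check_sees_underflow();  // thread A checks: sees 1
    bool b_fired = s.check_sees_underflow();  // thread B checks: sees 1
    s.do_decrement();                         // thread A: 1 -> 0
    s.do_decrement();                         // thread B: 0 -> -1

    detected = a_fired || b_fired;
    return s.refcount.load();
}
```

Calling replay_double_decrement() leaves the count at -1 with neither
check having fired.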

However, it's not worse than before. The old version also has a race 
condition:

     refcount is 0
     thread A decrements
     thread B increments
     thread A checks for underflow

The decrementing thread (A) will read _refcount==0 at the end, so it
won't detect the (transient) underflow.
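
The same kind of replay for the old version, again only as a sketch. It
assumes the old code decremented first and then re-read _refcount for the
check, as in the sequence above; replay_old_check is a made-up name and
std::atomic stands in for HotSpot's Atomic class.

```cpp
#include <atomic>
#include <cassert>

// Replays the old decrement-then-check sequence on one thread.
// Returns the final refcount; 'detected' reports whether thread A's
// check (a re-read of the counter) would have seen a negative value.
short replay_old_check(bool& detected) {
    std::atomic<short> refcount{0};

    refcount.fetch_sub(1);            // thread A decrements: 0 -> -1
    refcount.fetch_add(1);            // thread B increments: -1 -> 0
    detected = refcount.load() < 0;   // thread A checks: re-reads 0

    return refcount.load();
}
```

Here the count ends at 0 and the check stays silent, even though the
count was briefly -1.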

I think the failure to detect underflow is fine, since this happens only
with concurrent access. The kinds of underflow that we are interested in
can usually be caught in single-threaded situations.
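
For comparison, a CAS loop along the lines of Kim's suggestion (quoted
below) would make the check and the decrement a single atomic step. This
is a rough sketch, not a proposal for the patch: it uses std::atomic
rather than HotSpot's Atomic class, DecResult and cas_decrement are
made-up names, and it assumes Coleen's convention that any negative
value means the count has gone immortal.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical result type for illustration only.
enum class DecResult { Decremented, Immortal, Underflow };

// Decrements only if the observed value is positive. The check and the
// update apply to the same observed value, so a concurrent change forces
// a retry instead of slipping between the two steps.
DecResult cas_decrement(std::atomic<short>& refcount) {
    short old_value = refcount.load();
    for (;;) {
        if (old_value < 0) {
            return DecResult::Immortal;    // assumed: negative == immortal
        }
        if (old_value == 0) {
            return DecResult::Underflow;   // would go below zero
        }
        short new_value = static_cast<short>(old_value - 1);
        // On failure, compare_exchange_weak reloads old_value for the retry.
        if (refcount.compare_exchange_weak(old_value, new_value)) {
            return DecResult::Decremented;
        }
    }
}
```

With this shape, the refcount==1 case above has one thread return
Decremented and the other return Underflow, whichever order they run in.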

Thanks
- Ioi




On 8/23/16 4:24 PM, Coleen Phillimore wrote:
>
> This doesn't make sense to me and I have to go into gdb to print out 
> what -16384 is.   It appears that this is trying to detect that we 
> went below zero from zero, which is an error, but this isn't clear at 
> all.
>
> It seems that
>
>    if (_refcount >= 0) {
>
>
> Should be > 0 and we should assert if this is ever zero instead, and 
> allow anything negative to mean that this count has gone immortal.
>
> Kim thought it should use CAS rather than atomic increment and 
> decrement, but maybe that isn't necessary, especially since there 
> isn't a short version of cmpxchg.
>
> thanks,
> Coleen
>
> On 8/23/16 6:01 AM, Ioi Lam wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8161280
>> http://cr.openjdk.java.net/~iklam/jdk9/8161280-symbol-refcount-underflow.v01/ 
>>
>>
>> Summary:
>>
>> The test was loading a lot of JCK classes into the same VM. Many of 
>> the JCK classes refer to "javasoft/sqe/javatest/Status", so the 
>> refcount (a signed short integer) of this Symbol would run up and 
>> past 0x7fff.
>>
>> The assert was caused by a race condition: the refcount started with 
>> a large (16-bit) positive value such as 0x7fff, one thread is 
>> decrementing and several other threads are incrementing. The refcount 
>> will end up being 0x8000 or slightly higher (limited to the number of 
>> concurrent threads that are running within a small window of several 
>> instructions in the decrementing thread, so most likely it will be 
>> 0x800?).
>>
>> As a result, the decrementing thread found that the refcount is 
>> negative after the operation, and thought that an underflow had 
>> happened.
>>
>> The fix is to ignore any value that may appear in the [0x8000 - 
>> 0xbfff] range and not flag these as underflows (since they are 
>> most likely overflows -- overflows are already handled by making the 
>> Symbol permanent).
>>
>> Thanks
>> - Ioi
>>
>>
>


