RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count

dean.long at oracle.com dean.long at oracle.com
Thu Oct 18 04:06:41 UTC 2018


On 10/17/18 7:07 PM, David Holmes wrote:
> Hi Dean,
>
> This still seems racy to me. We increment all counts under the 
> Threads_lock but we still decrement without the Threads_lock. So we 
> can lose updates to the perfCounters.
>
>  117   _total_threads_count->inc();
>  118   Atomic::inc(&_atomic_threads_count);
>  119   int count = _atomic_threads_count;
> <= context switch here
>  120   _live_threads_count->set_value(count);
>
> If a second thread now exits while the above thread is descheduled, it 
> will decrement _atomic_threads_count and _live_threads_count, but when 
> the first thread resumes at line 120 above we will set 
> _live_threads_count to the wrong value!
>
> You can't maintain two counters in sync without all changes using 
> locking across both.
>

You're right, I missed that.  I think the right thing to do is call 
current_thread_exiting while holding the Threads_lock.
Then we can get rid of the parallel atomic counters.  So, here's one 
more try:

     http://cr.openjdk.java.net/~dlong/8021335/webrev.7/

dl

> David
>
>
>
> On 18/10/2018 8:18 AM, dean.long at oracle.com wrote:
>> On 10/17/18 2:38 PM, Mandy Chung wrote:
>>>
>>>
>>> On 10/17/18 2:13 PM, dean.long at oracle.com wrote:
>>>> On 10/17/18 1:41 PM, Mandy Chung wrote:
>>>>>
>>>>>
>>>>> On 10/16/18 7:33 PM, David Holmes wrote:
>>>>>> Hi Dean,
>>>>>>
>>>>>> Thanks for tackling this.
>>>>>>
>>>>>> I'm still struggling to fully grasp why we need both the 
>>>>>> PerfCounters and the regular counters. I get that we have to 
>>>>>> decrement the live counts before ensure_join() has allowed 
>>>>>> Thread.join() to return, to ensure that if we then check the 
>>>>>> number of threads it has dropped by one. But I don't understand 
>>>>>> why that means we need to manage the thread count in two parts. 
>>>>>> Particularly as now you don't use the PerfCounter to return the 
>>>>>> live count, so it makes me wonder what role the PerfCounter is 
>>>>>> playing as it is temporarily inconsistent with the reported live 
>>>>>> count? 
>>>>>
>>>>> Perf counters were added long time back in JDK 1.4.2 for 
>>>>> performance measurement before java.lang.management API. One can 
>>>>> use jstat tool to monitor VM perf counters of a running VM.   One 
>>>>> could look into the possibility of deprecating these counters and 
>>>>> remove them over time.
>>>>>
>>>>>> On 17/10/2018 9:43 AM, dean.long at oracle.com wrote:
>>>>>> New webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.4/
>>>>>
>>>>> When the perf counters are updated when a thread is added/removed, 
>>>>> it's holding Threads_lock.  Are the asserts in 
>>>>> ThreadService::remove_thread necessary?
>>>>>
>>>>
>>>> Not really.  They were intended to catch the case where the atomic 
>>>> counters weren't decremented for some reason, not for the perf 
>>>> counters.
>>>> Should I remove them?
>>>>
>>>
>>> Hmm... when remove_thread is called but decrement_thread_counts has 
>>> not been called.   It's a bug in thread accounting.  It happens to 
>>> have the perf counters that can be compared to assert. It seems not 
>>> obvious.  Setting the perf counters same values as 
>>> _atomic_threads_count and _atomic_daemon_threads_count makes sense 
>>> to me.
>>>
>>> I would opt for removing the asserts but I can't think of an 
>>> alternative how to catch the issue you concern about.
>>>
>>>>> For clarify, I think we could simply set _live_threads_count to 
>>>>> the value of _atomic_threads_count and set _daemon_threads_count 
>>>>> to the value of _atomic_daemon_threads_count.
>>>>>
>>>>
>>>> I think that works, even inside decrement_thread_counts() without 
>>>> holding the Threads_lock.  If you agree, I'll make that change.
>>>>
>>> +1
>>>
>>
>> New webrevs, full and incremental:
>>
>> http://cr.openjdk.java.net/~dlong/8021335/webrev.6/
>> http://cr.openjdk.java.net/~dlong/8021335/webrev.6.diff/
>>
>> I like it better without all the asserts too.
>>
>> dl
>>
>>> Mandy
>>



More information about the serviceability-dev mailing list