RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count

David Holmes david.holmes at oracle.com
Thu Oct 18 07:27:29 UTC 2018


Hi Dean,

On 18/10/2018 2:06 PM, dean.long at oracle.com wrote:
> On 10/17/18 7:07 PM, David Holmes wrote:
>> Hi Dean,
>>
>> This still seems racy to me. We increment all counts under the 
>> Threads_lock but we still decrement without the Threads_lock. So we 
>> can lose updates to the perfCounters.
>>
>>  117   _total_threads_count->inc();
>>  118   Atomic::inc(&_atomic_threads_count);
>>  119   int count = _atomic_threads_count;
>> <= context switch here
>>  120   _live_threads_count->set_value(count);
>>
>> If a second thread now exits while the above thread is descheduled, it 
>> will decrement _atomic_threads_count and _live_threads_count, but when 
>> the first thread resumes at line 120 above we will set 
>> _live_threads_count to the wrong value!
>>
>> You can't maintain two counters in sync without all changes using 
>> locking across both.
>>
> 
> You're right, I missed that.  I think the right thing to do is call 
> current_thread_exiting while holding the Threads_lock.
> Then we can get rid of the parallel atomic counters.  So, here's one 
> more try:
> 
>      http://cr.openjdk.java.net/~dlong/8021335/webrev.7/

Okay that is the simple and obvious solution that doesn't require split 
counts. So I have to ask Mandy if she recalls why this approach wasn't 
taken 15 years ago when the exit counts were added as part of:

https://bugs.openjdk.java.net/browse/JDK-4530538 ?

Does taking the Threads_lock here cost too much and cause a thread 
termination bottleneck?

Thanks,
David
-----



> dl
> 
>> David
>>
>>
>>
>> On 18/10/2018 8:18 AM, dean.long at oracle.com wrote:
>>> On 10/17/18 2:38 PM, Mandy Chung wrote:
>>>>
>>>>
>>>> On 10/17/18 2:13 PM, dean.long at oracle.com wrote:
>>>>> On 10/17/18 1:41 PM, Mandy Chung wrote:
>>>>>>
>>>>>>
>>>>>> On 10/16/18 7:33 PM, David Holmes wrote:
>>>>>>> Hi Dean,
>>>>>>>
>>>>>>> Thanks for tackling this.
>>>>>>>
>>>>>>> I'm still struggling to fully grasp why we need both the 
>>>>>>> PerfCounters and the regular counters. I get that we have to 
>>>>>>> decrement the live counts before ensure_join() has allowed 
>>>>>>> Thread.join() to return, to ensure that if we then check the 
>>>>>>> number of threads it has dropped by one. But I don't understand 
>>>>>>> why that means we need to manage the thread count in two parts. 
>>>>>>> Particularly as now you don't use the PerfCounter to return the 
>>>>>>> live count, so it makes me wonder what role the PerfCounter is 
>>>>>>> playing as it is temporarily inconsistent with the reported live 
>>>>>>> count? 
>>>>>>
>>>>>> Perf counters were added long time back in JDK 1.4.2 for 
>>>>>> performance measurement before java.lang.management API. One can 
>>>>>> use jstat tool to monitor VM perf counters of a running VM.   One 
>>>>>> could look into the possibility of deprecating these counters and 
>>>>>> remove them over time.
>>>>>>
>>>>>>> On 17/10/2018 9:43 AM, dean.long at oracle.com wrote:
>>>>>>> New webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.4/
>>>>>>
>>>>>> When the perf counters are updated when a thread is added/removed, 
>>>>>> it's holding Threads_lock.  Are the asserts in 
>>>>>> ThreadService::remove_thread necessary?
>>>>>>
>>>>>
>>>>> Not really.  They were intended to catch the case where the atomic 
>>>>> counters weren't decremented for some reason, not for the perf 
>>>>> counters.
>>>>> Should I remove them?
>>>>>
>>>>
>>>> Hmm... when remove_thread is called but decrement_thread_counts has 
>>>> not been called.   It's a bug in thread accounting.  It happens to 
>>>> have the perf counters that can be compared to assert. It seems not 
>>>> obvious.  Setting the perf counters same values as 
>>>> _atomic_threads_count and _atomic_daemon_threads_count makes sense 
>>>> to me.
>>>>
>>>> I would opt for removing the asserts but I can't think of an 
>>>> alternative how to catch the issue you concern about.
>>>>
>>>>>> For clarify, I think we could simply set _live_threads_count to 
>>>>>> the value of _atomic_threads_count and set _daemon_threads_count 
>>>>>> to the value of _atomic_daemon_threads_count.
>>>>>>
>>>>>
>>>>> I think that works, even inside decrement_thread_counts() without 
>>>>> holding the Threads_lock.  If you agree, I'll make that change.
>>>>>
>>>> +1
>>>>
>>>
>>> New webrevs, full and incremental:
>>>
>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.6/
>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.6.diff/
>>>
>>> I like it better without all the asserts too.
>>>
>>> dl
>>>
>>>> Mandy
>>>
> 


More information about the serviceability-dev mailing list