RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count

dean.long at oracle.com dean.long at oracle.com
Tue Oct 23 22:30:35 UTC 2018


On 10/23/18 2:51 PM, David Holmes wrote:
> Hi Dean,
>
> On 24/10/2018 4:05 AM, dean.long at oracle.com wrote:
>> On 10/23/18 9:46 AM, dean.long at oracle.com wrote:
>>> On 10/22/18 3:31 PM, David Holmes wrote:
>>>> Sorry Dean I'm concerned about a thread termination bottleneck with 
>>>> this. A simple microbenchmark that creates 500,000 threads that 
>>>> have to run and terminate, shows a 15+% slowdown on my Linux box. I 
>>>> tried to find some kind of real benchmarks that covers thread 
>>>> termination but couldn't see anything specific.
>>>>
>>>> Can you at least run this through our performance system to see if 
>>>> any of the regular benchmarks are affected.
>>>>
>>>
>>> OK, but even if the regular benchmarks don't show a difference, I'd 
>>> feel better if microbenchmarks were not affected.  What if I go back 
>>> to the original approach and add locking:
>>>
>>>    static jlong get_live_thread_count()        { MutexLocker 
>>> mu(Threads_lock); return _live_threads_count->get_value() - 
>>> _exiting_threads_count; }
>>>    static jlong get_daemon_thread_count()      { MutexLocker 
>>> mu(Threads_lock); return _daemon_threads_count->get_value() - 
>>> _exiting_daemon_threads_count; }
>>>
>>> along with the other cleanups around is_daemon and is_hidden_thread?
>>>
>>
>> Some micro-benchmarks like SecureRandomBench show a regression with 
>> webrev.7.  I could go back to webrev.5 and then we shouldn't need any 
>> locking in the get_*() functions.
>
> I don't see version 5 discussed but I took a look and it seems okay. 

Mandy had questions about the asserts in .5, and it seemed like we could 
just set the perf counters to the same value as the atomic counters, 
which resulted in .6.  I think the only problem with .6 is that I set 
the perf counters in decrement_thread_counts without the lock.  If I 
"sync" the perf counters to the atomic counters only in add_thread and 
remove_thread, with the lock, then it's about the same as .5, but 
without the asserts and parallel inc/dec.  If anyone likes the sound of 
that, I can send out a new webrev.  Or we can go with webrev.5.

> My only query with that version is that it appears the actual 
> perfCounters no longer get read by anything - is that the case?
>

There does seem to be code that references them, through their name 
string.  That's a difference interface that I'm not familiar with, so I 
didn't want to break it.

dl

> Thanks,
> David
>
>> dl
>>
>>> dl
>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 20/10/2018 1:28 PM, dean.long at oracle.com wrote:
>>>>> On 10/18/18 6:12 PM, Mandy Chung wrote:
>>>>>>
>>>>>>
>>>>>> On 10/18/18 12:27 AM, David Holmes wrote:
>>>>>>> Hi Dean,
>>>>>>>
>>>>>>> On 18/10/2018 2:06 PM, dean.long at oracle.com wrote:
>>>>>>>>
>>>>>>>> You're right, I missed that.  I think the right thing to do is 
>>>>>>>> call current_thread_exiting while holding the Threads_lock.
>>>>>>>> Then we can get rid of the parallel atomic counters. So, here's 
>>>>>>>> one more try:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.7/
>>>>>>>
>>>>>>> Okay that is the simple and obvious solution that doesn't 
>>>>>>> require split counts. So I have to ask Mandy if she recalls why 
>>>>>>> this approach wasn't taken 15 years ago when the exit counts 
>>>>>>> were added as part of:
>>>>>>>
>>>>>>
>>>>>> It has been so long.  I think it's likely an oversight.
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-4530538 ?
>>>>>>>
>>>>>>> Does taking the Threads_lock here cost too much and cause a 
>>>>>>> thread termination bottleneck?
>>>>>>
>>>>>> If the contention on Threads_lock is not high (that seems to me), 
>>>>>> it should be okay.   I'm not close to the VM implementation (lot 
>>>>>> of changes since then) and I don't have a definitive answer 
>>>>>> unless I study the code closely.   You and others have a better 
>>>>>> judgement on this.
>>>>>>
>>>>>> AFAICT the change is okay.
>>>>>>
>>>>>
>>>>> Thanks Mandy.  David, OK to push?
>>>>>
>>>>> dl
>>>>>
>>>>>> Mandy
>>>>>>
>>>>>>
>>>>>
>>>
>>



More information about the serviceability-dev mailing list