jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
David Holmes
david.holmes at oracle.com
Wed Jul 24 22:07:01 PDT 2013
On 25/07/2013 12:08 AM, Jaroslav Bachorik wrote:
> On 07/24/2013 03:17 PM, Chris Hegarty wrote:
>> On 24/07/2013 13:49, Jaroslav Bachorik wrote:
>>> On 07/24/2013 02:32 PM, Chris Hegarty wrote:
>>>> On 24/07/2013 12:21, David Holmes wrote:
>>>>> On 24/07/2013 7:31 PM, Mandy Chung wrote:
>>>>>>
>>>>>> On 7/24/2013 4:50 PM, shanliang wrote:
>>>>>>> So we have 2 kinds of issues here:
>>>>>>> 1) the test related, like Thread state checking, we can fix them in
>>>>>>> the test
>>>>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add
>>>>>>> your test case to the bug), and add a workaround (sleep or call 2
>>>>>>> times) in the test to make the test pass. Mandy is the expert and
>>>>>>> better to get her opinion.
>>>>>>
>>>>>> It's probably a race in the VM implementation in determining the
>>>>>> thread
>>>>>> count. You will need to diagnose the VM implementation and compare the
>>>>>> thread list and the implementation of getting the thread count (check
>>>>>> hotspot/src/share/vm/services/threadService.cpp)
>>>>>
>>>>> There is a considerable code path between the point where a terminating
>>>>> thread causes Thread.join() to be allowed to return, and the point
>>>>> where
>>>>> the live thread count gets decremented. So using join() does not help
>>>>> here. Arguably JVMTI should have based its counts around the lifecycle
>>>>> of the Java thread not the underlying native thread.
>>>>
>>>> It appears, from my reading of the code, that this situation ( a thread
>>>> exiting ) should be handled. Or maybe I'm looking at the wrong
>>>> interface.
>>>>
>>>> JavaThread::exit(...) {
>>>> ...
>>>> ThreadService::current_thread_exiting(this);
>>>> ...
>>>> ensure_join(..)
>>>> ...
>>>> }
>>>>
>>>> So the exiting thread should be removed from the live thread count
>>>> before Thread.join returns.
>>>
>>> Unfortunately, ensure_join(...) is called on line 1860 but
>>> Threads::remove(this), which does the actual cleanup of the live threads
>>> counter, is called only on line 1919, leaving at least a few ns window
>>> when the thread is reported as terminated in java but the counters
>>> haven't been updated yet.
>>
>> Again, maybe I'm missing something but,
>>
>> static jlong get_live_thread_count() { return
>> _live_threads_count->get_value() - _exiting_threads_count; }
>>
>> ... and current_thread_exiting(..) increments _exiting_threads_count, no?
>
> Well, apparently it does.

Yes. Thanks, Chris. I completely missed the use of
_exiting_threads_count to address this very issue.

> I am a complete stranger to the concurrency issues in the hotspot -
> would it be possible that in ThreadService::remove_thread(..) the
> _exiting_threads_count is decremented but _live_threads_count hasn't
> been updated yet when someone calls the get_live_thread_count() function?

Yes. Updates are guarded by acquiring the Threads_lock, but reads are
not. So it is indeed possible to request the live count between the
decrement of the exiting count and the decrement of the live count
itself. Mind you, that is an extremely small window, which makes it
hard to explain this bug manifesting as often as it does.

Because get_live_thread_count computes its result from two variables
(the live count minus the exiting count), it has to use the same
synchronization as is used to update those variables to ensure it
returns a valid value. We can't grab the Threads_lock directly in
get_live_thread_count as it is already called from code that holds
the lock. So we would have to push this out to management.cpp's
get_long_attribute.

David
-----
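
The locking argument above can be illustrated with a small Java model.
This is only a sketch under stated assumptions, not HotSpot code: the
class LiveThreadCounters, its field names, and the use of a
ReentrantLock in place of Threads_lock are all hypothetical. It shows
a value derived from two counters being read under the same lock that
guards their updates, with an unlocked variant kept for callers that
already hold the lock.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of the two counters discussed in this thread; not HotSpot code. */
final class LiveThreadCounters {
    // Hypothetical stand-in for Threads_lock.
    private final ReentrantLock threadsLock = new ReentrantLock();
    private long liveThreads;    // models _live_threads_count
    private long exitingThreads; // models _exiting_threads_count

    void threadStarted() {
        threadsLock.lock();
        try {
            liveThreads++;
        } finally {
            threadsLock.unlock();
        }
    }

    /** Models current_thread_exiting: the thread is on its way out. */
    void currentThreadExiting() {
        threadsLock.lock();
        try {
            exitingThreads++;
        } finally {
            threadsLock.unlock();
        }
    }

    /** Models remove_thread: both decrements happen under one lock hold. */
    void removeThread() {
        threadsLock.lock();
        try {
            exitingThreads--;
            // A reader that skipped the lock could observe the state here
            // and compute a count that is one too high.
            liveThreads--;
        } finally {
            threadsLock.unlock();
        }
    }

    /** Variant for callers that already hold the lock. */
    long liveThreadCountLocked() {
        if (!threadsLock.isHeldByCurrentThread()) {
            throw new IllegalStateException("caller must hold threadsLock");
        }
        return liveThreads - exitingThreads;
    }

    /** Outer entry point: takes the lock, then reads both counters. */
    long liveThreadCount() {
        threadsLock.lock();
        try {
            return liveThreadCountLocked();
        } finally {
            threadsLock.unlock();
        }
    }

    public static void main(String[] args) {
        LiveThreadCounters c = new LiveThreadCounters();
        c.threadStarted();
        c.threadStarted();
        c.currentThreadExiting();
        System.out.println(c.liveThreadCount());
        c.removeThread();
        System.out.println(c.liveThreadCount());
    }
}
```

ReentrantLock would tolerate being re-acquired by its holder, so the
split into a locked and an unlocked getter is not strictly needed in
this model. It is kept to mirror the point made above that the lock
has to be taken by the outer caller rather than inside the getter.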
> -JB-
>
>>
>> -Chris.
>>
>>>
>>> -JB-
>>>
>>>>
>>>> -Chris.
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Mandy
>>>
>
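
The test-side workaround suggested earlier in the thread (sleep, or
call getThreadCount() more than once) could be sketched as a bounded
polling loop. This is a minimal sketch, not the change that was made
to the test: ThreadCountPoller, awaitValue, the 10 ms poll interval
and the 5 second timeout are all hypothetical. Only
ManagementFactory.getThreadMXBean() and ThreadMXBean.getThreadCount()
are real APIs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.LongSupplier;

/** Hypothetical helper: poll a counter until it reaches an expected value. */
final class ThreadCountPoller {
    /**
     * Returns true once supplier reports expected, or false if the timeout
     * elapses or the calling thread is interrupted first.
     */
    static boolean awaitValue(LongSupplier supplier, long expected, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (supplier.getAsLong() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10); // assumed poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        int before = mbean.getThreadCount();

        Thread t = new Thread(() -> { });
        t.start();
        t.join(); // join() may return before the live count is decremented

        boolean settled = awaitValue(mbean::getThreadCount, before, 5_000);
        System.out.println("thread count settled: " + settled);
    }
}
```

Polling only hides the window in the test. It does not change what
getThreadCount() can report immediately after join() returns.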