RFR(S) 8021335: Missing synchronization when reading counters for live threads and peak thread count
David Holmes
david.holmes at oracle.com
Wed Oct 17 02:33:55 UTC 2018
Hi Dean,
Thanks for tackling this.
I'm still struggling to fully grasp why we need both the PerfCounters
and the regular counters. I get that we have to decrement the live
counts before ensure_join() has allowed Thread.join() to return, to
ensure that if we then check the number of threads it has dropped by
one. But I don't understand why that means we need to manage the thread
count in two parts. Particularly as now you don't use the PerfCounter to
return the live count, so it makes me wonder what role the PerfCounter
is playing as it is temporarily inconsistent with the reported live
count? Is it simply that we can't atomically decrement the PerfCounters,
and we can't require the Threads_lock when decrementing?
On the code itself one thing I find irksome is that we already have a
daemon flag in remove_thread but decrement_thread_counts doesn't get
passed that and so always re-examines the thread to see if it is a
daemon. I know only one of the code paths has the daemon flag, so
perhaps we should hoist the daemon check up into JavaThread::exit and
then pass a daemon parameter to decrement_thread_counts.
There's also some ambiguity (existing) in relation to the existence of
the jt->threadObj() and its affect on whether the thread is considered a
daemon or not. Your check has to mirror the existing checks - the
threadObj may be NULL at removal if it was a failed attach, but the
addition considers a JNI-attached thread to be non-daemon. Overall this
seems buggy to me but as long as your check follows the same rules that
is okay. In that regard JavaThread::exit should never encounter a NULL
threadObj().
Minor nit: I suggest moving the two occurrences of the comment
// Do not count VM internal or JVMTI agent threads
inside is_hidden_thread so that we don't have to remember to update the
comments if the definition changes.
Thanks,
David
-----
On 17/10/2018 9:43 AM, dean.long at oracle.com wrote:
> New webrev:
>
> http://cr.openjdk.java.net/~dlong/8021335/webrev.4/
>
> dl
>
> On 10/16/18 11:59 AM, dean.long at oracle.com wrote:
>> On 10/16/18 10:28 AM, JC Beyler wrote:
>>> Hi Dean,
>>>
>>> I noticed a typo here :
>>> "are atomically" is in the comment but should no longer be there I
>>> believe; the sentence probably should be:
>>> // These 2 counters are like the above thread counts, but are
>>>
>>
>> Fixed!
>>
>>> Any way we could move this test into a helper method and both
>>> add_thread/current_thread_exiting could use the same "ignore" thread
>>> code so that we ensure the two never get out of sync?
>>>
>>> + if (jt->is_hidden_from_external_view() ||
>>> + jt->is_jvmti_agent_thread()) {
>>> + return;
>>> + }
>>>
>>
>> Good idea. Fixed.
>>
>>> (Also by curiosity, add_thread has an assert on the Threads_lock but
>>> not thread_exiting?)
>>>
>>
>> Right, I followed the existing pattern where current_thread_exiting()
>> only uses the atomic counters, so it doesn't need Threads_lock.
>>
>> dl
>>
>>> Thanks,
>>> Jc
>>>
>>>
>>> On Tue, Oct 16, 2018 at 9:52 AM <dean.long at oracle.com
>>> <mailto:dean.long at oracle.com>> wrote:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8021335
>>> http://cr.openjdk.java.net/~dlong/8021335/webrev.3/
>>> <http://cr.openjdk.java.net/%7Edlong/8021335/webrev.3/>
>>>
>>> This change attempts to fix an old and intermittent problem with
>>> ThreadMXBean thread counts.
>>> Changes have 3 main parts:
>>> 1. Make sure hidden threads don't affect counts (even when exiting)
>>> 2. Fix race reading counts using atomic counters
>>> 3. Remove some workarounds in test (because they hide bugs)
>>>
>>> dl
>>>
>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Jc
>>
>
More information about the hotspot-dev
mailing list