RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
Jaroslav Bachorik
jaroslav.bachorik at oracle.com
Tue Jul 23 03:23:38 PDT 2013
On 07/23/2013 11:39 AM, David Holmes wrote:
> Sorry - I took a closer look at the full test rather than just
> that patch. We already have this code to try and help expose these
> intermittent failures:
>
> // Nightly testing showed some intermittent failure.
> // Check here to get diagnostic information if some strange
> // behavior occurs.
> checkThreadCount(expectedCount, current, 0);
Unfortunately, this does not get us any closer to the culprit.
By the time the code reaches the point of taking the thread dump, the
offending thread is gone. So all you learn is that something went
wrong.
-JB-
>
> but the sleep loop you added means this check will rarely fail so
> we won't get to see this unexpected behaviour happening. So this
> block of code could be deleted in my view. Though it is preferable
> to determine exactly why we fail!
>
> Also looking at the sleep() used elsewhere you may as well follow
> the same pattern and abort on interrupt as it isn't expected.
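To make the abort-on-interrupt suggestion concrete, here is a minimal
sketch of a polling sleep loop. It is not the code from the webrev;
the class name SleepLoopSketch, the method waitUntil, the 10 ms poll
interval and the timeout parameter are all assumptions made for
illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of ResetPeakThreadCount.java.
final class SleepLoopSketch {

    // Polls the condition until it holds or the timeout elapses.
    // Returns true if the condition became true, false on timeout.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up waiting
            }
            try {
                Thread.sleep(10); // assumed poll interval
            } catch (InterruptedException e) {
                // An interrupt is not expected here, so abort rather
                // than swallow it and keep looping.
                throw new RuntimeException("Unexpected interrupt while waiting", e);
            }
        }
        return true;
    }
}
```

A caller could wait for the thread count to settle with something like
waitUntil(() -> mbean.getThreadCount() == expected, 5000), where mbean
and expected are whatever the test already has at hand.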
>
> Finally with regard to Daniel's comment about the live array he is
> right that the volatile on the array is not sufficient in theory -
> a thread need never see the value of live[i] become false. There
> are a number of reasons why we are unlikely to see that in practice
> on hotspot. Using synchronized will fix that; or an alternative
> cancellation mechanism could be used.
>
> Cheers, David
>
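On the live[i] visibility point: declaring the array reference
volatile does not give its elements volatile semantics. One possible
alternative cancellation mechanism, sketched under assumptions, is to
keep the flags in an AtomicIntegerArray, whose get and set have
volatile read and write semantics per element. The class
LiveFlagsSketch and its methods are hypothetical and are not taken
from the test.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical illustration; not the code of ResetPeakThreadCount.java.
final class LiveFlagsSketch {
    // 1 = keep running, 0 = please stop. Each element is read and
    // written with volatile semantics, unlike a volatile boolean[].
    private final AtomicIntegerArray live;
    private final Thread[] threads;

    LiveFlagsSketch(int count) {
        live = new AtomicIntegerArray(count);
        threads = new Thread[count];
    }

    void start(int i) {
        live.set(i, 1);
        Thread t = new Thread(() -> {
            while (live.get(i) == 1) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return; // not expected; just exit
                }
            }
        });
        t.setDaemon(true);
        threads[i] = t;
        t.start();
    }

    // Clears the flag and waits for the worker to terminate.
    void stop(int i) {
        live.set(i, 0);
        try {
            threads[i].join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("Unexpected interrupt", e);
        }
    }

    boolean isAlive(int i) {
        return threads[i].isAlive();
    }
}
```

Guarding reads and writes of a plain boolean[] with synchronized, as
suggested above, would give the same visibility guarantee.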
> On 23/07/2013 7:19 PM, David Holmes wrote:
>> On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote:
>>> On 07/23/2013 10:19 AM, David Holmes wrote:
>>>> Hi Jaroslav,
>>>>
>>>> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote:
>>>>> The
>>>>> java/lang/management/ThreadMXBean/ResetPeakThreadCount.java
>>>>> test seems to be failing intermittently.
>>>>>
>>>>> The test checks the functionality of the
>>>>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does
>>>>> so by capturing the current value of
>>>>> "getPeakThreadCount()", starting a predefined number of the
>>>>> user threads, stopping them, resetting the stored peak
>>>>> value, and making sure the new peak equals the number of
>>>>> threads actually running.
>>>>>
>>>>> The main problem is that it is not possible to prevent the JVM
>>>>> from starting or stopping arbitrary system threads while
>>>>> executing the test. This might lead to small variations of the reported
>>>>> peak (a short-lived system thread is started while the
>>>>> batch of the user threads is running) or the expected
>>>>> number of running threads (again, a short-lived system
>>>>> thread is started at the moment the test asks for the
>>>>> number of running threads).
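The sequence described above can be sketched roughly as follows. This
is a simplified illustration and not the actual test: the class
PeakResetSketch, the method runOnce and the use of latches are
assumptions. Only the ThreadMXBean calls (getPeakThreadCount,
resetPeakThreadCount, getThreadCount) are the real API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration; not the code of ResetPeakThreadCount.java.
final class PeakResetSketch {

    // Returns { peak while workers ran, live count after reset,
    //           peak after reset }.
    static int[] runOnce(int workerCount) {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        CountDownLatch started = new CountDownLatch(workerCount);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] workers = new Thread[workerCount];
        try {
            for (int i = 0; i < workerCount; i++) {
                workers[i] = new Thread(() -> {
                    started.countDown();
                    try {
                        release.await(); // stay alive until released
                    } catch (InterruptedException e) {
                        // not expected; let the worker exit
                    }
                });
                workers[i].setDaemon(true);
                workers[i].start();
            }
            started.await(); // all workers are alive at this point
            int peakWhileRunning = mbean.getPeakThreadCount();

            release.countDown();
            for (Thread t : workers) {
                t.join();
            }

            mbean.resetPeakThreadCount();
            // Read the live count first: a thread started between the
            // two reads can then only raise the peak, never the
            // already-read live count.
            int liveAfterReset = mbean.getThreadCount();
            int peakAfterReset = mbean.getPeakThreadCount();
            return new int[] { peakWhileRunning, liveAfterReset, peakAfterReset };
        } catch (InterruptedException e) {
            throw new IllegalStateException("Unexpected interrupt", e);
        }
    }
}
```

A strict check would require the peak after the reset to equal the
live count exactly. A relaxed one would only require that the peak is
not below the live count and not far above it, which is the kind of
variation a short-lived system thread could cause.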
>>>>
>>>> Do you know what "system threads" these are? I would not
>>>> expect VM internal threads to be counted in
>>>> getPeakThreadCount(), but even if they are I can't think of
>>>> any short-lived threads that get created other than the
>>>> Signal handling thread.
>>>
>>> Unfortunately I don't. Capturing the thread dump at the moment
>>> of discovering the discrepancy seems to be too late. I tried
>>> monitoring the JVM under test with external tools, but that
>>> just brings more entropy to the result.
>>
>> We'd need to instrument the thread creation logic to keep a
>> separate record. Dtrace probes could probably do it - but the
>> problem is getting the test to fail.
>>
>>> I am completely relying on the JVM native thread accounting to
>>> be correct and accurate - that it reports the thread count peak
>>> based on the real data.
>>
>> The spec isn't clear but I would only expect these counters to
>> apply to Java threads not VM internal threads (compiler, gc etc).
>> So I'd really like to know what thread is messing up this count.
>>
>> David
>>
>>> -JB-
>>>
>>>>
>>>>> The patch does not fix those shortcomings as it is not
>>>>> really possible to do given the nature of the JVM threading
>>>>> system. Instead, it tries to relax the conditions while still
>>>>> maintaining the ability to detect functional problems, e.g.
>>>>> the peak decreasing without an explicit reset, or a false
>>>>> number of threads being reported.
>>>>>
>>>>> The webrev is at:
>>>>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00
>>>>
>>>> Seems reasonable.
>>>>
>>>> David
>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>>
>>>