RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Daniil Titov
daniil.x.titov at oracle.com
Thu Jun 4 23:01:04 UTC 2020
Hi Serguei,
> 201 private static void checkLiveThreads(int numNewThreads,
> 202 int numTerminatedThreads) {
> 203 int diff = numNewThreads - numTerminatedThreads;
> 204 long threadCount = mbean.getThreadCount();
> 205 long expectedThreadCount = prevLiveTestThreadCount + diff;
> 206 if (threadCount < expectedThreadCount) {
> 207 testFailed = true;
> When all threads are counted with mbean.getThreadCount() it is not clear
> there is no race with new non-tested threads creation. Is it possible?
> If so, then the check at line 206 is going to fail.
Even if some Internal (non-tested) threads are created the value mbean.getThreadCount() returns should be no less than the expected number of live test threads (please note that prevLiveTestThreadCount counts only *test* threads) that means that condition on line 206 will be evaluated to *false* and line 207 will not be executed and the test will pass.
--Best regards,
Daniil
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Thursday, June 4, 2020 at 3:03 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, Alex Menkov <alexey.menkov at oracle.com>, serviceability-dev <serviceability-dev at openjdk.java.net>
Subject: Re: RFR: 8131745: java/lang/management/ThreadMXBean/AllThreadIds.java still fails intermittently
Hi Daniil,
It is hard to be on top of all the details in these review rounds.
When all threads are counted with mbean.getThreadCount() it is not clear
there is no race with new non-tested threads creation. Is it possible?
If so, then the check at line 206 is going to fail.
201 private static void checkLiveThreads(int numNewThreads,
202 int numTerminatedThreads) {
203 int diff = numNewThreads - numTerminatedThreads;
204 long threadCount = mbean.getThreadCount();
205 long expectedThreadCount = prevLiveTestThreadCount + diff;
206 if (threadCount < expectedThreadCount) {
207 testFailed = true;
Thanks,
Serguei
On 6/3/20 20:42, Daniil Titov wrote:
Hi Alex,
Please review a new version of the webrev [1] that no longer uses waitTillEquals() method.
[1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.04/
[2] https://bugs.openjdk.java.net/browse/JDK-8131745
Thank you,
Daniil
On 6/3/20, 4:42 PM, "Alex Menkov" mailto:alexey.menkov at oracle.com wrote:
Hi again Daniil,
On 06/03/2020 16:31, Daniil Titov wrote:
> Hi Alex,
>
> Thanks for this suggestion. You are right, we actually don't need this waitForAllThreads() method.
>
> I will include this change in the new version of the webrev.
>
>> 207 int diff = numNewThreads - numTerminatedThreads;
>> 208 long threadCount = mbean.getThreadCount();
>> 209 long expectedThreadCount = prevLiveTestThreadCount + diff;
>> 210 if (threadCount < expectedThreadCount) {
>> if some internal thread terminates, we'll get failure here
>
> The failure will not happen. Please note that prevLiveTestThreadCount counts only *test* threads. Thus even if some Internal threads terminated the value mbean.getThreadCount() returns should still be no less than the expected number of live test threads.
>
> 310 prevLiveTestThreadCount = getTestThreadCount();
Oh, yes, I missed it.
LGTM.
--alex
>
> Best regards,
> Daniil
>
>
> On 6/3/20, 3:08 PM, "Alex Menkov" mailto:alexey.menkov at oracle.com wrote:
>
> Hi Daniil,
>
> couple notes:
>
> 198 waitForThreads(numNewThreads, numTerminatedThreads);
>
> You don't actually need any wait here.
> Test cases wait until all threads are in desired state
> (checkAllThreadsAlive uses startupCheck, checkDaemonThreadsDead and
> checkAllThreadsDead use join())
>
>
> 205 private static void checkLiveThreads(int numNewThreads,
> 206 int numTerminatedThreads) {
> 207 int diff = numNewThreads - numTerminatedThreads;
> 208 long threadCount = mbean.getThreadCount();
> 209 long expectedThreadCount = prevLiveTestThreadCount + diff;
> 210 if (threadCount < expectedThreadCount) {
>
> if some internal thread terminates, we'll get failure here
>
>
> --alex
>
> On 06/02/2020 21:00, Daniil Titov wrote:
> > Hi Alex, Serguei, and Martin,
> >
> > Thank you for your comments. Please review a new version of the fix that addresses them, specifically:
> > 1) Replaces a double loop in checkAllThreadsAlive() with a code that uses collections and containsAll() method.
> > 2) Restores the checks for other ThreadMXBean methods (getThreadCount(), getTotalStartedThreadCount(), getPeakThreadCount()) but with more relaxed conditions.
> > 3) Relaxes the check inside checkThreadIds() method
> >
> >
> > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.03/
> > [2] https://bugs.openjdk.java.net/browse/JDK-8131745
> >
> > Thank you,
> > Daniil
> >
> > On 6/1/20, 5:06 PM, "Alex Menkov" mailto:alexey.menkov at oracle.com wrote:
> >
> > Hi Daniil,
> >
> > 1. before the fix checkLiveThreads() tested
> > ThreadMXBean.getThreadCount(), but now as far as I see it tests
> > Thread.getAllStackTraces();
> >
> > 2.
> > 237 private static void checkThreadIds() throws InterruptedException {
> > 238 long[] list = mbean.getAllThreadIds();
> > 239
> > 240 waitTillEquals(
> > 241 list.length,
> > 242 ()->(long)mbean.getThreadCount(),
> > 243 "Array length returned by " +
> > 244 "getAllThreadIds() = %1$d not matched count =
> > ${provided}",
> > 245 ()->list.length
> > 246 );
> > 247 }
> >
> > I suppose purpose of waitTillEquals() is to handle creation/termination
> > of VM internal threads.
> > But if some internal thread terminates after mbean.getAllThreadIds() and
> > before 1st mbean.getThreadCount() call and then VM does not need to
> > restart it, waitTillEquals will wait forever.
> >
> > --alex
> >
> >
> > On 05/29/2020 16:28, Daniil Titov wrote:
> > > Hi Alex and Serguei,
> > >
> > > Please review a new version of the change [1] that makes sure that the test counts
> > > only the threads it creates and ignores Internal threads VM might create or destroy.
> > >
> > > Testing: Running this test in Mach5 with Graal on several hundred times ,
> > > tier1-tier3 tests are in progress.
> > >
> > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.02/
> > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745
> > >
> > > Thank you,
> > > Daniil
> > >
> > > On 5/22/20, 10:26 AM, "Alex Menkov" mailto:alexey.menkov at oracle.com wrote:
> > >
> > > Hi Daniil,
> > >
> > > I'm not sure all this retry logic is a good way.
> > > As mentioned in jira the most important part of the testing is ensuring
> > > that you find all the created threads when they are alive, and you don't
> > > find them when they are dead. The actual thread count checking is not
> > > that important.
> > > I agree with this and I'd just simplify the test by removing checks for
> > > thread count. VM may create and destroy internal threads when it needs it.
> > >
> > > --alex
> > >
> > > On 05/18/2020 10:31, Daniil Titov wrote:
> > > > Please review the change [1] that fixes an intermittent failure of the test.
> > > >
> > > > This test creates and destroys a given number of daemon/user threads and validates the count of those started/stopped threads against values returned from ThreadMXBean thread counts. The problem here is that if some internal threads is started ( e.g. " HotSpotGraalManagement Bean Registration"), or destroyed (e.g. "JVMCI CompilerThread ") the test hangs waiting for expected number of live threads.
> > > >
> > > > The fix limits the time the test is waiting for desired number of live threads and in case if this limit is exceeded the test repeats itself.
> > > >
> > > > Testing. Test with Graal on and Mach5 tier1-tier7 test passed.
> > > >
> > > > [1] http://cr.openjdk.java.net/~dtitov/8131745/webrev.01
> > > > [2] https://bugs.openjdk.java.net/browse/JDK-8131745
> > > >
> > > > Thank you,
> > > > Daniil
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>
More information about the serviceability-dev
mailing list