RFR(S): 8211932: [ppc][testbug] runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads don't terminate immediately
David Holmes
david.holmes at oracle.com
Wed Oct 10 12:31:47 UTC 2018
Hi Goetz,
On 10/10/2018 8:25 PM, Lindenmaier, Goetz wrote:
> Hi David,
>
> This failure is very well reproducible, but only on linuxppc64 and linuxppc64le.
That doesn't really make sense to me. I would not expect the
process/thread lifecycle management code to be different based on the
CPU involved. This should be a simple kernel + NPTL/libc issue.
> I implemented this fix in July, just missed the RDP, and the patch is used
> in our nightly builds since then. Since that date I don't see a single
> failure. We run these nightly tests with the fastdebug build, though.
> But linuxx86_64, linuxs390x don't show the issue, nor all the other
> platforms. As there is no special high load, and because it's that
> well reproducible, I don't think I read the information of a thread of another
> process with the same thread id.
> With the output I implemented in the test, I see that the cpu time keeps
> increasing a bit, then it's stable for a few iterations, and then -1.
That can also be explained by a thread-id being recycled and then the
new thread also terminating. Granted, the timing and reproducibility
make that unlikely.
This is quite bizarre and I don't like bizarre. :)
Are you able to apply this patch to the test and run some tests on ppc?
     if ((res = pthread_join(thread, NULL)) != 0) {
         fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n",
                 strerror(res), res);
         exit(1);
     }
+    while (pthread_kill(thread, 0) == 0) {
+        res++;
+    }
+    printf("Native thread was gone after %d iterations\n", res);
     return nativeThread;
 }
Once pthread_kill gives ESRCH then so should pthread_getcpuclockid().
At least until the thread-id is recycled.
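
If the unbounded loop in the patch is a concern on a machine where the
thread never disappears, a capped variant of the same counting idea could
look roughly like the sketch below. The names thread_probe_fn,
count_polls_until_gone and fake_probe are hypothetical and not part of the
test; fake_probe only stands in for a real probe such as
pthread_kill(thread, 0) so the counting logic can be exercised on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical probe type: returns 0 while the thread is still visible,
 * or an errno value such as ESRCH once it is gone. */
typedef int (*thread_probe_fn)(void *ctx);

/* Polls `probe` until it returns nonzero or `max_iterations` polls have
 * all reported the thread as still visible. Returns the number of polls
 * that saw the thread; the last probe result is stored in *last_result
 * (0 means the cap was hit and the thread was still visible). */
static int count_polls_until_gone(thread_probe_fn probe, void *ctx,
                                  int max_iterations, int *last_result) {
    assert(probe != NULL && last_result != NULL);
    int iterations = 0;
    *last_result = 0;
    while (iterations < max_iterations) {
        *last_result = probe(ctx);
        if (*last_result != 0) {
            break;              /* e.g. ESRCH: thread no longer found */
        }
        iterations++;
    }
    return iterations;
}

/* Stand-in probe for illustration only: reports "alive" for the first
 * *remaining calls, then ESRCH. A real probe would wrap
 * pthread_kill(thread, 0) instead. */
static int fake_probe(void *ctx) {
    int *remaining = (int *)ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return ESRCH;
}
```

Keep in mind that POSIX leaves the use of a thread ID after a successful
pthread_join undefined, so a real probe of this kind is only a diagnostic
aid and not something the test should rely on.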
Thanks,
David
> Best regards,
> Goetz.
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Mittwoch, 10. Oktober 2018 01:22
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; hotspot-runtime-
>> dev at openjdk.java.net
>> Subject: Re: RFR(S): 8211932: [ppc][testbug]
>> runtime/jni/terminatedThread/TestTerminatedThread.java fails as threads
>> don't terminate immediately
>>
>> Hi Goetz,
>>
>> There is already an open bug for this issue - JDK-8208159 - but it has
>> only reproduced in a stress environment where we think thread-id's are
>> being recycled (which means waiting longer won't help). This should be
>> OS not CPU specific so I'm very interested to know in what circumstances
>> you see this failure.
>>
>> I created an instrumented version of the test that did a pthread_kill on
>> the target to check for ESRCH - which it got - yet we still see failures
>> in those stress environments.
>>
>> David
>>
>> On 10/10/2018 1:10 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> On ppc, one still sees increasing thread cpu times after a thread has joined.
>>> This makes TestTerminatedThread fail.
>>>
>>> This change gives the check a few seconds to wait until the thread
>> disappears.
>>> Please review.
>>> http://cr.openjdk.java.net/~goetz/wr18/8211931-
>> terminatedThrd/01/test/hotspot/jtreg/runtime/jni/terminatedThread/TestT
>> erminatedThread.java.udiff.html
>>>
>>> Best regards,
>>> Goetz.
>>>