RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
Chris Plummer
chris.plummer at oracle.com
Thu Jul 5 23:00:39 UTC 2018
Hi David,
Looks good. Regarding the test being in a package, looks like this was
the convention for the nsk tests, so that's why I noted it.
thanks,
Chris
On 7/5/18 3:40 PM, David Holmes wrote:
> Hi Chris,
>
> Thanks for looking at this.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>
> Only real changes in ji05t001.c. (And fixed typo in the new test)
>
> More below ...
>
> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>> Hi David,
>>
>> Solaris problems aside, overall it looks fine. Some minor things I
>> noted:
>>
>> I noticed that exitCode is never modified in agentA() or agentB(), so
>> there isn't much point to having it. If you reach the bottom of the
>> function, it passed, so PASSED can be returned. The code would be
>> more clear if it did this. As-is it is implied that you can reach the
>> bottom when it fails.
>
> I resisted any and all urges to do any kind of unrelated code cleanup
> in the tests - once you start you may end up doing a full rewrite.
>
>> Is detaching the threads along the failure paths really needed?
>> exit() is called, so this would seem to make it unnecessary.
>
> You're right that isn't necessary. I'll remove the changes from before
> the exits in ji05t001.c
>
>> I prefer assignments not to be embedded inside the "if" condition.
>> The DetachCurrentThread code in THREAD_return() is much more readable
>> than the similar code in agentA() and agentB().
>
> It's an existing style already used in that test e.g.
>
> 287 if ((res =
> 288 JNI_ENV_PTR(vm)->AttachCurrentThread(
> 289 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) !=
> 0) {
>
> and I don't mind it, so I'd prefer not to change it.
>
>> In the test:
>>
>> 54 // Generally as long as we don't crash of throw unexpected
>> 55 // exceptions then the test passes. In some cases we
>> know exactly
>>
>> "of" should be "or".
>
> Well spotted. Thanks.
>
>> Shouldn't you be catching exceptions for all the Thread methods you
>> are calling? Otherwise the test will exit if one is thrown, and the
>> above comment indicates that you don't want this.
>
> I'm not expecting there to be any exceptions from any of the called
> methods. That would potentially indicate a problem in handling the
> terminated native thread, so would indicate a test failure.
>
>> Don't we normally put these tests in a package?
>
> Doesn't seem to be any hard and fast rule. I only uses packages when
> they are important for the test. In runtime we have 905 java files and
> only 116 have a package statement. It varies elsewhere.
>
> Thanks,
> David
>
>> thanks,
>>
>> Chris
>>
>> On 7/5/18 2:58 AM, David Holmes wrote:
>>> <sigh> Solaris compiler complains about doing a return from inside a
>>> do-while loop. I'll have to rework part of the fix tomorrow.
>>>
>>> David
>>>
>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>
>>>> Problem:
>>>>
>>>> The tests create native threads that attach to the VM through
>>>> JNI_AttachCurrentThread but which then terminate without detaching
>>>> themselves. When the VM exits and we're using Flight Recorder
>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part
>>>> wants to print the per-thread CPU usage. When we encounter the
>>>> threads that have terminated already the low level
>>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't
>>>> expect that and so fails an assert in debug mode and can SEGV in
>>>> product mode.
>>>>
>>>> Solution:
>>>>
>>>> Serviceability-side: fix the tests
>>>>
>>>> Change the tests so that the threads detach before terminating. The
>>>> two tests are (surprisingly) written in completely different
>>>> styles, so the solution also takes on two different styles.
>>>>
>>>> Runtime-side: make the VM more robust in the fact of JNI attached
>>>> threads that terminate before detaching, and add a regression test
>>>>
>>>> I took a good look at the low-level code for interacting with
>>>> arbitrary threads and as far as I can see the problem only exists
>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the
>>>> potential for a library call failure just reports an error value
>>>> (such as -1 for the cpu time used).
>>>>
>>>> So the fix is simply to allow for ESRCH when calling
>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>
>>>> I created a new regression test to create a new native thread,
>>>> attach it and then let it terminate while still attached. The java
>>>> code then calls various Thread and ThreadMXBean functions on it to
>>>> ensure there are no crashes or unexpected exceptions.
>>>>
>>>> Testing:
>>>> - old tests with fixed run-time
>>>> - old run-time with fixed tests
>>>> - mach tier4 (which exposed the problem - that's where we enable
>>>> Flight recorder for the tests) [in progress]
>>>> - mach5 tier 1-3 for good measure [in progress]
>>>> - new regression test
>>>>
>>>> Thanks,
>>>> David
>>
>>
>>
More information about the serviceability-dev
mailing list