RFR: JDK-8051349: nsk/jvmti/scenarios/sampling/SP06/sp06t003 fails in nightly
Chris Plummer
chris.plummer at oracle.com
Mon Dec 17 23:38:43 UTC 2018
On 12/17/18 2:30 PM, David Holmes wrote:
> On 18/12/2018 8:22 am, Chris Plummer wrote:
>> I think you need to set interruptReady inside the synchronized block.
>> Otherwise checkReady() can still get to the interrupt() call before
>> testedMethod() gets to the wait().
>
> That is not an issue. You can interrupt at any time independent of
> owning the monitor. The interrupt remains pending until the thread
> explicitly clears it, or an API, like wait() sees it, throws IE and
> clears it.
Ok. In that case LGTM.
Chris
>
> David
>
>> Chris
>>
>> On 12/15/18 12:24 PM, gary.adams at oracle.com wrote:
>>> Updated webrev:
>>>
>>> http://cr.openjdk.java.net/~gadams/8051349/webrev.02/index.html
>>>
>>> On 12/15/18 8:15 AM, gary.adams at oracle.com wrote:
>>>> On 12/14/18 4:43 PM, Chris Plummer wrote:
>>>>> On 12/14/18 5:28 AM, Gary Adams wrote:
>>>>>
>>>>> BTW, I forgot to ask if you can fix the typo in the following
>>>>> comment:
>>>>>
>>>>> 345 /* check frame equalaty */
>>>>
>>>> Yes, I fixed that one on Fri.
>>>> I usually save the typos for the final webrev.
>>>>
>>>>>> On 12/13/18, 11:51 AM, Chris Plummer wrote:
>>>>>>> Hi Gary,
>>>>>>>
>>>>>>> Unfortunately GetStackTrace() returns the top frame as the first
>>>>>>> frame in the array. Thus if the thread is not suspended and
>>>>>>> later you call GetFrameLocation() for some depth, there's no way
>>>>>>> to make sure the frame at that depth is the same frame at that
>>>>>>> depth in the array returned by GetStackTrace() unless you first
>>>>>>> suspend the frame before calling GetFrameLocation(). You would
>>>>>>> also need to call GetFrameCount() while suspended, compare that
>>>>>>> with the number of frames returned by GetStackTrace(), and make
>>>>>>> any needed depth adjustments. But since I think the intent is to
>>>>>>> not suspend the thread here, it probably does not make sense to
>>>>>>> do that, and instead accept that there might be some errors as
>>>>>>> you have done.
>>>>>> I'm not totally comfortable with these attempted operations and then
>>>>>> bypassing the test failure if the threads were not suspended.
>>>>>> Would it be better to only perform check thread operations on the
>>>>>> suspended threads and to totally skip the operations that can not
>>>>>> be safely performed when the thread is not suspended.
>>>>> I was starting to think along those lines. What's the point of
>>>>> testing something if you know it can fail and you can't
>>>>> distinguish between expected and unexpected failure. The other
>>>>> option is to suspend like I describe above to make sure you can
>>>>> accurately verify the stack trace. I'm just not sure if it makes
>>>>> sense to do that. Why is the test run unsuspended in the first place?
>>>> The test runs several separate threads and each is presenting
>>>> a different scenario. It probably is good to try the various functions
>>>> about stack trace, frame counts and locations on suspended and resumed
>>>> threads. It is just a bit harder to determine what a failed condition
>>>> would be for the running threads.
>>>>
>>>>>>>
>>>>>>> One improvement I would like to see for your fix is to only
>>>>>>> ignore JVMTI_ERROR_NO_MORE_FRAMES rather than ignore all
>>>>>>> failures when suspended == NSK_TRUE.
>>>>>> The current layering of the macros and verify routines do not
>>>>>> lend themselves
>>>>>> to this sort of selective error checking. I'd probably file a
>>>>>> follow up bug to
>>>>>> address specific error checking.
>>>>> Can't you just bypass the use of NSK_VERIFY? Isn't JC getting rid
>>>>> of it anyway and using his JNI exception mechanism? In that case
>>>>> you would need to use the version that does not fail if there was
>>>>> an exception.
>>>> I do not know what set of error returns would be acceptable
>>>> or how to express that in the new mechanism. That's why I suggested
>>>> addressing it in a follow up bug. It simply would narrow the range
>>>> of error returns allowed.
>>>>>>>
>>>>>>> Also, I don't think you want the suspended check here:
>>>>>>>
>>>>>>> 348 /* check if expected method frame found */
>>>>>>> 349 if ((suspended == NSK_TRUE) && (found <= 0)) {
>>>>>>>
>>>>>>> The check for finding the method is:
>>>>>>>
>>>>>>> 341 if (frameStack[j].method ==
>>>>>>> threadsDesc[i].method) {
>>>>>>>
>>>>>>> Since frameStack[] is returned by GetStackTrace(), it is not
>>>>>>> impacted when not suspended, and the expected method should
>>>>>>> always be in frameStack[] somewhere. The issue is only with
>>>>>>> using GetFrameLocation() to correlate what is in the result of
>>>>>>> GetStackTrace().
>>>>>> Agreed. The early continuation was bypassing the found method
>>>>>> checking.
>>>>>>
>>>>>> BTW, another failure has been detected in sp06t001. This time the
>>>>>> threads are suspended,
>>>>>> but I believe there is a race between thread start and the call
>>>>>> to interrupt() the thread.
>>>>>> I think there may be some confusion over which thread is invoking
>>>>>> the interrupt() call.
>>>>>> It is running on the main thread from the code after a call to
>>>>>> start the thread, but the
>>>>>> thread may not have run when the interrupt is requested.
>>>>>>
>>>>>> public class sp06t001 extends DebugeeClass {
>>>>>> ...
>>>>>>
>>>>>> // create threads list
>>>>>> threads = new sp06t001Thread[] {
>>>>>> new sp06t001ThreadRunning("threadRunning", log),
>>>>>> new sp06t001ThreadEntering("threadEntering", log),
>>>>>> new sp06t001ThreadWaiting("threadWaiting", log),
>>>>>> new sp06t001ThreadSleeping("threadSleeping", log),
>>>>>> new
>>>>>> sp06t001ThreadRunningInterrupted("threadRunningInterrupted", log),
>>>>>> new
>>>>>> sp06t001ThreadRunningNative("threadRunningNative", log)
>>>>>> };
>>>>>>
>>>>>> // run threads
>>>>>> log.display("Starting tested threads");
>>>>>> try {
>>>>>> synchronized (endingMonitor) {
>>>>>> // start threads (except first one)
>>>>>> for (int i = 0; i < threads.length; i++) {
>>>>>> threads[i].start();
>>>>>> if (!threads[i].checkReady()) {
>>>>>> throw new Failure("Unable to prepare
>>>>>> thread ..."
>>>>>> ...
>>>>>>
>>>>>> class sp06t001ThreadRunningInterrupted extends sp06t001Thread {
>>>>>> ...
>>>>>> public boolean checkReady() {
>>>>>> // interrupt thread on wait()
>>>>>> synchronized (waitingMonitor) {
>>>>>> interrupt();
>>>>>> }
>>>>>> return true;
>>>>>> }
>>>>> Certainly checkReady() could be called before the target thread
>>>>> has started the wait(). Then the thread does the wait() and
>>>>> probably never exits (or maybe the early interrupted is resulting
>>>>> in some other unexpected behavior).
>>>> If we get to the wait then the framecount would be fine.
>>>>
>>>> The debuggee class launches 6 threads. The agent is synced with the
>>>> debuggee
>>>> starting. I don't see the native agent synchronizing on the 6
>>>> started threads,
>>>> when the threads are suspended and checked.
>>>>
>>>> As far as I can tell the call to interrupted() just sets a flag and
>>>> it is up to the thread
>>>> itself to decide how to interpret the isInterrupted() state.
>>>>>
>>>>> I think you need synchronize between the two threads. What if the
>>>>> target thread does a synchronized (waitingMonitor), sets a flag,
>>>>> and then a wait(). The main thread sleeps until the flag is set,
>>>>> does a synchronized(waitingMonitor), and then the interrupt().
>>>> Since the test is already using a waiting monitor, I do not want to
>>>> disturb how it is being used in setting up the interrupted thread
>>>> test case.
>>>> The test is also using a "volatile boolean threadReady" for signalling
>>>> conditions in the test.
>>>>
>>>> On Fri I was experimenting with another boolean flag to delay the
>>>> checkReady until the testMethod was at the waiting monitor,
>>>> e.g. ready to be interrupted. It seems to work but I was waiting
>>>> for enough testruns before posting a revised webrev.
>>>>
>>>>>
>>>>> Chris
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 12/13/18 5:25 AM, Gary Adams wrote:
>>>>>>>> While testing I ran into another of the related failures that were
>>>>>>>> associated with the original bug.
>>>>>>>> The following fake exception stacktrace is for failure analysis.
>>>>>>>> nsk.share.Fake_Exception_for_RULE_Creation: (sp02t003.cpp:313)
>>>>>>>> jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod,
>>>>>>>> &qLocation)
>>>>>>>> at nsk_lvcomplain(nsk_tools.cpp:172)
>>>>>>>> # ERROR: sp02t003.cpp, 313:
>>>>>>>> jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod,
>>>>>>>> &qLocation)
>>>>>>>> # jvmti error: code=31, name=JVMTI_ERROR_NO_MORE_FRAMES
>>>>>>>> - sp02t003.cpp, 310: 3 frame: method: 0x7f225042b2d8,
>>>>>>>> location: 16
>>>>>>>> # ERROR: sp02t003.cpp, 313:
>>>>>>>> jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod,
>>>>>>>> &qLocation)
>>>>>>>> # jvmti error: code=31, name=JVMTI_ERROR_NO_MORE_FRAMES
>>>>>>>> # ERROR: sp02t003.cpp, 350: No expected method frame for not
>>>>>>>> suspended thread #4 (threadRunningInterrupted)
>>>>>>>> In this particular failure, the GetFrameLocation call failed,
>>>>>>>> because the frame was no longer
>>>>>>>> accessible.
>>>>>>>>
>>>>>>>> If the threads are not suspended, the GetFrameLocation may fail,
>>>>>>>> or if it passes, the qMethod and qLocation could belong to another
>>>>>>>> frame.
>>>>>>>>
>>>>>>>> I've updated the webrev to allow the call to GetFrameLocation,
>>>>>>>> but to only
>>>>>>>> report an error if the threads are suspended. Similarly, the
>>>>>>>> checks to compare
>>>>>>>> qMethod and qLocation will be skipped, if the threads are not
>>>>>>>> suspended.
>>>>>>>> And the final comparison that the method was found will only be
>>>>>>>> an error
>>>>>>>> if the threads are suspended.
>>>>>>>>
>>>>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8051349/webrev.01/
>>>>>>>>
>>>>>>>> On 12/12/18, 11:54 AM, Gary Adams wrote:
>>>>>>>>> After some testing with nsk verbose messages enabled,
>>>>>>>>> it is clear that this test is failing in checkThreads when the
>>>>>>>>> location did not match between the call to GetStackTrace
>>>>>>>>> and GetFrameLocation. For the tests that are run when the threads
>>>>>>>>> have not been suspended, there is no guarantee the locations
>>>>>>>>> will match.
>>>>>>>>>
>>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8051349
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8051349/webrev.00/
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
More information about the serviceability-dev
mailing list