RFR: JDK-8051349: nsk/jvmti/scenarios/sampling/SP06/sp06t003 fails in nightly

Sat Dec 15 20:24:03 UTC 2018

Updated webrev:

   http://cr.openjdk.java.net/~gadams/8051349/webrev.02/index.html

On 12/15/18 8:15 AM, gary.adams at oracle.com wrote:
> On 12/14/18 4:43 PM, Chris Plummer wrote:
>> On 12/14/18 5:28 AM, Gary Adams wrote:
>>
>> BTW, I forgot to ask if you can fix the typo in the following comment:
>>
>>  345             /* check frame equalaty */
>
> Yes, I fixed that one on Fri.
> I usually save the typos for the final webrev.
>
>>> On 12/13/18, 11:51 AM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> Unfortunately GetStackTrace() returns the top frame as the first 
>>>> frame in the array. Thus if the thread is not suspended and later 
>>>> you call GetFrameLocation() for some depth, there's no way to make 
>>>> sure the frame at that depth is the same frame at that depth in the 
>>>> array returned by GetStackTrace() unless you first suspend the 
>>>> frame before calling GetFrameLocation(). You would also need to 
>>>> call GetFrameCount() while suspended, compare that with the number 
>>>> of frames returned by GetStackTrace(), and make any needed depth 
>>>> adjustments. But since I think the intent is to not suspend the 
>>>> thread here, it probably does not make sense to do that, and 
>>>> instead accept that there might be some errors as you have done.
>>> I'm not totally comfortable with these attempted operations and then
>>> bypassing the test failure if the threads were not suspended.
>>> Would it be better to only perform check  thread operations on the
>>> suspended threads and to totally skip the operations that can not
>>> be safely performed when the thread is not suspended.
>> I was starting to think along those lines. What's the point of 
>> testing something if you know it can fail and you can't distinguish 
>> between expected and unexpected failure. The other option is to 
>> suspend like I describe above to make sure you can accurately verify 
>> the stack trace. I'm just not sure if it makes sense to do that. Why 
>> is the test run unsuspended in the first place?
> The test runs several separate threads and each is presenting
> a different scenario. It probably is good to try the various functions
> about stack trace, frame counts and locations on suspended and resumed
> threads. It is just a bit harder to determine what a failed condition
> would be for the running threads.
>
>>>>
>>>> One improvement I would like to see for your fix is to only ignore 
>>>> JVMTI_ERROR_NO_MORE_FRAMES rather than ignore all failures when 
>>>> suspended == NSK_TRUE.
>>> The current layering of the macros and verify routines do not lend 
>>> themselves
>>> to this sort of selective error checking. I'd probably file a follow 
>>> up bug to
>>> address specific error checking.
>> Can't you just bypass the use of NSK_VERIFY? Isn't JC getting rid of 
>> it anyway and using his JNI exception mechanism? In that case you 
>> would need to use the version that does not fail if there was an 
>> exception.
> I do not know what set of error returns would be acceptable
> or how to express that in the new mechanism. That's why I suggested
> addressing it in a follow up bug. It simply would narrow the range
> of error returns allowed.
>>>>
>>>> Also, I don't think you want the suspended check here:
>>>>
>>>>  348         /* check if expected method frame found */
>>>>  349         if ((suspended == NSK_TRUE) && (found <= 0)) {
>>>>
>>>> The check for finding the method is:
>>>>
>>>>  341             if (frameStack[j].method == threadsDesc[i].method) {
>>>>
>>>> Since frameStack[] is returned by GetStackTrace(), it is not 
>>>> impacted when not suspended, and the expected method should always 
>>>> be in frameStack[] somewhere. The issue is only with using 
>>>> GetFrameLocation() to correlate what is in the result of 
>>>> GetStackTrace().
>>> Agreed. The early continuation was bypassing the found method checking.
>>>
>>> BTW, another failure has been detected in sp06t001. This time the 
>>> threads are suspended,
>>> but I believe there is a race between thread start and the call to 
>>> interrupt() the thread.
>>> I think there may be some confusion over which thread is invoking 
>>> the interrupt() call.
>>> It is running on the main thread from the code after a call to start 
>>> the thread, but the
>>> thread may not have run when the interrupt is requested.
>>>
>>> public class sp06t001 extends DebugeeClass {
>>> ...
>>>
>>>         // create threads list
>>>         threads = new sp06t001Thread[] {
>>>             new sp06t001ThreadRunning("threadRunning", log),
>>>             new sp06t001ThreadEntering("threadEntering", log),
>>>             new sp06t001ThreadWaiting("threadWaiting", log),
>>>             new sp06t001ThreadSleeping("threadSleeping", log),
>>>             new 
>>> sp06t001ThreadRunningInterrupted("threadRunningInterrupted", log),
>>>             new sp06t001ThreadRunningNative("threadRunningNative", log)
>>>         };
>>>
>>>         // run threads
>>>         log.display("Starting tested threads");
>>>         try {
>>>             synchronized (endingMonitor) {
>>>                 // start threads (except first one)
>>>                 for (int i = 0; i < threads.length; i++) {
>>>                     threads[i].start();
>>>                     if (!threads[i].checkReady()) {
>>>                         throw new Failure("Unable to prepare thread ..."
>>> ...
>>>
>>> class sp06t001ThreadRunningInterrupted extends sp06t001Thread {
>>> ...
>>>     public boolean checkReady() {
>>>         // interrupt thread on wait()
>>>         synchronized (waitingMonitor) {
>>>             interrupt();
>>>         }
>>>         return true;
>>>     }
>> Certainly checkReady() could be called before the target thread has 
>> started the wait(). Then the thread does the wait() and probably 
>> never exits (or maybe the early interrupted is resulting in some 
>> other unexpected behavior).
> If we get to the wait then the framecount would be fine.
>
> The debuggee class launches 6 threads. The agent is synced with the 
> debuggee
> starting. I don't see the native agent synchronizing on the 6 started 
> threads,
> when the threads are suspended and checked.
>
> As far as I can tell the call to interrupted() just sets a flag and it 
> is up to the thread
> itself to decide how to interpret the isInterrupted() state.
>>
>> I think you need synchronize between the two threads. What if the 
>> target thread does a synchronized (waitingMonitor), sets a flag, and 
>> then a wait(). The main thread sleeps until the flag is set, does a 
>> synchronized(waitingMonitor), and then the interrupt().
> Since the test is already using a waiting monitor, I do not want to
> disturb how it is being used in setting up the interrupted thread test 
> case.
> The test is also using a "volatile boolean threadReady" for signalling
> conditions in the test.
>
> On Fri I was experimenting with another boolean flag to delay the
> checkReady until the testMethod was at the waiting monitor,
> e.g. ready to be interrupted. It seems to work but I was waiting
> for enough testruns before posting a revised webrev.
>
>>
>> Chris
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 12/13/18 5:25 AM, Gary Adams wrote:
>>>>> While testing I ran into another of the related failures that were
>>>>> associated with the original bug.
>>>>> The following fake exception stacktrace is for failure analysis.
>>>>> nsk.share.Fake_Exception_for_RULE_Creation: (sp02t003.cpp:313) jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod, &qLocation)
>>>>> 	at nsk_lvcomplain(nsk_tools.cpp:172)
>>>>> # ERROR: sp02t003.cpp, 313: jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod, &qLocation)
>>>>> #   jvmti error: code=31, name=JVMTI_ERROR_NO_MORE_FRAMES
>>>>> - sp02t003.cpp, 310:       3 frame: method: 0x7f225042b2d8, location: 16
>>>>> # ERROR: sp02t003.cpp, 313: jvmti->GetFrameLocation(threadsDesc[i].thread, j, &qMethod, &qLocation)
>>>>> #   jvmti error: code=31, name=JVMTI_ERROR_NO_MORE_FRAMES
>>>>> # ERROR: sp02t003.cpp, 350: No expected method frame for not suspended thread #4 (threadRunningInterrupted)
>>>>> In this particular failure, the GetFrameLocation call failed, 
>>>>> because the frame was no longer
>>>>> accessible.
>>>>>
>>>>> If the threads are not suspended, the GetFrameLocation may fail,
>>>>> or if it passes, the qMethod and qLocation could belong to another
>>>>> frame.
>>>>>
>>>>> I've updated the webrev to allow the call to GetFrameLocation, but 
>>>>> to only
>>>>> report an error if the threads are suspended. Similarly, the 
>>>>> checks to compare
>>>>> qMethod and qLocation will be skipped, if the threads are not 
>>>>> suspended.
>>>>> And the final comparison that the method was found will only be an 
>>>>> error
>>>>> if the threads are suspended.
>>>>>
>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8051349/webrev.01/
>>>>>
>>>>> On 12/12/18, 11:54 AM, Gary Adams wrote:
>>>>>> After some testing with nsk verbose messages enabled,
>>>>>> it is clear that this test is failing in checkThreads when the
>>>>>> location did not match between the call to GetStackTrace
>>>>>> and GetFrameLocation. For the tests that are run when the threads
>>>>>> have not been suspended, there is no guarantee the locations
>>>>>> will match.
>>>>>>
>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8051349
>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8051349/webrev.00/
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20181215/18b46b5f/attachment-0001.html>