RFR: 6815126 intermittent SimulResumerTest.java failure
shanliang
shanliang.jiang at oracle.com
Wed Apr 2 08:01:58 UTC 2014
I hope to get this fix reviewed and pushed:
1) This is a fix for a bug labeled with "svc-nightly".
2) The current test is still useful. It cannot guarantee that it
exercises the bug JDK-6751643, but with its 2*10000 resume repetitions it
has a good chance of hitting the bug conditions. The failure this patch
fixes happened exactly under the conditions in which JDK-6751643 could occur.
3) There may be some way to implement the synchronization logic
between the thread invoking the operations and the thread resuming. One
option would be to add code to the method "resume" that waits for this
test, but I could not see an easy and practical way to do that.
4) We can create a new bug to fix this synchronization issue if necessary.
Thanks,
Shanliang
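
The race discussed in this thread, where the frame count changes between frames.size() and the following frames(start, length) call, can be illustrated with a minimal Java sketch. It does not use JDI. FakeThread, shrinkTo and checkedRead are hypothetical names invented for this example, and the "resume" is simulated by a callback that runs between the two calls instead of by a real concurrent debugger thread, so the outcome is deterministic. It only assumes that a range read fails with IndexOutOfBoundsException when the requested range no longer fits the stack.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FrameRaceSketch {

    /** Hypothetical stand-in for a suspended debuggee thread; not a JDI class. */
    static class FakeThread {
        private final List<String> stack =
                new ArrayList<>(Arrays.asList("yield", "run", "main"));

        synchronized int frameCount() {
            return stack.size();
        }

        /** Returns a copy of frames [start, start + length); throws if the range is invalid. */
        synchronized List<String> frames(int start, int length) {
            if (start < 0 || length < 0 || start + length > stack.size()) {
                throw new IndexOutOfBoundsException(
                        "range " + start + "+" + length + " exceeds " + stack.size() + " frames");
            }
            return new ArrayList<>(stack.subList(start, start + length));
        }

        /** Simulates a resume after which the thread is left with fewer frames. */
        synchronized void shrinkTo(int newSize) {
            int target = Math.max(newSize, 0);
            while (stack.size() > target) {
                stack.remove(stack.size() - 1);
            }
        }
    }

    /**
     * Check-then-act: read the size, then read a range computed from that size.
     * betweenCalls marks the window in which a concurrent resume could land.
     * Returns true if the range read succeeded, false if the size had gone stale.
     */
    static boolean checkedRead(FakeThread thr, Runnable betweenCalls) {
        int size = thr.frameCount();
        betweenCalls.run();
        try {
            thr.frames(0, size - 1);
            return true;
        } catch (IndexOutOfBoundsException e) {
            // The stack changed after frameCount() returned; treat as a lost race.
            return false;
        }
    }

    public static void main(String[] args) {
        FakeThread quiet = new FakeThread();
        System.out.println("no resume in the window: " + checkedRead(quiet, () -> { }));

        FakeThread racy = new FakeThread();
        System.out.println("resume in the window:    " + checkedRead(racy, () -> racy.shrinkTo(1)));
    }
}
```

Each individual call is synchronized here, yet the pair of calls is not atomic, which is why the stale size can still be used. Catching the exception is one way to tolerate a lost race; whether that matches the actual patch in the webrev is not something this sketch claims.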
shanliang wrote:
> Jaroslav Bachorik wrote:
>> Thanks Shanliang, it is clear now.
>>
>> The patch will get rid of the IOOBE but I have my doubts about what
>> the test actually tests. It is supposed to make sure that certain
>> operations will not throw NPE when the debugged thread is resumed
>> (from a concurrent debugger thread) before the operation has managed
>> to finish. However, there seems to be no synchronization logic
>> between the thread invoking the operations and the thread resuming
>> the paused debugged thread, relying only on hitting this condition by
>> chance.
>>
>> This test seems to be a good candidate for a thorough revision/rewrite.
> I am not sure how to make the check happen during a "resuming" window.
> The test creates 2 threads, each of which repeats "resume" 10000 times,
> and another thread repeats the check with a 100ms sleep in between,
> just hoping that some check will fall into a resuming window.
>
> Shanliang
>>
>> -JB-
>>
>> On 31.3.2014 11:26, shanliang wrote:
>>> Erik Gahlin wrote:
>>>> I would also like to understand this better.
>>> Possibly my previous reply was not clear enough or I missed something
>>> there.
>>>
>>> The test was written to test JDK-6751643, as I cited in my last mail. Here
>>> is the description from JDK-6751643, for which this test was developed:
>>> ------
>>> This bug can only occur if a debugger has multiple threads and calls any
>>> of the following methods in one thread while simultaneously resuming the
>>> same debuggee thread in a different debugger thread. Debuggers shouldn't
>>> do this because it is a race condition and the result returned by these
>>> methods will vary depending upon just where in the processing of these
>>> methods the resume takes effect. EG, the frameCount() method could
>>> return 6 in a case where the debuggee has already been resumed and there
>>> are no frames.
>>> ------
>>>
>>> To reproduce the bug, the test mainly did 2 things in different threads:
>>> 1) It received VM events and resumed the VM. This was done by the thread
>>> "Thread-1" in the class TestScaffold, which registered a listener and
>>> called the following method:
>>>     /**
>>>      * Events handled directly by scaffold always resume (well, almost always)
>>>      */
>>>     public void eventSetComplete(EventSet set) {
>>>         // The listener in connect(..) resumes after receiving our
>>>         // special VMDeathEvent. We can't also do the resume
>>>         // here or we will probably get a VMDisconnectedException
>>>         if (!containsOurVMDeathRequest(set)) {
>>>             traceln("TS: set.resume() called");
>>>             set.resume();
>>>         }
>>>     }
>>>
>>> 2) It called the method "check" in the class SimulResumerTarg to see
>>> whether a NullPointerException was thrown. The thread name was "test
>>> resumer" (would "checking thread" be a better name?)
>>>
>>> So one thread was doing the resume while another thread was doing the
>>> check at the same time.
>>> I added this code at line 185 to see the different values of
>>> frames.size():
>>>     for (int i = 0; i < 10; i++) {
>>>         System.out.println("---frames.size(): " + frames.size());
>>>         Thread.sleep(200);
>>>     }
>>>
>>> When printing out the frames, we could sometimes see one more frame:
>>> ------------------ java.lang.Thread.yield()+-1 in thread instance of
>>> SimulResumerTarg(name='Thread 2', id=109)
>>>
>>>
>>> Shanliang
>>>>
>>>> I looked at this failure before and I couldn't see what was wrong,
>>>> either in the test or in the product.
>>>>
>>>> Erik
>>>>
>>>> Jaroslav Bachorik wrote 3/27/14 4:49 PM:
>>>>> On 27.3.2014 15:49, shanliang wrote:
>>>>>> Hi,
>>>>>>
>>>>>> The call
>>>>>> thr.frames(0, frames.size() - 1);
>>>>>> suffers from a synchronization issue: the size may change after
>>>>>> frames.size() returns.
>>>>>
>>>>> Any idea why there is a synchronization issue? The code seems to be
>>>>> intended to run only when a breakpoint is hit and the target thread
>>>>> is suspended.
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> webrev:
>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-6815126/00/
>>>>>>
>>>>>> bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-6815126
>>>>>>
>>>>>> Shanliang
>>>>>
>>>>
>>>
>>
>