RFR(S): 8247762: [aarch64] Timeout in .../HeapDumpTestWithActiveProcess.java due to inf. loop in AARCH64CurrentFrameGuess.run()

Patric Hedlin patric.hedlin at oracle.com
Mon Jul 6 19:54:40 UTC 2020


Thanks Chris, for the review and for laying out the explanation.

Andrew,

Something you can live with?

/Patric

On 2020-07-06 17:52, Chris Plummer wrote:
> On 7/6/20 1:37 AM, Andrew Haley wrote:
>> On 05/07/2020 16:26, Patric Hedlin wrote:
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8247762
>>> Webrev: http://cr.openjdk.java.net/~phedlin/tr8247762/
>>>
>>>
>>> AARCH64CurrentFrameGuess.run() may loop indefinitely in a bad
>>> stack-walk. This is JDK-8231635 applied to AArch64.
>>     Frame oldFrame = frame;
>>     frame = frame.sender(map);
>>     if (frame.getSP().lessThanOrEqual(oldFrame.getSP())) {
>>       // Frame points to itself or to a location in the wrong direction.
>>       // Break the loop and move on to next offset.
>>       if (DEBUG) {
>>         System.out.println("CurrentFrameGuess: frame <= oldFrame: " + frame);
>>       }
>>       break;
>>     }
>>   }
>>
>> OK, that looks like a reasonable thing to do, but I would wonder how
>> the stack got into that mess.
>>
> Hi Patric,
>
> The changes look good to me.
>
> Andrew,
>
> The problem is not the stack per se. AARCH64CurrentFrameGuess.run() 
> tries to find the "current frame". It starts with the specified SP 
> (which I believe comes from the SP register), and validates that it 
> represents the current frame by using it to walk the stack until the 
> first entry frame is found. If it doesn't find it, then it increments 
> SP by a word and tries again. It does this until it either can 
> successfully walk to the first entry frame, or SP leaves the range it 
> is willing to search, at which point it gives up. During this search 
> all manner of bad addresses can be accessed. This is why there is an 
> exception handler that when triggered simply moves on to the next SP 
> to check. So it's not at all surprising that on occasion a bad SP 
> results in frame.sender() pointing to a frame that was already visited.
>
> thanks,
>
> Chris
>
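
For readers following the thread, the search Chris describes can be sketched roughly as below. This is a toy model and not the SA code: the class name FrameGuessSketch, the Map standing in for target memory, the set of entry frames, the 8-byte word size and the use of IllegalStateException for a bad address are all assumptions made for illustration. It only shows the shape of the loop: try a candidate SP, walk senders until an entry frame is found, give up on the candidate when an address is bad or when the sender's SP does not advance, then move on by one word.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not the SA implementation: "memory" maps a frame's SP
// to its sender's SP, and "entryFrames" marks where a successful walk ends.
final class FrameGuessSketch {
    static final long WORD = 8; // assumed word size in bytes

    // Returns the first candidate SP whose walk reaches an entry frame,
    // or -1 once the candidate leaves the search range.
    static long guess(long sp, long searchRange,
                      Map<Long, Long> memory, Set<Long> entryFrames) {
        for (long offset = 0; offset < searchRange; offset += WORD) {
            long candidate = sp + offset;
            try {
                if (walksToEntryFrame(candidate, memory, entryFrames)) {
                    return candidate;
                }
            } catch (IllegalStateException badAddress) {
                // Bad address touched during the walk: try the next candidate SP.
            }
        }
        return -1;
    }

    static boolean walksToEntryFrame(long start,
                                     Map<Long, Long> memory, Set<Long> entryFrames) {
        long frame = start;
        while (!entryFrames.contains(frame)) {
            long oldFrame = frame;
            frame = sender(frame, memory);
            // Addresses are compared as unsigned values.
            if (Long.compareUnsigned(frame, oldFrame) <= 0) {
                // Sender points to itself or in the wrong direction:
                // abandon this candidate instead of looping forever.
                return false;
            }
        }
        return true;
    }

    // Stand-in for frame.sender(map): throws when the address is not mapped.
    static long sender(long frame, Map<Long, Long> memory) {
        Long next = memory.get(frame);
        if (next == null) {
            throw new IllegalStateException(
                "unmapped address 0x" + Long.toHexString(frame));
        }
        return next;
    }
}
```

Without the unsigned comparison in walksToEntryFrame, a candidate whose sender is itself (or any cycle among the visited frames) would keep the while loop spinning, which is the timeout seen in the test. Requiring the SP to strictly increase is enough to guarantee termination in this model, since every step of a walk must then land on a higher address.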



More information about the serviceability-dev mailing list