RFR(L): 8032410: compiler/uncommontrap/TestStackBangRbp.java times out on Solaris-Sparc V9
Roland Westrelin
roland.westrelin at oracle.com
Mon Apr 14 13:27:17 UTC 2014
Here is a new webrev that implements Vladimir’s suggestion (use max stack in interpreter frame size computation):
http://cr.openjdk.java.net/~roland/8032410/webrev.04/
The diff from the previous webrev:
http://cr.openjdk.java.net/~roland/8032410/webrev.03-04/
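For anyone following the discussion quoted below, here is a minimal sketch that replays steps 1) to 4) of the first example, with and without max stack counted in the banged frame size. It is a toy model, not HotSpot code: every name in it (kPageSize, kStackShadowPages, page_of, interpreter_bang, compiled_bang, first_skipped_page, replay_first_example) and every frame size is made up for illustration, and addresses grow in the direction of stack growth, matching the "sp + N*page" notation used in the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical model parameters (assumptions, not taken from the webrev).
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kStackShadowPages = 2;

using PageSet = std::set<std::size_t>;

inline std::size_t page_of(std::size_t addr) { return addr / kPageSize; }

// Interpreter: the frame is already allocated, so the page holding sp is
// in use; then bang sp + 1*page ... sp + StackShadowPages*page.
inline void interpreter_bang(std::size_t sp, PageSet& touched) {
  touched.insert(page_of(sp));
  for (std::size_t i = 1; i <= kStackShadowPages; ++i) {
    touched.insert(page_of(sp + i * kPageSize));
  }
}

// Compiled code: bang sp + StackShadowPages*page and the next
// frame_size/page_size pages, if any.
inline void compiled_bang(std::size_t sp, std::size_t frame_size,
                          PageSet& touched) {
  touched.insert(page_of(sp));
  for (std::size_t k = 0; k <= frame_size / kPageSize; ++k) {
    touched.insert(page_of(sp + (kStackShadowPages + k) * kPageSize));
  }
}

// Returns the first page skipped between the lowest and highest touched
// pages, or 0 if every page in between was touched.
inline std::size_t first_skipped_page(const PageSet& touched) {
  assert(!touched.empty());
  for (std::size_t p = *touched.begin(); p < *touched.rbegin(); ++p) {
    if (touched.count(p) == 0) return p;
  }
  return 0;
}

// Replays the first example with P = page 10.
inline std::size_t replay_first_example(bool use_max_stack) {
  const std::size_t comp_frame = 64;      // compiled frame size (made up)
  const std::size_t interp_frame = 4000;  // just below one page
  const std::size_t max_stack = 200;      // room for pushed arguments (made up)
  PageSet touched;
  std::size_t sp = 10 * kPageSize + 4000;  // in P, right before the boundary
  interpreter_bang(sp, touched);           // 1) bangs P+1 and P+2
  const std::size_t banged_size =
      use_max_stack ? std::max(comp_frame, interp_frame + max_stack)
                    : comp_frame;
  compiled_bang(sp, banged_size, touched);  // 2) P+2, plus P+3 with max stack
  sp += interp_frame;                       // 3) deopt: sp is now in P+1
  sp += max_stack;                          // 4) push arguments: sp is in P+2
  compiled_bang(sp, comp_frame, touched);   //    the callee bangs P+4
  return first_skipped_page(touched);
}
```

In this model, replay_first_example(false) reports page 13 (that is, P+3) as skipped, while replay_first_example(true) reports no skipped page, because the compiled method entered in step 2) already banged P+3.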
Roland.
On Apr 10, 2014, at 10:26 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> Hi Vladimir,
>
>>>> I tried to point this out during review ;) I thought using (<= StackShadowPages) from the 8026775 changes should touch it.
>>>
>>> Sorry Vladimir. I didn’t realize there could be a problem even before the end of the stack is reached. I should have read 8026775 more carefully.
>>>
>>>> Can you spend some time and write down in the bug report all the places where we do a stack bang and how many pages we bang, so we can see the whole picture?
>>>>
>>>> I think we should bang all sequential pages and do the same in all places. Banging StackShadowPages or StackShadowPages+1 is secondary if we do the same in all places.
>>>
>>> Ready for some headaches?
>>>
>>> The interpreter: allocates the current frame and then stack bangs all pages at sp+1*page … sp+StackShadowPages*page included
>>> The compiler bangs (before my change): sp + StackShadowPages*page and the next frame_size/page_size pages
>>> In the deopt blob we bang: sp (once the compiled frame is popped) +1 page … sp+(StackShadowPages+1)*page and the next frame_size/page_size pages
>>>
>>> I talked with Mikael and the reason we bang up to (StackShadowPages+1)*page in the deopt blob is because in the interpreter, banging happens once the frame is set up. So banging up to StackShadowPages*page in the deopt blob with no frame pushed doesn’t bang as far as the interpreter would.
>>
>> So far I am following :)
>>
>>>
>>> Let’s take an example with my change (no banging in the deopt blob), assuming the compiler bangs at sp+StackShadowPages*page. I think something like this is possible:
>>> Let’s say StackShadowPages=2
>>>
>>> 1) SP points in page P. We enter an interpreted frame. The frame is allocated. SP is still in P. The interpreter bangs P+1 and P+2.
>>
>> Do you mean when the frame is small we stay on the same page after the frame is allocated? Okay.
>
> Yes.
>
>>
>>> 2) The interpreter calls a compiled method. The compiled method is entered with SP still in P but right before the boundary with the next page. The compiler bangs P+2.
>>
>> Could you remind me about your change? Does compiled code bang the whole range from min(interpr_frame_size, comp_frame_size) to max(interpr_frame_size, comp_frame_size) plus StackShadowPages? Or only (max + StackShadowPages)?
>
> It bangs pages at sp + StackShadowPages*page_size and the next max(interpr_frame_size, comp_frame_size)/page_size pages, if any. Before my change, the compiled code banged at sp + StackShadowPages*page_size and the next comp_frame_size/page_size pages, if any.
>
>>
>>> 3) We deoptimize. We pop the frame. SP is in P right before the page boundary. The method has a lot of locals and the interpreter frame size is just below 1 page. After deoptimization SP is in P+1 right before the boundary.
>>
>> So SP is the same as on entry to the compiled method after the frame pop?
>
> Yes.
>
>> Do we touch all stack slots when we reconstruct the interpreter frame during deoptimization? I'm asking about the case where the last slots are in the next page and we don't touch it.
>
> I assume we do.
>
>>
>>> 4) We’re at a call, push some arguments and SP moves to P+2 and we call a compiled method. The compiled method bangs P+4. P+3 was never touched.
>>
>> The method's max_stack should take into account the space for output arguments. It needs to be taken into account when we bang in compiled code. In 2), compiled code should have banged P+2 and P+3.
>
> Ok. That would work indeed.
>
>>>
>>> Had the compiler banged at P+3 (StackShadowPages+1) in 2), there would be no problem in that example. But then consider another example with my change, where the compiler bangs at sp+(StackShadowPages+1)*page. Let’s say StackShadowPages=2.
>>>
>>> 1) SP points in page P. We enter an interpreted frame. The frame is allocated. SP is still in P but right before the page boundary. The interpreter bangs P+1 and P+2.
>>> 2) The interpreter pushes some arguments and we are now in P+1 and calls a compiled method. The compiler bangs P+4. P+3 was never touched.
>>>
>>> So that doesn’t work either.
>>>
>>> Wishing we had a whiteboard again? ;-)
>>
>> Yes and yes!
>>
>>>
>>> Maybe the solution is for the compiler to bang at sp + StackShadowPages*page + (interpreter_frame_size % page) and the next interpreter_frame_size/page_size pages. That would mimic what the interpreter does and would work in both examples above, I think. interpreter_frame_size would have to exclude what’s on the expression stack of the top frame to be as close as possible to the interpreter behavior.
>>
>> I don't see how you can determine "next interpreter_frame_size/page_size pages"
>>
>> As I said before, if compiled code takes the max stack size into account, then the first solution should work, I think.
>
> Let me try that. Thanks!
>
> Roland.
>
>>
>> Thanks,
>> Vladimir
>>
>>
>>>
>>> Roland.
>>>
>>>
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 4/2/14 1:46 AM, Roland Westrelin wrote:
>>>>>> The question is why you got EXCEPTION_ACCESS_VIOLATION for a normal stack bang. Maybe it is 8026775 again, where one page is skipped during banging. Windows requires pages to be touched sequentially.
>>>>>
>>>>> I wasn’t aware of this requirement on Windows. Thanks, Vladimir.
>>>>> The interpreter bangs up to and including sp + StackShadowPages while the compiled code, with this change, bangs at sp + StackShadowPages + 1. So a page can be skipped, and we cannot guarantee that all pages are touched sequentially. So we either have to go back to banging at sp + StackShadowPages for the compiled code or enable, on 32 bit, the code in the signal handler that I pointed to. What do you think?
>>>>>
>>>>> Roland.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 4/1/14 8:00 AM, Roland Westrelin wrote:
>>>>>>> I tried to push that change and couldn’t because of a crash on Windows 32 bit. The VM crashes at a stack banging instruction in compiled code but the sp looks to be perfectly valid (not in the yellow zone or red zone, within the stack bounds). I noticed this code in the Windows signal handler:
>>>>>>>
>>>>>>> #ifdef _WIN64
>>>>>>>   //
>>>>>>>   // If it's a legal stack address map the entire region in
>>>>>>>   //
>>>>>>>   PEXCEPTION_RECORD exceptionRecord = exceptionInfo->ExceptionRecord;
>>>>>>>   address addr = (address) exceptionRecord->ExceptionInformation[1];
>>>>>>>   if (addr > thread->stack_yellow_zone_base() && addr < thread->stack_base() ) {
>>>>>>>     addr = (address)((uintptr_t)addr &
>>>>>>>            (~((uintptr_t)os::vm_page_size() - (uintptr_t)1)));
>>>>>>>     os::commit_memory((char *)addr, thread->stack_base() - addr,
>>>>>>>                       !ExecMem);
>>>>>>>     return EXCEPTION_CONTINUE_EXECUTION;
>>>>>>>   }
>>>>>>>   else
>>>>>>> #endif
>>>>>>>
>>>>>>> If I enable it on 32 bit, the jprt tests pass. Does anybody know why this is needed? Why is this WIN64 only?
>>>>>>>
>>>>>>> Roland.