question about calling convention changes
Christian Thalinger
christian.thalinger at oracle.com
Tue Sep 10 11:32:08 PDT 2013
On Sep 9, 2013, at 1:17 PM, "Venkatachalam, Vasanth" <Vasanth.Venkatachalam at amd.com> wrote:
> Thanks for the reply. I was actually about to reply to my own email.
>
> Upon closer inspection, I think the real solution here is to add support for a CallingConvention model that's based on parameters being passed via an arg segment.
>
> Currently Graal supports either a register-based model or a stack-based model, but HSAIL uses a different model altogether. Function params are not passed in registers or on the stack. They are instead passed via an Arg segment in memory, as described in my earlier email below. So rather than trying to get all this to work using the existing approach of stackslots, I'm proposing to add the support for an arg-segment based calling convention.
>
> I think what this will require will be to override HSAILRegisterConfig.getCallingConvention( ) to support this third type of calling convention. The basic idea would be to provide a numbering scheme to keep track of each of the function parameters (instead of using stack slots). The rest of Graal would need to be made aware of these parameters and their locations so that optimization phases don't do anything funny with them.
>
> We'd welcome people's feedback on whether this approach sounds reasonable. And Bharadwaj, I'll examine the PTX files you mentioned to see if I can leverage anything from there.
Maybe adding another calling convention type isn't so bad after all, but I'd need to see some code showing how this would be handled.
The easiest way, I think, would be to use the regular register calling convention and map it to memory slots in the argument segment. You are already doing a lot of parameter-related work in HSAILBackend.emitCode. Since you have to load the parameters into registers anyway, you can use the allocated registers right away.
-- Chris
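
A minimal sketch of the mapping suggested above, assuming nothing about the real Graal classes: parameters get registers in order, as a register-based convention would assign them, and each one is paired with the arg-segment slot it has to be loaded from. The names ArgSegmentSketch, ParamLocation and assign, the "%_arg" symbol prefix, and the register names in the usage note are all hypothetical and are not Graal or HSAIL API.

```java
// Hypothetical sketch only: none of these types exist in Graal.
final class ArgSegmentSketch {

    /** Pairs one parameter's arg-segment slot with the register it is loaded into. */
    static final class ParamLocation {
        final int argIndex;      // position of the parameter in the signature
        final String argSymbol;  // assumed name of the slot in the arg segment
        final String register;   // register picked by the register-based convention

        ParamLocation(int argIndex, String argSymbol, String register) {
            this.argIndex = argIndex;
            this.argSymbol = argSymbol;
            this.register = register;
        }
    }

    /**
     * Assigns the first paramCount allocatable registers to the parameters in
     * order and records, for each one, the arg-segment slot it comes from.
     */
    static ParamLocation[] assign(int paramCount, String[] allocatable) {
        if (paramCount > allocatable.length) {
            throw new IllegalArgumentException("not enough registers for " + paramCount + " parameters");
        }
        ParamLocation[] locations = new ParamLocation[paramCount];
        for (int i = 0; i < paramCount; i++) {
            locations[i] = new ParamLocation(i, "%_arg" + i, allocatable[i]);
        }
        return locations;
    }
}
```

With this shape, a call such as ArgSegmentSketch.assign(2, new String[] {"r0", "r1", "r2"}) yields one entry per parameter, and the prologue only needs to emit one load per entry from argSymbol into register.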
>
> Vasanth
>
> From: Bharadwaj Yadavalli [mailto:bharadwaj.yadavalli at oracle.com]
> Sent: Monday, September 09, 2013 2:38 PM
> To: Venkatachalam, Vasanth
> Cc: graal-dev at openjdk.java.net
> Subject: Re: question about calling convention changes
>
> Hi Vasanth,
> On 9/9/2013 12:27 PM, Venkatachalam, Vasanth wrote:
>
> I'm looking for some guidance on changes I'm implementing to get my HSAIL backend to use a calling convention that's more aligned with the convention that HSAIL uses.
>
> I have not seen the HSAIL backend implementation. I am assuming that you have defined an AMD64HotSpotGraalRuntime, CallingConvention, register configuration, etc. to represent HSAIL.
>
> <...>
>
> The issues you appear to be addressing during HSAIL codegen seem similar to those that we've been working on for PTX codegen - although, at present, we only emit code for PTX _kernel_ entry prologue (we do not generate PTX _functions_ yet but would expect the basic technique to be similar).
>
>
> As expected, Graal is creating MoveToRegOp nodes in emitPrologue( ) for each of the function params with the above Boolean param set to true.
> However, it isn't generating the proper code for the very last param. What I find is that for the last param, emitCode( ) is calling HSAILMove.move( ) with the isPrologue parameter set to false.
>
> Debugging this issue, I found that somewhere down the road Graal is inserting a MoveToRegOp node (which by default has the isPrologue field set to false) in the chain of calls MoveResolver.insert( ) -> spillMoveFactory.createMove( ). I suspect this extraneously added MoveToRegOp may be the culprit, but I'm not sure how to work around this. Any suggestions?
>
>
> Not sure where that additional MoveToRegOp is coming from, but one place where I encountered such a situation is visitReturn(); I have an overridden PTXLIRGenerator::visitReturn() to handle this.
>
>
> Also,
>
>
> 1) Is my overall approach correct for what I'm trying to do?
>
> I think overriding emitPrologue() to provide the appropriate implementation for HSAIL architecture is the right way to handle the issue.
>
>
>
> 2) In the case of a kernel function, the last parameter has to be treated specially. Instead of generating a load for it as above, we need to generate a workitemabsid instruction. Do people have suggestions on where I can add some logic to intercept the last parameter and treat it specially when emitting code for the prologue?
>
>
> You may find it informative to look at PTXLIRGenerator::emitPrologue() and PTXLIRGenerator::visitReturn().
>
> Hope that helps.
>
> Bharadwaj
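
For the kernel case in question 2), a rough sketch of the prologue logic could look like the following: every parameter gets a load from its arg slot, except the last parameter of a kernel, which gets a workitemabsid instead. PrologueSketch and its emitPrologue method are hypothetical stand-ins and are not the Graal emitPrologue( ); the emitted strings are placeholder text, not real HSAIL syntax.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: this is not the Graal LIR generator API.
final class PrologueSketch {

    /**
     * Returns one placeholder instruction per parameter. For a kernel, the
     * last parameter is not loaded from the arg segment; it is produced by a
     * workitemabsid instruction instead.
     */
    static String[] emitPrologue(String[] paramRegisters, boolean isKernel) {
        List<String> lines = new ArrayList<>();
        int last = paramRegisters.length - 1;
        for (int i = 0; i <= last; i++) {
            if (isKernel && i == last) {
                // Special case: the work-item id replaces the load.
                lines.add("workitemabsid " + paramRegisters[i]);
            } else {
                // Regular case: load parameter i from its arg-segment slot.
                lines.add("load " + paramRegisters[i] + ", arg[" + i + "]");
            }
        }
        return lines.toArray(new String[0]);
    }
}
```

Keeping the special case inside the loop over parameters means the decision is made where the parameter index is known, rather than in a later move-resolution step that no longer knows which value was a parameter.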