question about calling convention changes
Venkatachalam, Vasanth
Vasanth.Venkatachalam at amd.com
Mon Sep 9 09:27:50 PDT 2013
Hi,
I'm looking for some guidance on changes I'm implementing to get my HSAIL backend to use a calling convention that's more aligned with the convention that HSAIL uses.
In HSAIL, function parameters are passed in memory (in an Arg segment) instead of in registers. In the prologue of the function, the callee then loads these arguments from the Arg segment into registers. The code of the function prologue would look like this:
function &bar (arg_s32 %_result) (
arg_s32 %_arg0,
arg_s32 %_arg1
) {
ld_arg_s32 $s0, [%_arg0]; //load arguments into registers.
ld_arg_s32 $s1, [%_arg1];
To put things in context, the function would be called like this:
{
arg_s32 %_methodArg_0; //declare memory for the first param
arg_s32 %_methodArg_1; //declare memory for the second param
arg_s32 %_outVal; //declare memory for the result
st_arg_s32 $s0, [%_methodArg_0]; //store to the memory location for the first param
st_arg_s32 $s1, [%_methodArg_1]; //store to the memory location for the second param
call &bar (%_outVal) (%_methodArg_0, %_methodArg_1); //call the function.
ld_arg_s32 $s0, [%_outVal]; //load the result.
}
As a first step to have Graal emit the proper code in the function prologue, I switched it to use a stack-based calling convention instead of a register-based one. To do this I modified getHSAILCompilationResult to call registerConfig.getCallingConvention( ...Boolean stackOnly) with the stackOnly parameter set to true. When I first made this change, Graal was generating the above loads from the stack instead of from the Arg segment. So I added extra logic to intercept the loads that happen in the function prologue and treat them specially so that they load from the Arg segment instead of the stack.
To do this, I overrode LIRGenerator.emitPrologue() inside HSAILLIRGenerator so that it passes a Boolean param isPrologue to the emitMove( ) function, which in turn invokes the MoveToRegOp() constructor with the same param, indicating that the move belongs to the prologue. I then made some changes in HSAILMove.emitCode( ) so that it passes this Boolean arg to the code generation routine HSAILMove.move (...Boolean isPrologue ), so that these moves result in different codegen.
As expected, Graal is creating MoveToRegOp nodes in emitPrologue( ) for each of the function params with the above Boolean param set to true.
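To make the flag plumbing concrete, here is a minimal sketch of the idea. It uses simplified stand-in classes rather than the real Graal types: the class name PrologueMoveSketch, the String-based operands, and the ld_spill_s32 mnemonic for the stack case are all assumptions made for illustration only.

```java
// Simplified stand-ins for the backend classes; these are NOT the real Graal types.
final class PrologueMoveSketch {

    /** Stand-in for MoveToRegOp: remembers whether it was created in the prologue. */
    static final class MoveToRegOp {
        final String destReg;
        final String source;
        final boolean isPrologue;

        MoveToRegOp(String destReg, String source, boolean isPrologue) {
            this.destReg = destReg;
            this.source = source;
            this.isPrologue = isPrologue;
        }

        /** Stand-in for emitCode(): forwards the flag to the codegen routine. */
        String emitCode() {
            return move(destReg, source, isPrologue);
        }
    }

    /** Stand-in for HSAILMove.move(): picks the segment based on the flag. */
    static String move(String destReg, String source, boolean isPrologue) {
        if (isPrologue) {
            // Prologue moves load from the Arg segment.
            return "ld_arg_s32 " + destReg + ", [" + source + "];";
        }
        // Assumption: placeholder mnemonic for an ordinary stack-slot load.
        return "ld_spill_s32 " + destReg + ", [" + source + "];";
    }

    /** Stand-in for emitMove(): passes the flag to the MoveToRegOp constructor. */
    static MoveToRegOp emitMove(String destReg, String source, boolean isPrologue) {
        return new MoveToRegOp(destReg, source, isPrologue);
    }

    public static void main(String[] args) {
        System.out.println(emitMove("$s0", "%_arg0", true).emitCode());
        System.out.println(emitMove("$s1", "%slot0", false).emitCode());
    }
}
```

The point of the sketch is only that the flag is fixed when the op is constructed, so every path that constructs a move has to supply it.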
However, it isn't generating the proper code for the very last param. What I find is that for the last param, emitCode( ) is calling HSAILMove.move( ) with the isPrologue parameter set to false.
Debugging this issue, I found that somewhere down the road Graal inserts a MoveToRegOp node (which by default has the isPrologue field set to false) in the chain of calls MoveResolver.insert( ) -> spillMoveFactory.createMove( ). I suspect this extraneously added MoveToRegOp may be the culprit, but I'm not sure how to work around this. Any suggestions?
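The suspected failure mode can be reproduced in miniature. The following is a sketch under stated assumptions, not the actual Graal code: SpillMoveSketch, the two-argument constructor, and the String operands are hypothetical, and the factory interface only mimics the shape of spillMoveFactory.createMove( ).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how a factory-created move can lose the prologue marker.
final class SpillMoveSketch {

    static final class MoveToRegOp {
        final String dest;
        final String src;
        final boolean isPrologue;

        // Default constructor path: the flag silently becomes false.
        MoveToRegOp(String dest, String src) {
            this(dest, src, false);
        }

        MoveToRegOp(String dest, String src, boolean isPrologue) {
            this.dest = dest;
            this.src = src;
            this.isPrologue = isPrologue;
        }
    }

    /** Mimics a spill-move factory that knows nothing about prologues. */
    interface SpillMoveFactory {
        MoveToRegOp createMove(String dest, String src);
    }

    static List<MoveToRegOp> prologueWithResolverMove() {
        List<MoveToRegOp> ops = new ArrayList<>();
        // Move created by the overridden prologue code: flag is set.
        ops.add(new MoveToRegOp("$s0", "%_arg0", true));
        // Move created later through the factory: flag falls back to false.
        SpillMoveFactory factory = (dest, src) -> new MoveToRegOp(dest, src);
        ops.add(factory.createMove("$s1", "%_arg1"));
        return ops;
    }

    public static void main(String[] args) {
        for (MoveToRegOp op : prologueWithResolverMove()) {
            System.out.println(op.dest + " isPrologue=" + op.isPrologue);
        }
    }
}
```

In this model the second move is emitted as an ordinary move even though its source is a function parameter, which matches the symptom I am seeing for the last param.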
Also,
1) Is my overall approach correct for what I'm trying to do?
2) In the case of a kernel function, the last parameter has to be treated specially. Instead of generating a load for it as above, we need to generate a workitemabsid instruction. Do people have suggestions on where I can add some logic to intercept the last parameter and treat it specially when emitting code for the prologue?
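For question 2, the output I am after can be sketched as follows. This is only an illustration of the desired prologue, assuming a hypothetical emitPrologue helper that works on parameter names; the u32 type and the dimension operand 0 on workitemabsid are assumptions, and none of this is existing Graal code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the desired prologue text for functions and kernels.
final class KernelPrologueSketch {

    static List<String> emitPrologue(List<String> params, boolean isKernel) {
        List<String> code = new ArrayList<>();
        for (int i = 0; i < params.size(); i++) {
            String reg = "$s" + i;
            boolean isLast = (i == params.size() - 1);
            if (isKernel && isLast) {
                // Kernel case: the last parameter comes from the work-item id,
                // not from the Arg segment (operand form is an assumption).
                code.add("workitemabsid_u32 " + reg + ", 0;");
            } else {
                // Ordinary case: load the parameter from the Arg segment.
                code.add("ld_arg_s32 " + reg + ", [" + params.get(i) + "];");
            }
        }
        return code;
    }

    public static void main(String[] args) {
        List<String> params = Arrays.asList("%_arg0", "%_arg1");
        for (String line : emitPrologue(params, true)) {
            System.out.println(line);
        }
    }
}
```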
Vasanth