addExact intrinsic

Tue Dec 10 06:57:14 PST 2013

On 10/12/13 11:46, Andrew Haley wrote:
> I've been looking at addExact for AArch64, and I think I've got it
> right, but it looks rather odd.
> 
> As far as I can see the output of addExact is fixed to a named
> register and to the FlagsReg.  Matcher uses these functions to match
> the output of addExact:
> 
> const RegMask Matcher::mathExactI_result_proj_mask() {
>   return R0_REG_mask();
> }
> 
> const RegMask Matcher::mathExactI_flags_proj_mask() {
>   return INT_FLAGS_mask();
> }
> 
> And these are the instruction patterns:
> 
> instruct addExactI_reg(iRegIorL2I src1, iRegIorL2I src2, rFlagsReg cr)
> %{
>   match(AddExactI src1 src2);
>   effect(DEF cr);
> 
>   format %{ "addsw    r0, $src1, $src2\t# addExact int" %}
>   ins_encode %{
>     __ addsw(r0,
>             as_Register($src1$$reg),
>             as_Register($src2$$reg));
>   %}
>   ins_pipe(pipe_class_default);
> %}
> 
> instruct addExactI_reg_imm(iRegIorL2I src1, immI src2, rFlagsReg cr)
> %{
>   match(AddExactI src1 src2);
>   effect(DEF cr);
> 
>   format %{ "addsw    r0, $src1, $src2\t# addExact int" %}
>   ins_encode %{
>     __ addsw(r0,
>             as_Register($src1$$reg),
>             as_Register($src2$$constant));
>   %}
>   ins_pipe(pipe_class_default);
> %}
> 
> Is that reasonable?

Well, it looks correct to me but I think it could probably be improved :-)

The ideal graph constructed by LibraryCallKit::inline_math_mathExact
makes it clear how this works. That method feeds the output of the
AddExactI node through a pair of proj nodes, with projections whose
possible output register assignments are defined by the two masks
mathExactI_result_proj_mask() and mathExactI_flags_proj_mask() (see
method AddExactINode::match in mathexactnode.cpp to identify wher ethe
masks are used)

Reverting back to inline_math_mathExact, the CR register is projected
into a bool test with condition 'overflow' and the result register is
projected into the result output. The Bool's IfTrue feeds into a
re-executable uncommon_trap with reason Deoptimization::Reason_intrinsic
and action none. The IfFalse drives control to set_result (using the
value from the AddExactI output projection).

The Matcher mask methods defined above ensure that the result can only
be allocated as R0 and the CR output as an int flags register. So, with
this definition when trying to satisfy the register allocation for the
AddExact node the the chaitin code will spill any active value in R0
before the AddExact code executes and only allow R0 to be used to define
the allocation for the downstream result node.

It would be better for AArch64 if we allowed this result to go into any
of the normal (i.e. iRegINoSp) integer registers managed by the allocator.

I think we can achieve this if we change the result mask function to

  const RegMask Matcher::mathExactI_result_proj_mask() {
    return _NO_SPECIAL_REG32_mask;
  }

and then use the following rules

  instruct addExactI_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I
src2, rFlagsReg cr)
  %{
    match(Set dst (AddExactI src1 src2));
    effect(DEF cr);

    format %{ "addsw    $dst, $src1, $src2\t# addExact int" %}
    ins_encode %{
      __ addsw(as_Register($dst$$reg),
               as_Register($src1$$reg),
               as_Register($src2$$reg));
    %}
    ins_pipe(pipe_class_default);
  %}

  instruct addExactI_reg_imm(iRegIorL2I src1, immI src2, rFlagsReg cr)
  %{
    match(AddExactI src1 src2);
    effect(DEF cr);

    format %{ "addsw    $dst, $src1, $src2\t# addExact int" %}
    ins_encode %{
      __ addsw(as_Register($dst$$reg),
              as_Register($src1$$reg),
              as_Register($src2$$constant));
    %}
    ins_pipe(pipe_class_default);
  %}

This ought to be legitimate but it would be good to hear confirmation
from someone else on this list.

regards,

Andrew Dinn
-----------