[aarch64-port-dev ] Behaviour of small aarch64 simulator vs real hardware wrt fcvtzs

Tue Nov 5 09:35:35 PST 2013

After some of the FP tests failed on C2 I started looking at the
documentation of the various float to int conversion routines and
comparing it with our small aarch64 sim code.

The docs say:

The instructions raise the Invalid Operation exception (FPSR.IOC) in
response to a floating point input of NaN, Infinity, or a numerical
value that cannot be represented within the destination register. An out
of range fixed-point result will also be saturated to the destination
size. A numeric result which differs from the input will raise the
Inexact exception (FPSR.IXC). When flush-to-zero mode is enabled a
denormal input will be replaced by a zero and will raise the Input
Denormal exception (FPSR.IDC).
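
One literal reading of that paragraph can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the simulator code and not a statement of what the hardware does: the function name, the result struct, and the flag constants are hypothetical, and the value returned for a NaN input is a placeholder (0) because the quoted text does not say what it should be. The sketch also assumes a saturated result raises IOC only.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical flag masks for this sketch; not the simulator's FPSRRegister names.
constexpr unsigned IOC = 1u << 0;  // Invalid Operation
constexpr unsigned IXC = 1u << 4;  // Inexact
constexpr unsigned IDC = 1u << 7;  // Input Denormal

struct Converted {
  int32_t value;
  unsigned flags;
};

// Float to signed 32 bit conversion, rounding toward zero, following the
// quoted paragraph as literally as possible.
Converted fcvtzs_reading(float f, bool flush_to_zero) {
  if (std::isnan(f)) {
    // ASSUMPTION: the quoted text does not give the NaN result; 0 is a placeholder.
    return {0, IOC};
  }
  if (flush_to_zero && std::fpclassify(f) == FP_SUBNORMAL) {
    // Denormal input replaced by zero when flush-to-zero is enabled.
    return {0, IDC};
  }
  if (f >= 2147483648.0f) {  // 2^31 and above, including +Infinity
    return {std::numeric_limits<int32_t>::max(), IOC};
  }
  if (f < -2147483648.0f) {  // below -2^31, including -Infinity
    return {std::numeric_limits<int32_t>::min(), IOC};
  }
  int32_t v = static_cast<int32_t>(f);  // in range, so the cast truncates safely
  unsigned flags = (static_cast<float>(v) != f) ? IXC : 0u;
  return {v, flags};
}
```

Under this reading a subnormal input with flush-to-zero disabled simply truncates to 0 and raises IXC, which is not what the sim code below does for FP_SUBNORMAL.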

Our sim method which implements "fcvtzs Rd, Sn" is as follows:

void AArch64Simulator::fcvtszs32()
{
  // instr[9,5] = sn
  // instr[4,0] = rd
  // TODO : check that this rounds toward zero
  float f = sreg(5);
  int32_t value;
  // check for FP exception conditions
  // NaN raises IO
  // Infinity raises IO
  // out of range raises IO and IX and saturates value
  // denormal raises ID and IX and sets to zero
  switch (fpclassify(f)) {
  case FP_INFINITE:
  case FP_NAN:
    cpuState.setFPSR(FPSRRegister::IO_IDX);
    if (signbit(f)) {
      value = INT_MAX;
    } else {
      value = INT_MIN;
    }
    break;
  case FP_NORMAL:
    if (f >= FLOAT_INT_MAX) {
      cpuState.setFPSRBits(FPSRRegister::IO | FPSRRegister::IX,
                           FPSRRegister::IO | FPSRRegister::IX);
      value = INT_MAX;
    } else if (f <= FLOAT_INT_MIN) {
      cpuState.setFPSRBits(FPSRRegister::IO | FPSRRegister::IX,
                           FPSRRegister::IO | FPSRRegister::IX);
      value = INT_MIN;
    } else {
      value = (int32_t)f;
    }
    break;
  case FP_SUBNORMAL:
    cpuState.setFPSRBits(FPSRRegister::IO | FPSRRegister::IX | FPSRRegister::ID,
                         FPSRRegister::IO | FPSRRegister::IX | FPSRRegister::ID);
    value = 0;
    break;
  case FP_ZERO:
    value = 0;
    break;
  }
  // avoid sign extension to 64 bit;
  xreg(0, NO_SP) = (u_int32_t)value;
}

The other 3 conversions follow the same pattern.

Now, clearly, the test of the sign bit for case FP_INFINITE is borked
and should be as follows

    . . .
    if (signbit(f)) {
      value = INT_MIN;
    } else {
      value = INT_MAX;
    }
    . . .

I say that this is wrong for case FP_INFINITE and not for both cases
because I am not clear what the hardware is supposed to do for case
FP_NAN. When converting a NaN should the output value be saturated to
INT_MIN/MAX, depending on the sign of the NaN? Or should it be set to zero?

If the latter alternative is the expected behaviour then this is exactly
what Java wants. That would mean we don't need to check the msr for fp
exceptions when we plant code for f2i, d2i etc. We can just execute
"fcvtzs Rd, Sn" and be done with it.
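
For reference, the f2i result Java asks for can be written down as a small C++ function: NaN converts to 0, values too large or too small saturate to INT_MAX or INT_MIN, and everything else rounds toward zero. This is a sketch of the Java narrowing rule only; the function name java_f2i is hypothetical and the code says nothing about what fcvtzs actually returns or which FP flags it sets.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Sketch of Java's float-to-int narrowing conversion (f2i).
// java_f2i is a hypothetical name used only for this illustration.
int32_t java_f2i(float f) {
  if (std::isnan(f)) {
    return 0;  // Java: NaN converts to 0
  }
  if (f >= 2147483648.0f) {  // 2^31 and above, including +Infinity
    return std::numeric_limits<int32_t>::max();
  }
  if (f <= -2147483648.0f) {  // -2^31 and below, including -Infinity
    return std::numeric_limits<int32_t>::min();
  }
  // In range: the C++ cast truncates toward zero, matching Java.
  return static_cast<int32_t>(f);
}
```

If the hardware result for every input matches this function, a bare "fcvtzs Rd, Sn" is enough for f2i; any input where it differs would need a fix-up path in the generated code.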

Does anyone have any input regarding the correctness of the other cases?
In particular, are the correct FP flags being set in all cases?

regards,

Andrew Dinn
-----------