RFR: aarch32: softfp patch for template interpreter

Wed Oct 19 11:41:47 UTC 2016

Hi yangyongyong,

On Mon, 2016-10-17 at 12:53 +0000, yangyongyong wrote:
>  
> Attached is a softfp patch for template interpreter. Two pairs of macros are defined in globalDefinitions_aarch32.hpp to isolate softfp code: HARD_FLOAT_CC/ SOFT_FLOAT_CC and HARD_FLOAT/ SOFT_FLOAT.

Attachments are stripped when sent to the OpenJDK mailing lists. If you want to post patches you should put them on a web site and post a link.

I have put your patch at

http://cr.openjdk.java.net/~enevill/softfp/161014-softfp.hg.diff

so others can review it.

I have looked through the patch and have a few comments.

> HARD_FLOAT_CC/SOFT_FLOAT_CC defines C calling convention. HARD_FLOAT are used to indicate that a platform can take advantage of hardfp instructions regardless of its calling convention, while SOFT_FLOAT means hardfp instructions are not supported or are disallowed to use.
>  
> Only three combinations are possible configurations for ARM platforms:
> 1. HARD_FLOAT/ HARD_FLOAT_CC are defined when __ARM_PCS_VFP is defined.
> 2. HARD_FLOAT/ SOFT_FLOAT_CC are defined when gcc runs with "-mfloat-abi=softfp" option.
> 3. SOFT_FLOAT/ SOFT_FLOAT_CC when gcc runs with "-mfloat-abi=soft" option.

Some instructions on how to build the different variants would be useful.

Should I define -mfloat-abi=softfp or -mfloat-abi=soft in CFLAGS, or CXXFLAGS or should I use the --with-extra-cflags=XXX, or --with-extra-cxxflags=XXX. Should I build with --with-jvm-variant=core, or will it work if I build with --with-jvm-variant=client and then use -Xint to run only the template interpreter.

I can probably work this out myself, as can most readers here, but why should I? If you want people to build/test/review your patch give clear instructions on how to build with it.

Some comments on the patch itself

@@ -1733,6 +1742,18 @@
   if(imm_instr(decode, Rd, Rn, imm, cond, s)) return;
   if(imm_instr(cpart, Rd, Rn, -imm, cond, s)) return;

+  if (Rd == Rn) {
+    // TODO: may use 4 adds here, need optimization.
+    int i;
+    for (i = 0;i < 4;i++) {
+      uint32_t imm8 = (imm & (0xFF << (i * 8)));
+      if (imm8 == 0)
+        continue;
+      bool ret = imm_instr(decode, Rd, Rd, imm8, cond, s);
+      assert(ret, "should be");
+    }
+    return;
+  }


This is patching the following in assembler_aarch32.cpp

void Assembler::add_sub_imm(int decode, Register Rd, Register Rn, int imm,
                            Condition cond, bool s) {

I assume you have added this code because in places you do call add_sub_imm with immediates which cannot be handled in a single instruction and with Rd==Rn.

However, this is not correct if the S bit is set. Eg.

  SUBS	Rd, Rd, #0xcafebabe

does not set the flags in the same way as

  SUBS	Rd, Rd, #0xbe
  SUBS  Rd, Rd, #0xba00
  SUBS	Rd, Rd, #0xfe0000
  SUBS	Rd, Rd, #0xca000000

At the very least this needs an assert that you are not trying to set the flags (ie. s==false). However, I think it would be prefereable for the caller to pass in a temp which can be used in this case as it will generate better code using movw/movt/add. The temp can be option and set to noreg by default.


   //TODO Misc instructions
+  void clz(Register Rd, Register Rm, Condition cond = C_DFLT) {
+    starti;
+    f(cond, 31, 28), f(0b000101101111, 27, 16);
+    rf(Rd, 12), f(0b11110001, 11, 4);
+    rf(Rm, 0);
+  }
+

I can't find any usage of this clz method. Is there meant to be any?

+void swap_register(Register r1, Register r2) {
+  eor(r1, r1, r2);
+  eor(r2, r1, r2);
+  eor(r1, r1, r2);
+}

I dislike this use of eor to swap regs. It is possible to use a temp. mov tmp, r1; mov r1, r2; mov r2, tmp is clearer and may also be more efficient on some implementations.


I am in the process of building and testing this which will take a few days as there are quite a few combinations (hardfp/softfp, core/client, release/debug) and I want to be sure that all combinations still work and nothing has been broken.

It would be really helpful if someone else could help review this patch as well as it is quite a big patch.

All the best,
Ed.