Vector API Get Implementation patch

Fri Jun 22 20:41:06 UTC 2018

Thanks Vladimir,

regards,
Rahul

From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Friday, June 22, 2018 1:31 PM
To: Kandu, Rahul <rahul.kandu at intel.com>
Cc: panama-dev at openjdk.java.net
Subject: Re: Vector API Get Implementation patch

http://cr.openjdk.java.net/~vdeshpande/webrevGetImplF/webrev.01/

extract1d and extract2f should use vecD as well.

Otherwise, looks good. No need to send new webrev.

Best regards,

Vladimir Ivanov

1) added extract1d, extract2f, extract1l
2) modified the source to vecD from vecX for the following instruct that use 64-bit vector operands as source.
  +instruct extract2i(rRegI dst, vecD src, immI idx) %{
 +instruct extract4s(rRegI dst, vecD src, immI idx) %{
 +instruct extract8b(rRegI dst, vecD src, immI idx) %{
also extract1l uses vecD as source  // instruct extract1l (rRegI dst, vecD src, immI idx)

regards,
Rahul

-----Original Message-----
From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On Behalf Of Kandu, Rahul
Sent: Wednesday, June 20, 2018 6:08 PM
To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com<mailto:vladimir.x.ivanov at oracle.com>>; panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: RE: Vector API Get Implementation patch

Yes, it should be guarded by the "$dst$$reg != $src$$reg".

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, June 20, 2018 5:38 PM
To: Kandu, Rahul <rahul.kandu at intel.com<mailto:rahul.kandu at intel.com>>; panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: Re: Vector API Get Implementation patch

I left those out initially because extract1D/1L is going to be scalar, I can include them.
I think extract1D may look something like this:

Double64Vector/Long64Vector are still there and even though they are 1-element vectors, the operations have to be intrinsified to get reliable vectorbox elimination.

Yeah, a plain move should be enough. Shouldn't it be also guarded by "$dst$$reg != $src$$reg"?

Another observation: shouldn't 64-bit vector operands be represented by VecD and not VecX?

The same applies to:

+instruct extract2i(rRegI dst, vecX src, immI idx) %{

+instruct extract4s(rRegI dst, vecX src, immI idx) %{

+instruct extract8b(rRegI dst, vecX src, immI idx) %{

Best regards,
Vladimir Ivanov

instruct extract1d(regD dst, vecX src, immI idx) %{
  predicate(UseAVX > 0 && n->as_Vector()->length() == 1 && n->bottom_type()->basic_type() == T_DOUBLE);
  match(Set dst (ExtractD src idx));
  ins_encode %{
    int vector_len = 0;
    int midx = 0x1 & $idx$$constant;
    __ vmovdqu($dst$$XMMRegister, $src$$XMMRegister);
  %}
  ins_pipe( pipe_slow );
%}

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, June 20, 2018 3:38 PM
To: Kandu, Rahul <rahul.kandu at intel.com<mailto:rahul.kandu at intel.com>>; panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: Re: Vector API Get Implementation patch

I'll add the extract2f. Thanks for the suggestion. Let me send another webrev shortly with the extract2f.

What about extract1d & extract1l? Is there any reason why they aren't needed?

I'm not quite sure why missing intrinsic(s) are not caught by tests- even so earlier when ExtractI rules were missing. I can find out.

Yes, please.

Best regards,
Vladimir Ivanov

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, June 20, 2018 2:54 PM
To: Kandu, Rahul <rahul.kandu at intel.com<mailto:rahul.kandu at intel.com>>; panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: Re: Vector API Get Implementation patch

Rahul,

Please find the updated patch for the Get Implementation
http://cr.openjdk.java.net/~vdeshpande/webrevGetImplF620/webrev/

in this patch, I made the following modifications.

1. fixed AVX-512 rules with predicate "UseAVX >2"
2. added ExtractI rules
3. fixed tabulation issues in the instructs in the x86.ad file.
4. Includes all test types- all tests passed, including Int128/256/512 etc.
5. removed extract4b rule

Thanks. I noticed that some 64-bit variants are missing as well while JDK implementation still calls the intrinsic [1]:

    extract1d, extract2f, extract1l

Do you have an idea why the tests don't catch that? I believe intinsics don't kick in, but I don't see why (just by staring at the code).

Otherwise, looks good.

This patch covers for AVX rules.

If its fine, I can add the Non-AVX rules (UseAVX==0 && UseSSE>3) in a patch for all the types and submit it separately.

Sounds good.

Best regards,
Vladimir Ivanov

[1]
src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double64Vector.java:

      public double get(int i) {
          if (i < 0 || i >= LENGTH) {
              throw new IllegalArgumentException("Index " + i + " must be zero or positive, and less than " + LENGTH);
          }
          long bits = (long) VectorIntrinsics.extract(
                                  Double64Vector.class, double.class, LENGTH,
                                  this, i,
                                  (vec, ix) -> {
                                      double[] vecarr = vec.getElements();
                                      return (long)Double.doubleToLongBits(vecarr[ix]);
                                  });
          return Double.longBitsToDouble(bits);
      }

-----Original Message-----
From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On
Behalf Of Kandu, Rahul
Sent: Monday, June 18, 2018 8:58 AM
To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com<mailto:vladimir.x.ivanov at oracle.com>>;
panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: RE: Vector API Get Implementation patch

Vladimir,

Thanks for your input. I'll upload the updated patch shortly.

Rahul

-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Thursday, June 14, 2018 9:44 AM
To: Kandu, Rahul <rahul.kandu at intel.com<mailto:rahul.kandu at intel.com>>;
panama-dev at openjdk.java.net<mailto:panama-dev at openjdk.java.net>
Subject: Re: Vector API Get Implementation patch

Rahul,

Some comments/questions/ suggestions:

     (1) I don't see any non-AVX rules. Do you plan to add them later?

     (2) No matcher rules for ExtractI as well. Same question: left for later?

     (3) AVX-512 rules should be guarded by "UseAVX > 2" predicate:

+instruct extract32s(rRegI dst, vecZ src, vecZ tmp, immI idx) %{
+  predicate(UseAVX > 0 && n->as_Vector()->length() == 32 &&
n->bottom_type()->basic_type() == T_SHORT);

     (4) Is extract4b ever used? I don't see any 32-bit vectors present.

+instruct extract4b(rRegI dst, vecX src, immI idx) %{
+  predicate(UseAVX > 0 && n->as_Vector()->length() == 4 &&
n->bottom_type()->basic_type() == T_BYTE);

      (5) I suggest to migrate to immU? (recently introduced) and
get rid of index masking (see [1] for examples):
      +    int midx = 0x3F & $idx$$constant;

Applying similar refactorings to rvinsert64B, the following:

+instruct extract64b(rRegI dst, vecZ src, vecZ tmp, immI idx) %{
...
+    int midx = 0x3F & $idx$$constant;
+    if (midx == 0) {
+      __ movdl($dst$$Register, $src$$XMMRegister);
+          __ movsbl($dst$$Register, $dst$$Register);
+    }
+    else if (midx >= 1 && midx <= 15) {
+      __ pextrb($dst$$Register, $src$$XMMRegister, midx);
+          __ movsbl($dst$$Register, $dst$$Register);
+    }
+    else {
+      int extr_idx1 = midx / 16;
+      int extr_idx2 = midx % 16;
+      __ vextracti32x4($tmp$$XMMRegister, $src$$XMMRegister, extr_idx1);
+      __ pextrb($dst$$Register, $tmp$$XMMRegister, extr_idx2);
+          __ movsbl($dst$$Register, $dst$$Register);
+    }

can be converted to into:

+instruct extract64b(rRegI dst, vecZ src, vecZ tmp, immU6 idx) %{
...
+    uint x_idx = $idx$$constant & right_n_bits(4);
+    uint z_idx = ($idx$$constant >> 4) & 2;
+    if (z_idx == 0) {
+      if (x_idx == 0) {
+        __ movdl($dst$$Register, $src$$XMMRegister);
+      } else {
+        __ pextrb($dst$$Register, $src$$XMMRegister, x_idx);
+      }
+    } else { // z_idx > 0
+      __ vextracti32x4($tmp$$XMMRegister, $src$$XMMRegister, z_idx);
+      __ pextrb($dst$$Register, $tmp$$XMMRegister, x_idx);
+    }
+    __ movsbl($dst$$Register, $dst$$Register);

or even:

+    uint x_idx = $idx$$constant & right_n_bits(4);
+    uint z_idx = ($idx$$constant >> 4) & 2;
+    Register r_i32 = $src$$XMMRegister;
+    if (z_idx > 0) {
+      r_i32 = $tmp$$XMMRegister;
+      __ vextracti32x4(r_i32, $src$$XMMRegister, z_idx);
+    }
+    if (x_idx == 0) {
+      __ movdl($dst$$Register, r_i32);
+    } else {
+      __ pextrb($dst$$Register, r_i32, x_idx);
+    }
+    __ movsbl($dst$$Register, $dst$$Register);

Also, please, fix code formatting. Many lines have wrong indentation.

Best regards,
Vladimir Ivanov

[1] http://hg.openjdk.java.net/panama/dev/rev/e00750ed585a

On 14/06/2018 03:10, Kandu, Rahul wrote:
Hi All,

Please find the Get Implementation patch for the Vector API. The following patch also includes Tests for Get operations.

http://cr.openjdk.java.net/~rlupusoru/panama/webrevrGetImplF/

regards,
Rahul