[aarch64-port-dev ] [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics, approval request

Thu Feb 28 06:45:09 UTC 2019

Hi Vladimir, Jatin and All,

> So I have question for aarch64 developers. Are aarch64 fmin/fmax
> instructions are always faster than code generated by default? If this is true
> new conditions should be x86 specific. To have a separate function to do
> these checks. We have precedent - clear_upper_avx(). May be later we have
> to add other conditions for other platforms too.

I am the author of the original AArch64 fmin/fmax intrinsics patch [1], but not a reviewer.

Both Andrew Haley and I have tested the performance of the AArch64 fmin/fmax instructions before. As far as I can remember, the results were similar to what we have seen here on x86. When selecting the min/max values from an array of random numbers, the fmin/fmax instructions perform better. For an already (almost) sorted array, they do make performance worse, though not by much. So personally I think adding the heuristic in shared code would benefit AArch64 as well.
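
To make the two input shapes concrete, here is a minimal sketch of the kind of workload I mean. It is only an illustration, not the benchmark that was actually run: the class and method names are hypothetical, and a real measurement would need a proper harness such as JMH.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration of the two input shapes discussed above.
final class MinMaxShapes {
    // Select the maximum of an array with Math.max. The running
    // maximum 'm' is carried from one iteration to the next.
    static double maxOf(double[] a) {
        double m = Double.NEGATIVE_INFINITY;
        for (double v : a) {
            m = Math.max(m, v);
        }
        return m;
    }

    // Case 1: an array of random numbers.
    static double[] randomInput(int n, long seed) {
        Random r = new Random(seed);
        double[] a = new double[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextDouble();
        }
        return a;
    }

    // Case 2: the same values, already sorted in ascending order.
    static double[] sortedInput(int n, long seed) {
        double[] a = randomInput(n, seed);
        Arrays.sort(a);
        return a;
    }
}
```

Calling maxOf(randomInput(n, seed)) and maxOf(sortedInput(n, seed)) runs the same loop over the same values; only the order of the inputs differs, which is the difference the heuristic would have to account for.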

I didn't quite understand Jatin's additional code below.
--
+#ifdef X86
+  // Being conservative since all the phi edges may not be set
+  // by now. This is done to skip over reduction scenarios. 
+  if (a->is_Phi() || b->is_Phi())
+    return false;
+#endif
--
Is it going to block *all* reduction scenarios? I have seen the intrinsics benefit reductions in some cases. Also, in my opinion, adding this kind of platform-dependent macro to HotSpot shared code is not a good idea.
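
To illustrate what I mean by a reduction, here is a small Java sketch (the class and method names are hypothetical). My assumption is that in the first method the loop-carried accumulator shows up as a Phi input to the min node, so the check above would skip it, while in the second method both inputs are plain array loads.

```java
// Hypothetical illustration: a min reduction versus an element-wise min.
final class PhiInputs {
    // Reduction: 'acc' is carried across loop iterations, so one input
    // of Math.min is (by assumption) a Phi in the compiler's IR.
    static float minReduce(float[] a) {
        float acc = Float.POSITIVE_INFINITY;
        for (int i = 0; i < a.length; i++) {
            acc = Math.min(acc, a[i]);
        }
        return acc;
    }

    // Element-wise: both inputs of Math.min are array loads; nothing
    // is carried from one iteration to the next.
    static void minPairwise(float[] a, float[] b, float[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.min(a[i], b[i]);
        }
    }
}
```

If the is_Phi() check rejects every case shaped like minReduce, then reductions that do benefit from fmin/fmax would lose the intrinsic as well.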

[1] http://hg.openjdk.java.net/jdk/jdk/rev/f15af1e2c683

--
Thanks,
Pengfei