[PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics

Mon Feb 4 11:42:51 UTC 2019

Hi Jatin,

On Mon, 4 Feb 2019 at 03:15, Bhateja, Jatin <jatin.bhateja at intel.com> wrote:
>
> Hi Blaser,
>
> Please find my responses embedded in the following mail.
>
> Regards,
> Jatin
>
> -----Original Message-----
> From: B. Blaser [mailto:bsrbnd at gmail.com]
> Sent: Monday, February 4, 2019 1:25 AM
> To: Viswanathan, Sandhya <sandhya.viswanathan at intel.com>
> Cc: hotspot-compiler-dev at openjdk.java.net; Vladimir Kozlov <vladimir.kozlov at oracle.com>; Bhateja, Jatin <jatin.bhateja at intel.com>; Deshpande, Vivek R <vivek.r.deshpande at intel.com>
> Subject: Re: [PATCH] 8217561 : X86: Add floating-point Math.min/max intrinsics
>
> One more note about the rules you added in x86_64.ad to handle legacy registers.
>
> Consider the following example:
>
> class MinMax {
>     public static void main(String... args) {
>         System.out.println(test());
>     }
>     public static double test() {
>         double d = 0.0;
>         for (; d < 1000000.0; d = java.lang.Math.max(d, d+1)) ;
>         return d;
>     }
> }
>
> Running it like this:
>
> $ java -XX:+PrintOptoAssembly -Xcomp -XX:+InlineIntrinsics -XX:-TieredCompilation -XX:CompileOnly=MinMax::test MinMax
> gives:
>
> movsd   XMM5, [rsp + #0]    # spill
> vaddsd  XMM1, XMM5, [constant table base + #0]    # load from constant table: double=#1.000000
> movsd XMM1,XMM1    ! load double (8 bytes)
> blendvpd         XMM3,XMM1,XMM5,XMM1
> blendvpd         XMM2,XMM5,XMM1,XMM1
> vmaxpd           XMM4,XMM2,XMM3
> cmppd.unordered  XMM3, XMM2, XMM2
> blendvpd         XMM1,XMM4,XMM2,XMM3
>
> Unless I missed anything, I guess 'movsd XMM1, XMM1' is useless?
>
> JATIN >> It's useless in this case, but its micro-operation will not contend for an execution unit in the backend.
>
> You should probably be able to simply get rid of 'legRegD' as you explicitly use XMM registers (for example '$a$$XMMRegister').
>
> JATIN >> The blend instruction uses VEX encoding, which prevents it from accessing XMM16-XMM31. Using the new register class legRegD
>                 constrains the register allocator to pick a physical register that is legal for that encoding, and in that case this move instruction will be useful.

Right.
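
For context on why the quoted listing is longer than a single vmaxpd: Math.max has to return NaN if either argument is NaN and treats +0.0 as greater than -0.0, which, as far as I know, a bare vmaxpd does not guarantee. Below is a minimal Java sketch of those semantics. The class and method names are made up for illustration and are not part of the patch.

```java
public class MaxSemantics {
    // Hypothetical reference implementation, written without Math.max,
    // following the documented behaviour of Math.max(double, double):
    // NaN wins over everything, and +0.0 is greater than -0.0.
    static double referenceMax(double a, double b) {
        if (a != a) return a;               // a is NaN
        if (b != b) return b;               // b is NaN
        if (a == 0.0 && b == 0.0) {
            // Both are zeros: prefer +0.0, whose raw bit pattern is all zero.
            return (Double.doubleToRawLongBits(a) == 0L) ? a : b;
        }
        return (a >= b) ? a : b;
    }

    // Bitwise comparison so that -0.0 and +0.0 are told apart and
    // all NaNs compare equal (doubleToLongBits canonicalizes NaN).
    static boolean sameBits(double x, double y) {
        return Double.doubleToLongBits(x) == Double.doubleToLongBits(y);
    }

    public static void main(String... args) {
        double[] values = { -0.0, 0.0, 1.0, -1.0, Double.NaN,
                            Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY };
        for (double a : values) {
            for (double b : values) {
                boolean ok = sameBits(referenceMax(a, b), Math.max(a, b));
                System.out.println("max(" + a + ", " + b + ") matches: " + ok);
            }
        }
    }
}
```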

> Please also verify the parameters of the test case. Given the example above, I'm not sure the intrinsic is really inlined as expected; I suspect 'Math.max()' is simply compiled as before.
>
> JATIN >> Intrinsification is indeed happening in the test you provided.

Yes, of course, but I meant:

http://cr.openjdk.java.net/~sviswanathan/Jatin/8217561/webrev.01/test/hotspot/jtreg/compiler/intrinsics/math/TestFpMinMaxIntrinsics.java.html

I believe '-XX:CompileOnly=java/lang/Math' induces the compilation of
'Math.max()' but not its intrinsification, which occurs when the call
is inlined, see:

http://hg.openjdk.java.net/jdk/jdk/rev/f15af1e2c683#l9.29

http://hg.openjdk.java.net/jdk/jdk/file/d997c227e968/src/hotspot/share/opto/library_call.cpp#l6611

It would be good if someone knowledgeable in this area could confirm this.
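
To make the point concrete, here is a rough sketch of a test shaped so that the Math.min/max calls sit inside the test's own methods, which are then the compilation targets, in the same way as 'MinMax::test' in the example quoted above. The class name, method names and reference implementations are hypothetical and not taken from the webrev. I would expect something like '-XX:CompileOnly=InlinedMinMaxCheck::viaMax' to be the relevant option here, and the diagnostic option '-XX:+PrintIntrinsics' (with '-XX:+UnlockDiagnosticVMOptions') should help to see what actually happens, but this is an assumption on my side.

```java
public class InlinedMinMaxCheck {
    // Callers of Math.max/min: these are the methods one would compile,
    // so that the calls are candidates for inlining/intrinsification.
    static double viaMax(double a, double b) { return Math.max(a, b); }
    static double viaMin(double a, double b) { return Math.min(a, b); }

    // Hypothetical reference implementations that never call Math.max/min.
    static double refMax(double a, double b) {
        if (a != a) return a;                        // NaN propagates
        if (b != b) return b;
        if (a == 0.0 && b == 0.0)                    // +0.0 > -0.0
            return (Double.doubleToRawLongBits(a) == 0L) ? a : b;
        return (a >= b) ? a : b;
    }

    static double refMin(double a, double b) {
        if (a != a) return a;                        // NaN propagates
        if (b != b) return b;
        if (a == 0.0 && b == 0.0)                    // -0.0 < +0.0
            return (Double.doubleToRawLongBits(a) != 0L) ? a : b;
        return (a <= b) ? a : b;
    }

    // Counts results whose bits differ from the reference (NaNs canonicalized).
    static int mismatches(int iterations) {
        double[] v = { -0.0, 0.0, 1.5, -1.5, Double.NaN,
                       Double.MIN_VALUE, Double.MAX_VALUE,
                       Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY };
        int bad = 0;
        for (int i = 0; i < iterations; i++) {
            for (double a : v) {
                for (double b : v) {
                    if (Double.doubleToLongBits(viaMax(a, b))
                            != Double.doubleToLongBits(refMax(a, b))) bad++;
                    if (Double.doubleToLongBits(viaMin(a, b))
                            != Double.doubleToLongBits(refMin(a, b))) bad++;
                }
            }
        }
        return bad;
    }

    public static void main(String... args) {
        int bad = mismatches(10_000);
        if (bad != 0) throw new AssertionError(bad + " mismatching results");
        System.out.println("all results match the reference");
    }
}
```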

Bernard