RFR: 8254872: Optimize Rotate on AArch64

Eric Liu eric.c.liu at arm.com
Mon Nov 16 04:14:47 UTC 2020


Hi John,

Thanks for your review. I'd like to take some time to investigate the weakness.


B&R
Eric
-----------------------------------------------------------------------------------


From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of John Rose <john.r.rose at oracle.com>
Sent: 14 November 2020 2:43
To: Eric Liu <github.com+10482586+erik1iu at openjdk.java.net>
Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
Subject: Re: RFR: 8254872: Optimize Rotate on AArch64 
 
On Nov 13, 2020, at 2:39 AM, Eric Liu <github.com+10482586+erik1iu at openjdk.java.net> wrote:
> 
> This patch is a supplement for
> https://bugs.openjdk.java.net/browse/JDK-8248830.
> 
> With the implementation of rotate node in IR, this patch:
> 
> 1. canonicalizes RotateLeft into RotateRight when shift is a constant,
>   so that GVN could identify the pre-existing node better.
> 2. implements scalar rotate match rules and removes the original
>   combinations of Or and Shifts on AArch64.
> 
> This patch doesn't implement vector rotate due to the lack of
> corresponding vector instructions on AArch64.
> 
> Test case below is an explanation for this patch.
> 
>        public static int test(int i) {
>            int a =  (i >>> 29) | (i << -29);
>            int b = i << 3;
>            int c = i >>> -3;
>            int d = b | c;
>            return a ^ d;
>        }

Because of shift-count masking, this parses to nodes
equivalent to:

       public static int test(int i) {
           int a =  (i >>> 29) | (i << 3);
           int d = (i << 3) | (i >>> 29);
           // not detected: a == d  
           int r = a ^ d;
           // not detected: r == 0
           return r;
       }

If we were to work a little harder at canonicalizing
commutative expressions in IGVN, we could detect
that a==d.  (See AddNode::Ideal.)  It’s tempting to
pull on this very long string, but it’s not clear when
to stop, if not now.

In this case the better road is to canonicalize both
a and d to the same rotate node.  But maybe there’s
some benefit in reordering x|y|z and x^y^z when
x and z could combine to a rotate node.  (This isn’t
your problem!)

> Before:
> 
>        lsl     w12, w1, #3
>        lsr     w10, w1, #29
>        add     w11, w10, w12
>        orr     w12, w12, w10
>        eor     w0, w11, w12
> 
> After:
> 
>        ror     w10, w1, #29
>        eor     w0, w10, w10

Amazingly, w10^w10 does not GVN to zero!

Your test appears to rely on that weakness.
I think the weakness should be fixed in a
separate investigation.

Anyway, none of these remarks reflects
on your patch.

— John


More information about the hotspot-compiler-dev mailing list