RFR: 8254872: Optimize Rotate on AArch64

Eric Liu eric.c.liu at arm.com
Thu Nov 19 09:50:40 UTC 2020


Hi John,

I thought about this a bit more.
 
> From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of John Rose <john.r.rose at oracle.com>
> Sent: 14 November 2020 2:43
> To: Eric Liu <github.com+10482586+erik1iu at openjdk.java.net>
> Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR: 8254872: Optimize Rotate on AArch64 
>  
> On Nov 13, 2020, at 2:39 AM, Eric Liu <github.com+10482586+erik1iu at openjdk.java.net> wrote:
> > 
> > This patch is a supplement for
> > https://bugs.openjdk.java.net/browse/JDK-8248830.
> > 
> > With the implementation of rotate node in IR, this patch:
> > 
> > 1. canonicalizes RotateLeft into RotateRight when shift is a constant,
> >   so that GVN could identify the pre-existing node better.
> > 2. implements scalar rotate match rules and removes the original
> >   combinations of Or and Shifts on AArch64.
> > 
> > This patch doesn't implement vector rotate due to the lack of
> > corresponding vector instructions on AArch64.
> > 
> > Test case below is an explanation for this patch.
> > 
> >        public static int test(int i) {
> >            int a =  (i >>> 29) | (i << -29);
> >            int b = i << 3;
> >            int c = i >>> -3;
> >            int d = b | c;
> >            return a ^ d;
> >        }
> 
> Because of shift-count masking, this parses to nodes
> equivalent to:
> 
>        public static int test(int i) {
>            int a =  (i >>> 29) | (i << 3);
>            int d = (i << 3) | (i >>> 29);
>            // not detected: a == d  
>            int r = a ^ d;
>            // not detected: r == 0


>            return r;
>        }
> 
> If we were to work a little harder at canonicalizing
> commutative expressions in IGVN, we could detect
> that a==d.  (See AddNode::Ideal.)  It’s tempting to
> pull on this very long string, but it’s not clear when
> to stop, if not now.
> 
> In this case the better road is to canonicalize both
> a and d to the same rotate node.  But maybe there’s
> some benefit in reordering x|y|z and x^y^z when
> x and z could combine to a rotate node.  (This isn’t
> your problem!)
> 
> > Before:
> > 
> >        lsl     w12, w1, #3
> >        lsr     w10, w1, #29
> >        add     w11, w10, w12
> >        orr     w12, w12, w10
> >        eor     w0, w11, w12
> > 
> > After:
> > 
> >        ror     w10, w1, #29
> >        eor     w0, w10, w10
> 
> Amazingly, w10^w10 does not GVN to zero!

I think this was cased by the lack of Ideal on Xor 
node. Perhaps we can add some rules for it:

1). x ^ x ==> 0, this can solve the above issue.
2). Const ^ x ==> x ^ Const, so that GVN could replace with
    the pre-existed node. 
3). x ^ y ==> x,  if y is constant zero.
4). x ^ y ==> ~x, if y is constant bit mask value.

> 
> Your test appears to rely on that weakness.
> I think the weakness should be fixed in a
> separate investigation.
> 
> Anyway, none of these remarks reflects
> on your patch.
> 
> — John


Thanks,
Eric


More information about the hotspot-compiler-dev mailing list