RFR: 8254872: Optimize Rotate on AArch64
Eric Liu
eric.c.liu at arm.com
Thu Nov 19 09:50:40 UTC 2020
Hi John,
I thought about this a bit more.
> From: hotspot-compiler-dev <hotspot-compiler-dev-retn at openjdk.java.net> on behalf of John Rose <john.r.rose at oracle.com>
> Sent: 14 November 2020 2:43
> To: Eric Liu <github.com+10482586+erik1iu at openjdk.java.net>
> Cc: hotspot-compiler-dev at openjdk.java.net <hotspot-compiler-dev at openjdk.java.net>
> Subject: Re: RFR: 8254872: Optimize Rotate on AArch64
>
> On Nov 13, 2020, at 2:39 AM, Eric Liu <github.com+10482586+erik1iu at openjdk.java.net> wrote:
> >
> > This patch is a supplement for
> > https://bugs.openjdk.java.net/browse/JDK-8248830.
> >
> > With the implementation of rotate node in IR, this patch:
> >
> > 1. canonicalizes RotateLeft into RotateRight when shift is a constant,
> > so that GVN could identify the pre-existing node better.
> > 2. implements scalar rotate match rules and removes the original
> > combinations of Or and Shifts on AArch64.
> >
> > This patch doesn't implement vector rotate due to the lack of
> > corresponding vector instructions on AArch64.
> >
> > Test case below is an explanation for this patch.
> >
> > public static int test(int i) {
> > int a = (i >>> 29) | (i << -29);
> > int b = i << 3;
> > int c = i >>> -3;
> > int d = b | c;
> > return a ^ d;
> > }
>
> Because of shift-count masking, this parses to nodes
> equivalent to:
>
> public static int test(int i) {
> int a = (i >>> 29) | (i << 3);
> int d = (i << 3) | (i >>> 29);
> // not detected: a == d
> int r = a ^ d;
> // not detected: r == 0
> return r;
> }
>
> If we were to work a little harder at canonicalizing
> commutative expressions in IGVN, we could detect
> that a==d. (See AddNode::Ideal.) It’s tempting to
> pull on this very long string, but it’s not clear when
> to stop, if not now.
>
> In this case the better road is to canonicalize both
> a and d to the same rotate node. But maybe there’s
> some benefit in reordering x|y|z and x^y^z when
> x and z could combine to a rotate node. (This isn’t
> your problem!)
>
> > Before:
> >
> > lsl w12, w1, #3
> > lsr w10, w1, #29
> > add w11, w10, w12
> > orr w12, w12, w10
> > eor w0, w11, w12
> >
> > After:
> >
> > ror w10, w1, #29
> > eor w0, w10, w10
>
> Amazingly, w10^w10 does not GVN to zero!
I think this was cased by the lack of Ideal on Xor
node. Perhaps we can add some rules for it:
1). x ^ x ==> 0, this can solve the above issue.
2). Const ^ x ==> x ^ Const, so that GVN could replace with
the pre-existed node.
3). x ^ y ==> x, if y is constant zero.
4). x ^ y ==> ~x, if y is constant bit mask value.
>
> Your test appears to rely on that weakness.
> I think the weakness should be fixed in a
> separate investigation.
>
> Anyway, none of these remarks reflects
> on your patch.
>
> — John
Thanks,
Eric
More information about the hotspot-compiler-dev
mailing list