[aarch64-port-dev] [10] RFR: 8184049: Matching rule for ubfiz
Daniel Stewart
daniel.stewart at linaro.org
Fri Jul 28 12:21:54 UTC 2017
The latest test routines I use:
public class Test {
static int ia[] = {0xabcd1234, 1};
static long la[] = {0xabcd1234abcd1234L, 1};
static short sa[] = {(short)0xabcd1234, 1};
public static int testI() {
return (ia[0] & 0xf) << 3; // ubfiz wa,wb, #3, #4
}
public static int testI2() {
return (ia[0] & 0xf) << 32; // Shouldn't use ubfiz because shift >
// 31, *NOT* ubfiz wa,wb, #1, #4
}
public static int testI3() {
return (ia[0] & 0x8) << 5; // neg, Shouldn't use ubfiz because
// lower bits are not contiguous, *NOT* ubfiz wa,wb, #5, #4
}
public static int testI4() {
return (ia[0] & 0x7) << 3; // ubfiz wa, wb, #3, #3
}
public static int testI5() {
return (ia[0] & 0x7fff) << 16; // ubfiz wa,wb, #16, #15
}
public static int testI6() {
return (ia[0] & 0x7ff) << 21; // Shouldn't use
// ubfiz. No AND node is present as we'd be AND'ing with the only bits that
// are left after the shift.
}
public static int testI7() {
return (ia[0] & 0x3ff) << 21; // ubfiz wa, wb, #21, #10
}
public static long testL() {
return (la[0] & 0xf) << 2; // ubfiz xa, xb, #2, #4
}
public static long testL2() {
return (la[0] & 0xf) << 32; // ubfiz xa, xb, #32, #4
}
public static long testL3() {
return (la[0] & 0xfff) << 48; // ubfiz xa, xb, #48, #12
}
public static long testL4() {
return (la[0] & 0xffff) << 48; // Shouldn't match because there is
// no need for an ANDL or ANDI in the IR graph when masking 16-bits, just a
// LOADL. *NOT* ubfiz xa, xb, #48, #16
}
public static long testL5() {
return (la[0] & 0xffffff) << 17; // ubfiz xa, xb,
// #17, #24
}
public static long testConv() {
return ((long)(ia[0] & 0x3)) << 3; // ubfiz wa, wb, #3, #2
}
public static short testConv2() {
return (short)((sa[0] & 0x7) << 4); // ubfiz wa, wb, #4, #3
}
public static long testConv3() {
return (((long)ia[0]) & 0x1f) << 3; // ubfiz w0, w0, #3, #5
}
public static long testConv4() {
return ((long)(ia[0] & 0x3ff)) << 22; // ubfiz wa, wb,
// #22, #10
}
public static void main(String [] args) {
long sum = 0;
for (int i = 0; i < 10000000; i++) {
//System.out.println((int)Math.random());
sum += testI();
sum += testI2();
sum += testI3();
sum += testI4();
sum += testI5();
sum += testI6();
sum += testI7();
sum += testL();
sum += testL2();
sum += testL3();
sum += testL4();
sum += testL5();
sum += testConv();
sum += testConv2();
sum += testConv3();
sum += testConv4();
}
System.out.println(sum);
}
}
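For anyone following along, here is a rough Java model of what ubfiz
computes, plus the encoding constraint behind the failure discussed in the
quoted thread below (mask width plus shift amount must fit in the register).
This is only a sketch: the class and the names ubfiz32, ubfiz64 and
fitsUbfiz are made up for illustration, and fitsUbfiz is not the predicate
from the webrev. It also says nothing about whether C2 keeps the AND node in
the IR, which is what rules out testI6 and testL4.

```java
public class UbfizSketch {
    // Model of 32-bit UBFIZ Wd, Wn, #lsb, #width: take the low `width` bits
    // of x and place them at bit position `lsb`; all other bits are zero.
    // Assumes 0 <= lsb < 32 and 1 <= width <= 32 - lsb.
    static int ubfiz32(int x, int lsb, int width) {
        int mask = (width == 32) ? -1 : (1 << width) - 1;
        return (x & mask) << lsb;
    }

    // Same model for the 64-bit form (Xd, Xn).
    // Assumes 0 <= lsb < 64 and 1 <= width <= 64 - lsb.
    static long ubfiz64(long x, int lsb, int width) {
        long mask = (width == 64) ? -1L : (1L << width) - 1;
        return (x & mask) << lsb;
    }

    // Hypothetical eligibility check for (x & mask) << shift, where regBits
    // is 32 or 64. It only encodes the instruction's own limits.
    static boolean fitsUbfiz(long mask, int shift, int regBits) {
        if (shift < 0 || shift >= regBits) {
            return false;                  // shift does not fit the register
        }
        if (mask <= 0 || (mask & (mask + 1)) != 0) {
            return false;                  // mask must be of the form 2^k - 1
        }
        int width = Long.bitCount(mask);   // number of contiguous low bits
        return width + shift <= regBits;   // the field must not spill over
    }

    public static void main(String[] args) {
        int x = 0xabcd1234;
        // Same expression as testI(): the model agrees with plain Java.
        System.out.println(ubfiz32(x, 3, 4) == ((x & 0xf) << 3));
        // Felix's variant: width 4 + shift 30 = 34 > 32, so not eligible.
        System.out.println(fitsUbfiz(0xf, 30, 32));
        // testI3(): 0x8 is not a run of low-order bits, so not eligible.
        System.out.println(fitsUbfiz(0x8, 5, 32));
        // testL2(): width 4 + shift 32 = 36 <= 64, fine for the 64-bit form.
        System.out.println(fitsUbfiz(0xf, 32, 64));
    }
}
```

Counting the mask bits with Long.bitCount is safe here only because the
earlier check has already established that the mask is a contiguous run of
low-order bits.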
On Fri, Jul 28, 2017 at 8:20 AM, Daniel Stewart <daniel.stewart at linaro.org>
wrote:
> New Webrev (Thanks Ningsheng!)
>
> http://cr.openjdk.java.net/~njian/8184049/webrev.01/
>
> Please review.
>
> Thank you,
> Daniel
>
> On Thu, Jul 27, 2017 at 4:49 PM, Daniel Stewart <daniel.stewart at linaro.org> wrote:
>
>> I was able to get Andrew's test file to work as well, and it seems to be
>> producing the ubfiz instructions in a loop. I did modify his code to
>> ensure that the values passed to the test functions match the expected
>> types (int, long, short). The modified run() function is below. I've
>> created a new patch and it should be posted in the next few hours.
>>
>> public void run(String [] args) {
>> long sum = this.sum | (1 << 27);
>> int n = (int)sum;
>> n = xorshift32(n);
>> for (int i = 0; i < 1000; i++) {
>> //System.out.println((int)Math.random());
>> n += testI(n);
>> n += testI2(n);
>> n += testI3(n);
>> n += testI4(n);
>> n += testI5(n);
>> }
>> long n1 = (long)n;
>> for (int i = 0; i < 1000; i++) {
>> n1 += testL(n1);
>> n1 += testL2(n1);
>> n1 += testL3(n1);
>> n1 += testL4(n1);
>> n1 += testConv(n1);
>> }
>> short n2 = (short)n;
>> for (int i = 0; i < 1000; i++) {
>> n2 += testConv2((short)n2);
>> }
>> n1 += n2;
>> for (int i = 0; i< 1000; i++) {
>> n1 += testConv3(n);
>> }
>> this.sum += sum ^ n1;
>> }
>>
>> Daniel
>>
>>
>>
>> On Thu, Jul 27, 2017 at 11:13 AM, Daniel Stewart <daniel.stewart at linaro.org> wrote:
>>
>>> I'm preparing another patch right now. The issue Felix uncovered arises
>>> because the number of bits kept by the mask plus the shift amount is
>>> greater than 32. Instead of just lopping off the bits that would be
>>> shifted out, this winds up looking like a ubfx with the wrong bits masked
>>> off. I'm updating the predicate to catch this case. It doesn't appear to
>>> be a problem in the ubfx case, as the AND'ing of the bits appears to be
>>> dropped and so the match for ubfx is never even tried.
>>>
>>> Daniel
>>>
>>> On Thu, Jul 27, 2017 at 10:05 AM, Andrew Haley <aph at redhat.com> wrote:
>>>
>>>> On 25/07/17 14:35, Felix Yang wrote:
>>>> > I tried to modify the test case changing testI2() into:
>>>> >
>>>> > public static int testI2() {
>>>> > return (ia[0] & 0xf) << 30;
>>>> > }
>>>> >
>>>> > Then I got different execution results on aarch64 and x86:
>>>> >
>>>> > aarch64:
>>>> > java -XX:-TieredCompilation Test
>>>> > 2758214541841904631
>>>> >
>>>> > x86:
>>>> > java -XX:-TieredCompilation Test
>>>> > 2758195853365405696
>>>>
>>>> Hmm. I'm not seeing that problem. But on the other hand, I'm not seeing
>>>> the intrinsics used much either: in fact, they seem to be used only once
>>>> and are not used in the loop at all.
>>>>
>>>> I've been using the test at
>>>> http://cr.openjdk.java.net/~aph/8184049/Test.java
>>>>
>>>> --
>>>> Andrew Haley
>>>> Java Platform Lead Engineer
>>>> Red Hat UK Ltd. <https://www.redhat.com>
>>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>>>
>>>
>>>
>>>
>>> --
>>> Daniel Stewart
>>>
>>
>>
>>
>> --
>> Daniel Stewart
>>
>
>
>
> --
> Daniel Stewart
>
--
Daniel Stewart