RFR(s): PPC64: Use andis instead of lis/and
Doerr, Martin
martin.doerr at sap.com
Fri Nov 25 10:28:49 UTC 2016
Hi Bruno,
thank you very much for providing the webrev.
Please note that andis needs an unsigned 16 bit immediate (unlike addis).
The type conversion is wrong: (int)((short)...
Zero extend is needed instead of sign extend.
Using the operand immIhi16 should be ok in this case because we don't care about the high 32 bits of the register.
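To illustrate the sign-extend versus zero-extend point, here is a minimal sketch. The helper names hi16_signed and hi16_unsigned are hypothetical and are not taken from the webrev or from HotSpot; they only model the two ways of extracting the upper 16 bits of a 32-bit constant.

```cpp
#include <cstdint>

// Hypothetical helper: upper 16 bits of a 32-bit constant, sign-extended.
// This mirrors a (int)((short)...) style conversion and suits a signed
// 16-bit immediate such as the one taken by addis/lis.
static int hi16_signed(int32_t con) {
    return (int)((int16_t)((uint32_t)con >> 16));
}

// Hypothetical helper: upper 16 bits of a 32-bit constant, zero-extended.
// This is the form an unsigned 16-bit immediate (as in andis.) expects.
static int hi16_unsigned(int32_t con) {
    return (int)(((uint32_t)con >> 16) & 0xFFFFu);
}
```

For the constants in the listings further down (for example 133955584, i.e. 0x07FC0000) both helpers return the same value, because bit 31 is clear. They only differ once bit 31 of the constant is set: for 0x80000000 the signed variant yields -32768, while the unsigned variant yields 0x8000.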
Besides that, the change looks good to me.
Please use the bug number for RFRs in the future:
RFR(S): 8170328 PPC64: Use andis instead of lis/and
Thanks and best regards,
Martin
-----Original Message-----
From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Bruno Alexandre Rosa
Sent: Thursday, November 24, 2016 21:53
To: Igor Henrique Soares Nunes <igor.nunes at eldorado.org.br>; hotspot-compiler-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
Subject: RE: RFR(s): PPC64: Use andis instead of lis/and
Igor forgot to post the web-rev, so I'm posting it here for him:
https://igorsnunes.github.io/openjdk/webrev/8170328/
Regards,
Bruno Rosa
-----Original Message-----
From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net] On Behalf Of Igor Henrique Soares Nunes
Sent: Thursday, November 24, 2016 18:36
To: hotspot-compiler-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net
Subject: RFR(s): PPC64: Use andis instead of lis/and
Hi all,
The following webrev implements an improvement suggested by Gustavo Romero
(http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2016-October/024664.html):
https://bugs.openjdk.java.net/browse/JDK-8170328
The issue is:
Use andis. in place of the following sequence.
845 1.2e-04 : 3fff6027e134: lis r18,7
36054 0.0052 : 3fff6027e138: and r18,r17,r18
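As a rough model of why the single instruction can replace the pair, the sketch below emulates both forms on a 64-bit register value. The function names are hypothetical and the code only models the register results; it does not model the CR0 update that andis. also performs.

```cpp
#include <cstdint>

// Model of `lis rD, simm`: the signed 16-bit immediate is shifted left
// by 16 and sign-extended to 64 bits.
static int64_t lis(int16_t simm) {
    return (int64_t)simm * 65536;
}

// Model of the two-instruction sequence `lis rT, simm; and rD, rS, rT`.
static uint64_t lis_then_and(uint64_t rs, int16_t simm) {
    return rs & (uint64_t)lis(simm);
}

// Model of `andis. rD, rS, uimm`: AND with the unsigned 16-bit immediate
// shifted left by 16 (the upper 32 bits of the mask are zero).
static uint64_t andis(uint64_t rs, uint16_t uimm) {
    return rs & ((uint64_t)uimm << 16);
}
```

With the immediate 7 from the sequence above, both models give the same result for any input. When bit 15 of the immediate is set, the two forms can differ in the upper 32 bits of the result but still agree in the lower 32 bits, which is all that matters for 32-bit int operations.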
This patch resulted in small improvements in the Opto Assembly dumped code.
See the explanation below:
Situation 1.1)
03c B3: # B7 B4 <- B2 Freq: 0.899982
03c LIS R15, #133955584.hi
040 AND R14, R3, R15
044 CMPW CCR6, R14, R15
048 Beq CCR6, B7 P=0.100000 C=-1.000000
Situation 1.2)
03c B3: # B7 B4 <- B2 Freq: 0.899982
03c ANDIS R15, R3, #133955584.hi
040 LIS R17, #133955584.hi
044 CMPW CCR5, R15, R17
048 Beq CCR5, B7 P=0.100000 C=-1.000000
Situation 2.1)
370 B91: # B392 B92 <- B90 Freq: 0.000197734
370 LIS R14, #251658240.hi
374 AND R15, R3, R14
378 LIS R17, #16777216.hi
37c CMPW CCR5, R15, R17
380 Beq CCR5, B392 P=0.100000 C=-1.000000
Situation 2.2)
370 B91: # B392 B92 <- B90 Freq: 0.000197734
370 ANDIS R15, R3, #251658240.hi
374 LIS R14, #16777216.hi
378 CMPW CCR6, R15, R14
37c Beq CCR6, B392 P=0.100000 C=-1.000000
Situations 1.1 and 2.1 are without the patch; situations 1.2 and 2.2 are with the patch applied.
Comparing 2.1 and 2.2, some performance gain is seen, as one less instruction is needed.
Comparing 1.1 and 1.2, no performance gain is seen. In 1.1 the value loaded into R15 is used by both AND and CMPW (no reload).
In 1.2, ANDIS is executed first, so there is no register to reuse: the constant still has to be loaded with LIS for the CMPW, and the instruction count stays the same.
Regards,
Igor Nunes