RFR: 8255553: [PPC64] Introduce and use setbc and setnbc P10 instructions [v3]
Martin Doerr
mdoerr at openjdk.java.net
Mon Nov 2 10:09:00 UTC 2020
On Sat, 31 Oct 2020 05:02:06 GMT, Ziviani <github.com+670087+jrziviani at openjdk.org> wrote:
>> - setbc RT,BI: sets RT to 1 if CR(BI) is 1, otherwise 0.
>> - setnbc RT,BI: sets RT to -1 if CR(BI) is 1, otherwise 0.
>> Ref: PowerISA 3.1, page 129.
>>
>> These instructions are particularly useful for removing the branches
>> from the pattern `(src1<src2)? -1: ((src1>src2)? 1: 0)`, which can be
>> found in `instruct cmpL3_reg_reg_ExEx()@ppc.ad`.
>>
>> Long.toString, which generates this pattern in getChars, has shown a
>> good performance gain with these new instructions.
>>
>> Example:
>> for (int i = 0; i < 200_000; i++)
>> res = Long.toString((long)i);
>>
>> java -Xcomp -XX:CompileThreshold=1 -XX:-TieredCompilation TestToString
>>
>> Without setbc (average): 0.1178 seconds
>> With setbc (average): 0.0396 seconds
>
> Ziviani has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR.
Thanks for doing this. Please check my inline comments.
If you would like to benchmark C1, you can use -XX:TieredStopAtLevel=1 to switch off C2.
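For reference, a minimal harness around the loop from the PR description might look like the sketch below. The class name TestToString comes from the command line quoted above; the run helper and the timing code are assumptions added for illustration and are not taken from the PR.

```java
final class TestToString {
    // Runs the Long.toString loop from the PR description and
    // returns the last string produced so the work is not dead code.
    static String run(int iterations) {
        String res = "";
        for (int i = 0; i < iterations; i++) {
            res = Long.toString((long) i);
        }
        return res;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        String res = run(200_000);
        long elapsedNanos = System.nanoTime() - start;
        // Print the result together with the elapsed wall-clock time in seconds.
        System.out.printf("last=%s elapsed=%.4f s%n", res, elapsedNanos / 1e9);
    }
}
```

To measure C1 only, something like `java -Xcomp -XX:TieredStopAtLevel=1 TestToString` should work. Note that -XX:-TieredCompilation from the original command line would have to be dropped, since TieredStopAtLevel only applies when tiered compilation is enabled.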
Once you factor the new logic out, I'd strongly prefer to use it everywhere: C2, C1 (LIR_Assembler::comp_fl2i), and the interpreter (TemplateTable::lcmp, TemplateTable::float_cmp).
src/hotspot/cpu/ppc/ppc.ad line 11422:
> 11420:
> 11421: // Manifest a CmpL3 result in an integer register.
> 11422: instruct cmpL3_reg_reg_Ex(iRegIdst dst, iRegLsrc src1, iRegLsrc src2) %{
"_Ex" should be removed since it doesn't use expand any more.
src/hotspot/cpu/ppc/ppc.ad line 11425:
> 11423: match(Set dst (CmpL3 src1 src2));
> 11424: ins_cost(DEFAULT_COST*5);
> 11425: size(20);
Should be: VM_Version::has_brw() ? 16 : 20
src/hotspot/cpu/ppc/ppc.ad line 11427:
> 11425: size(20);
> 11426:
> 11427: format %{ "cmpL3_reg_reg_Ex $dst, $src1, $src2" %}
"_Ex" should be removed since it doesn't use expand any more.
src/hotspot/cpu/ppc/ppc.ad line 11441:
> 11439: __ srawi(R0, R0, 31);
> 11440: }
> 11441: __ orr($dst$$Register, $dst$$Register, R0);
Better to factor this out into MacroAssembler, e.g. MacroAssembler::set_cmp3(Register dst); // set dst to -1, 0, +1 depending on CR0
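To illustrate the intended semantics of such a helper, here is a small Java model of the branch-free P10 sequence, using the setbc and setnbc definitions from the PR description. It is only a sketch: the class Cmp3Model and its methods are hypothetical names, CR0's LT and GT bits are modelled as plain booleans, and the unordered case of float/double compares is not covered.

```java
final class Cmp3Model {
    // Models setbc: 1 if the selected CR bit is set, otherwise 0.
    static int setbc(boolean crBit) {
        return crBit ? 1 : 0;
    }

    // Models setnbc: -1 if the selected CR bit is set, otherwise 0.
    static int setnbc(boolean crBit) {
        return crBit ? -1 : 0;
    }

    // Hypothetical model of a set_cmp3 helper: combine CR0's LT and GT bits
    // into -1, 0, or +1 without any branch.
    static int setCmp3(boolean lt, boolean gt) {
        int r0 = setbc(gt);    // 1 when src1 > src2
        int dst = setnbc(lt);  // -1 when src1 < src2
        return dst | r0;       // at most one of the two values is non-zero
    }

    // Models CmpL3: the compare sets LT/GT, then setCmp3 materializes the result.
    static int cmpL3(long src1, long src2) {
        return setCmp3(src1 < src2, src1 > src2);
    }

    public static void main(String[] args) {
        System.out.println(cmpL3(1L, 2L));
        System.out.println(cmpL3(2L, 2L));
        System.out.println(cmpL3(3L, 2L));
    }
}
```

The final OR is safe because LT and GT are mutually exclusive after a compare, so the result is always one of -1, 0, or +1.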
src/hotspot/cpu/ppc/ppc.ad line 11766:
> 11764:
> 11765: // Compare float, generate -1,0,1
> 11766: instruct cmpF3_reg_reg_Ex(iRegIdst dst, regF src1, regF src2) %{
"_Ex" should be removed since it doesn't use expand any more.
src/hotspot/cpu/ppc/ppc.ad line 11859:
> 11857:
> 11858: // Compare double, generate -1,0,1
> 11859: instruct cmpD3_reg_reg_Ex(iRegIdst dst, regD src1, regD src2) %{
"_Ex" should be removed since it doesn't use expand any more.
src/hotspot/cpu/ppc/ppc.ad line 11864:
> 11862: size(20);
> 11863:
> 11864: format %{ "cmpD3_reg_reg_Ex $dst, $src1, $src2" %}
"_Ex" should be removed since it doesn't use expand any more.
src/hotspot/cpu/ppc/ppc.ad line 11862:
> 11860: match(Set dst (CmpD3 src1 src2));
> 11861: ins_cost(DEFAULT_COST*5);
> 11862: size(20);
Should be: VM_Version::has_brw() ? 16 : 20
src/hotspot/cpu/ppc/ppc.ad line 11769:
> 11767: match(Set dst (CmpF3 src1 src2));
> 11768: ins_cost(DEFAULT_COST*5);
> 11769: size(20);
Should be: VM_Version::has_brw() ? 16 : 20
src/hotspot/cpu/ppc/ppc.ad line 11771:
> 11769: size(20);
> 11770:
> 11771: format %{ "cmpF3_reg_reg_Ex $dst, $src1, $src2" %}
"_Ex" should be removed since it doesn't use expand any more.
-------------
Changes requested by mdoerr (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/907