RFR: 8311862: RISC-V: small improvements to slli [v2]

Fei Yang fyang at openjdk.org
Fri Jul 14 03:23:56 UTC 2023


On Wed, 12 Jul 2023 07:35:20 GMT, Ilya Gavrilin <duke at openjdk.org> wrote:

>> Please review this small change for slli and slli.uw
>> slli change allows to replace slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emited if Rd == Rs)
>> addi with 0 has higher chances to be just a register renaming and not utilise ALU at all
>> We have observed small positive effect on hifive (and no change on thead).
>> Also this patch changes slli.uw and allows it to be used without additional check for UseZba, also providing fallback when Zba is not available
>> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba
>> 
>> performance on hifive, before:
>> |              Benchmark              | Mode | Cnt |   Score  |   |  Error  | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt |  25 | 4035.143 | ± | 191.262 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt |  25 | 1145.807 | ± |  14.610 | ns/op |
>> 
>> with patch:
>> |              Benchmark              | Mode | Cnt |   Score  |   |  Error  | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt |  25 | 3613.943 | ± | 178.153 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt |  25 | 923.169 | ± |  47.123| ns/op |
>
> Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix typo in macroAssembler_riscv.cpp

Changes requested by fyang (Reviewer).

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4343:

> 4341:   } else {
> 4342:     slli(Rd, Rs, shamt + 32);
> 4343:     srli(Rd, Rd, 32);

I don't think this code in else block will work in the case when `shamt` >= 32. Note that the `slli.uw` instruction is the same as `slli` with `zext.w` performed on the least-significant word of `Rs` before shifting. So you might want to do a combination of 32-bit zero extension and `slli` on `Rs` instead.

-------------

PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1529534808
PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1263249121


More information about the hotspot-dev mailing list