RFR: 8334554: RISC-V: verify & fix perf of string comparison
Hamlin Li
mli at openjdk.org
Mon Jun 24 14:46:13 UTC 2024
On Mon, 24 Jun 2024 14:40:10 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2330:
>>
>>> 2328: void C2_MacroAssembler::element_compare(Register a1, Register a2, Register result, Register cnt, Register tmp1, Register tmp2,
>>> 2329: VectorRegister vr1, VectorRegister vr2, VectorRegister vrs, bool islatin, Label &DONE,
>>> 2330: bool is_m2) {
>>
>> How about adding an `Assembler::LMUL LMUL` param instead? And should we pass a larger `Assembler::m4` only for the vlen=128 case (that is, when `MaxVectorSize` is 16)? As I mentioned in [[1]](https://github.com/openjdk/jdk/pull/18382#discussion_r1645356197), an LMUL larger than needed can sometimes even hurt performance on hardware like banana-pi (vlen=256), which is kind of strange to me.
>>
>> Performance impact on banana-pi (vlen=256):
>> Before:
>>
>> Benchmark                                   (delta)  (size)  Mode  Cnt      Score      Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9   4556.938 ±  909.960  us/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9   4613.250 ±  891.120  us/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   5792.938 ±  545.470  us/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   5884.248 ± 1089.558  us/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   8506.465 ±  197.376  us/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9  14349.963 ±  253.898  us/op
>> StringCompareToDifferentLength.compareToLU        2      24  avgt    9   6084.199 ± 5148.464  us/op
>> StringCompareToDifferentLength.compareToLU        2      36  avgt    9   5194.196 ±  927.611  us/op
>> StringCompareToDifferentLength.compareToLU        2      72  avgt    9   7332.861 ±  909.214  us/op
>> StringCompareToDifferentLength.compareToLU        2     128  avgt    9   7043.723 ±  159.843  us/op
>> StringCompareToDifferentLength.compareToLU        2     256  avgt    9  11718.996 ±  552.570  us/op
>> StringCompareToDifferentLength.compareToLU        2     512  avgt    9  20471.987 ±  314.224  us/op
>> StringCompareToDifferentLength.compareToUL        2      24  avgt    9   5371.997 ± 1002.623  us/op
>> StringCompareToDifferentLength.compareToUL        2      36  avgt    9   5469.605 ± 1119.210  us/op
>> StringCompareToDifferentLength.compareToUL        2      72  avgt    9   7249.683 ±  154.028  ...
>
> The `Error` column seems huge for the `compareToLL` tests.
This is not good news for us (riscv), as it means we need to tune the LMUL for each specific intrinsic. I'm also not sure whether different boards with the same vector length would call for different LMUL values.
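
To make the selection question concrete, here is a minimal standalone sketch of the idea suggested in the quoted review: pick `m4` only when `MaxVectorSize` is 16 bytes (vlen=128). It is not the HotSpot code. `Lmul`, `select_lmul` and `vlmax` are hypothetical names, and falling back to `m2` for every other vector length is an assumption. The only fixed fact used is the RVV relation VLMAX = LMUL * VLEN / SEW.

```cpp
#include <cstdint>

// Hypothetical stand-in for Assembler::LMUL (integer multipliers only).
enum class Lmul : std::uint32_t { m1 = 1, m2 = 2, m4 = 4, m8 = 8 };

// Assumption: m4 only for MaxVectorSize == 16 bytes (vlen=128), m2 otherwise.
constexpr Lmul select_lmul(std::uint32_t max_vector_size_bytes) {
  return max_vector_size_bytes == 16 ? Lmul::m4 : Lmul::m2;
}

// Elements processed per vector instruction: VLMAX = LMUL * VLEN / SEW.
constexpr std::uint32_t vlmax(std::uint32_t vlen_bits, std::uint32_t sew_bits,
                              Lmul lmul) {
  return static_cast<std::uint32_t>(lmul) * vlen_bits / sew_bits;
}

// With this rule, vlen=128 and vlen=256 both cover 64 Latin-1 bytes (SEW=8)
// per iteration.
static_assert(vlmax(128, 8, select_lmul(16)) == 64, "vlen=128 with m4");
static_assert(vlmax(256, 8, select_lmul(32)) == 64, "vlen=256 with m2");
```

A rule keyed only on `MaxVectorSize` cannot tell apart two boards with the same vector length, which is exactly the open question above.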
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19825#discussion_r1651162840