RFR: 8317721: RISC-V: Implement CRC32 intrinsic [v2]
ArsenyBochkarev
duke at openjdk.org
Thu Dec 21 22:20:15 UTC 2023
On Thu, 21 Dec 2023 11:32:11 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> ArsenyBochkarev has updated the pull request incrementally with three additional commits since the last revision:
>>
>> - Use zero_extend instead of shifts where possible
>> - Use andn instead of notr + andr where possible
>> - Replace shNadd with one instruction in most cases
>
> src/hotspot/cpu/riscv/c1_LIRGenerator_riscv.cpp line 775:
>
>> 773:
>> 774: void LIRGenerator::do_update_CRC32(Intrinsic* x) {
>> 775: assert(UseCRC32Intrinsics, "why are we here?");
>
> I suppose the performance data is for C2 intrinsic, is there performance data for C1?
Measured and provided it in a comment below. I used the `-XX:TieredStopAtLevel=1` flag with the same benchmark on the same machine.
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3719:
>
>> 3717: Assembler::lwu(crc, tmp1, 0);
>> 3718:
>> 3719: srli(tmp1, v, 6);
>
> The comment at the beginning is `crc = table3[v&0xff]^table2[(v>>8)&0xff]^table1[(v>>16)&0xff]^table0[v>>24]`, i.e. shifts of v are `8/16/24`, but in the code here shift is `6/14/22` + `bits8 << 2`, is this intended? Can you add some comments here?
It is intended, yes. I just thought that since the table access needs the index to be shifted left by 2 it can be optimized a bit. I left the note in the comments.
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3760:
>
>> 3758: add(table1, table0, 1*256*sizeof(juint), tmp);
>> 3759: add(table2, table0, 2*256*sizeof(juint), tmp);
>> 3760: add(table3, table0, 3*256*sizeof(juint), tmp);
>
> With `add(table3, table2, 256*(junit))`, it might save one instruction.
Good catch, thanks! Fixed.
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3762:
>
>> 3760: add(table3, table0, 3*256*sizeof(juint), tmp);
>> 3761:
>> 3762: bind(L_by16);
>
> seems `L_by16` is not necessary.
Removed.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434555369
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434553356
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434553888
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434554086
More information about the hotspot-compiler-dev
mailing list