RFR: 8317721: RISC-V: Implement CRC32 intrinsic [v2]

ArsenyBochkarev duke at openjdk.org
Thu Dec 21 22:20:15 UTC 2023


On Thu, 21 Dec 2023 11:32:11 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> ArsenyBochkarev has updated the pull request incrementally with three additional commits since the last revision:
>> 
>>  - Use zero_extend instead of shifts where possible
>>  - Use andn instead of notr + andr where possible
>>  - Replace shNadd with one instruction in most cases
>
> src/hotspot/cpu/riscv/c1_LIRGenerator_riscv.cpp line 775:
> 
>> 773: 
>> 774: void LIRGenerator::do_update_CRC32(Intrinsic* x) {
>> 775:   assert(UseCRC32Intrinsics, "why are we here?");
> 
> I suppose the performance data is for C2 intrinsic, is there performance data for C1?

Measured and provided it in a comment below. I used the `-XX:TieredStopAtLevel=1` flag with the same benchmark on the same machine.

> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3719:
> 
>> 3717:   Assembler::lwu(crc, tmp1, 0);
>> 3718: 
>> 3719:   srli(tmp1, v, 6);
> 
> The comment at the beginning is `crc = table3[v&0xff]^table2[(v>>8)&0xff]^table1[(v>>16)&0xff]^table0[v>>24]`, i.e. shifts of v are `8/16/24`, but in the code here shift is `6/14/22` + `bits8 << 2`, is this intended? Can you add some comments here?

It is intended, yes. I just thought that since the table access needs the index to be shifted left by 2 it can be optimized a bit. I left the note in the comments.

> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3760:
> 
>> 3758:   add(table1, table0, 1*256*sizeof(juint), tmp);
>> 3759:   add(table2, table0, 2*256*sizeof(juint), tmp);
>> 3760:   add(table3, table0, 3*256*sizeof(juint), tmp);
> 
> With `add(table3, table2, 256*(junit))`, it might save one instruction.

Good catch, thanks! Fixed.

> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3762:
> 
>> 3760:   add(table3, table0, 3*256*sizeof(juint), tmp);
>> 3761: 
>> 3762:   bind(L_by16);
> 
> seems `L_by16` is not necessary.

Removed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434555369
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434553356
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434553888
PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434554086


More information about the hotspot-compiler-dev mailing list