RFR: 8317721: RISC-V: Implement CRC32 intrinsic [v2]

ArsenyBochkarev duke at openjdk.org
Thu Dec 21 22:20:15 UTC 2023


On Thu, 21 Dec 2023 15:36:58 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> When I tried `MacroAssembler::lwu` I got the following instructions on T-head:
>> 
>> 0.47%  ?  0x0000003fac6a8738:   li	t3,1
>> 0.51%  ?  0x0000003fac6a873a:   slli	t3,t3,0x20
>> 0.00%  ?  0x0000003fac6a873c:   addi	t3,t3,-1
>> ...
>> 2.68%  ?  0x0000003fac6a8752:   lw	a0,0(t1)
>> 5.25%  ?  0x0000003fac6a8756:   and	a0,a0,t3
>> ...
>>        ?  0x0000003fac6a876a:   lw	t4,0(t1)
>> 1.78%  ?  0x0000003fac6a876e:   and	t1,t4,t3
>> ...
>> 0.49%  ?  0x0000003fac6a8786:   lw	t4,0(t1)
>> 2.62%  ?  0x0000003fac6a878a:   and	t1,t4,t3
>> ...
>> 0.41%  ?  0x0000003fac6a87a2:   lw	t4,0(t1)
>> 3.97%  ?  0x0000003fac6a87a6:   and	t1,t4,t3
>> 
>> instead of just 
>> 
>> 4.52%  ??  0x0000003fb49e96f6:   lwu	a0,0(t1)
>> ...
>>        ??  0x0000003fb49e970a:   lwu	t3,0(t1)
>> ...
>>        ??  0x0000003fb49e9722:   lwu	t3,0(t1)
>> ...
>> 0.02%  ??  0x0000003fb49e973a:   lwu	t3,0(t1)
>
> Interesting. I tried on qemu and a `T-HEAD Light Lichee Pi 4A`, and I don't get this code generated with just `lwu(tmp2, Address(tmp1, 0));`.
> Do you know how this happens? I mean, how does it happen only on specific hardware?

Whoops, I tried to reproduce the behavior above, and it turned out to be old profiling data that I had collected before implementing the version with `lwu` (previously I used an `lw` + `andi` pair of instructions). The `MacroAssembler::lwu` version actually works fine. Thanks, everyone, for pointing it out!
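
For readers following the thread, here is a rough Java sketch of why the two quoted sequences compute the same value. It only models the register arithmetic (a sign-extending `lw`, the mask built by `li`/`slli`/`addi`, and a zero-extending `lwu`); it is not HotSpot code, and the class and method names are made up for illustration.

```java
public class ZeroExtendSketch {
    // Models RISC-V `lw`: a 32-bit load sign-extended into a 64-bit register.
    static long lw(int word) {
        return (long) word;
    }

    // Models RISC-V `lwu`: a 32-bit load zero-extended into a 64-bit register.
    static long lwu(int word) {
        return Integer.toUnsignedLong(word);
    }

    // Models the `li t3,1; slli t3,t3,0x20; addi t3,t3,-1` sequence
    // from the quoted listing, which leaves 0xFFFFFFFF in t3.
    static long mask() {
        long t3 = 1L;   // li   t3,1
        t3 <<= 32;      // slli t3,t3,0x20
        t3 += -1;       // addi t3,t3,-1
        return t3;
    }

    public static void main(String[] args) {
        int word = 0x80000001; // high bit set, so sign extension is visible
        System.out.println(Long.toHexString(lw(word)));          // ffffffff80000001
        System.out.println(Long.toHexString(lw(word) & mask())); // 80000001
        System.out.println(Long.toHexString(lwu(word)));         // 80000001
    }
}
```

So the `lw` followed by `and` with the mask gives the same result as a single `lwu`, just with extra instructions per load.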

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17046#discussion_r1434553637
