RFR: 8320999: RISC-V: C2 RotateLeftV [v2]

Fri May 24 14:57:04 UTC 2024

On Fri, 24 May 2024 14:48:24 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Not sure, could be. If this is the case, then the vector shift should be optimized too?
>> 
>> I checked the generated code, and it seems we're fine?
>> 
>>   0x00002aaac560c55a:   vmv.v.x v1,a3
>>   ... ...
>>   0x00002aaac560c594:   vle32.v v2,(a4)
>>   0x00002aaac560c598:   vsetivli        t0,8,e32,m1,tu,mu
>>   0x00002aaac560c59c:   vror.vv v2,v2,v1
>> 
>> 
>> Either way, we need 2 vector registers?
>
> Yes, I think there should be quite a few places where we could make use of vector-scalar variants, which would save us one vector register. @zifeihan has already handled some cases in vector logic instructions: https://github.com/openjdk/jdk/pull/18999, and he is currently working on handling more vector arithmetic instructions.
> 
> (One example: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/riscv_v.ad#L523)

I would also favor using `.vi` or `.vx` variants over `.vv` variants where possible. This would reduce vector register pressure and remove an unnecessary instruction (the `vmv.v.x` that broadcasts the scalar into a vector register).

@Hamlin-Li in your example, we could instead have:

  ... ...
  0x00002aaac560c594:   vle32.v v2,(a4)
  0x00002aaac560c598:   vsetivli        t0,8,e32,m1,tu,mu
  0x00002aaac560c59c:   vror.vx v2,v2,a3
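
For reference, a Java loop of roughly the following shape is the kind of code I would expect to end up as a vector rotate with a loop-invariant scalar shift count, which is the case where the `.vx` form applies. This is only a sketch: the class and method names (`RotateDemo`, `rotateAll`) are hypothetical and not taken from the PR or its tests, and whether C2 actually vectorizes the loop depends on the platform and flags.

```java
// Hypothetical example, not part of the PR: rotates every element left by
// the same scalar distance. The shift count `s` is loop-invariant, so a
// vectorized version only needs it in a scalar register.
class RotateDemo {
    static void rotateAll(int[] a, int s) {
        for (int i = 0; i < a.length; i++) {
            a[i] = Integer.rotateLeft(a[i], s);
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 0x80000000, 0x12345678};
        rotateAll(a, 4);
        for (int x : a) {
            System.out.println(Integer.toHexString(x));
        }
    }
}
```

With the `.vv` form the value of `s` first has to be broadcast into a vector register; with `.vx` it can be passed directly from the scalar register.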

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19325#discussion_r1613609262