RFR: 8347981: RISC-V: Add Zfa zli imm loads [v2]
Robbin Ehn
rehn at openjdk.org
Tue Jan 21 08:45:36 UTC 2025
On Sat, 18 Jan 2025 09:14:17 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
>> Hi, please consider!
>>
>> This patch adds the Zfa zli floating point immediate load.
>> As a bonus it adds fmv_?_x(Rd, zr); for loading fp/dp 0x0.
>> There are some more instructions in Zfa, but this was such a clear use case that I only did fli as a start.
>>
>> When using one of the 32 'popular' floats we can now materialize them without a load.
>> E.g.
>> `float f = f1 * 0.5 + f2 * 2.0;`
>> This requires only 2 loads instead of 4, as '0.5' and '2.0' are such popular float values.
>>
>> As Java is often memory bound we should also investigate doing lui+slli+fmv for floats/doubles instead of a load when materializing.
>>
>> Note the _fli_s/_fli_d will be properly merged in 8347794: RISC-V: Add Zfhmin - Float cleanup.
>>
>> Passes:
>> ./test/jdk/java/lang/Math
>> ./test/hotspot/jtreg/compiler/floatingpoint/
>> ./test/jdk/java/util/Formatter/
>> ./test/jdk/java/lang/Float/
>> ./test/jdk/java/lang/Double/
>> ./test/hotspot/jtreg/compiler/c2/FloatingPointFoldingTest.java
>> ./test/hotspot/jtreg/compiler/eliminateAutobox/
>> ./test/hotspot/jtreg/vmTestbase/jit/
>>
>> Running tier1
>>
>> Thanks!
>
> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Flip bool static decl
> - Merge branch 'master' into zfa
> - Baseline
It seems you made a non-review comment above, so I can't reply to it directly.
**can_zfa_zli_float()** method only checks if we can use a zli for that imm.
**can_fp_imm_load()** checks if we can do something better than loading the imm.
Right now we have two cases:
- imm has all bits zero (0x0)
- imm matches a zli Rs pattern
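To make the split between the two checks concrete, here is a minimal standalone sketch. It is not the code from the PR: the `sketch_*` names and `float_bits` are hypothetical, and the value list is a deliberately partial subset of the 32 zli-encodable floats, chosen only for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw IEEE-754 bit pattern of a float.
static uint32_t float_bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Hypothetical stand-in for can_zfa_zli_float(): answers only
// "is this imm one of the zli values?".
// Assumption: kSubset is a partial list, not the full 32-entry table.
static bool sketch_can_zli_float(float imm) {
  static const float kSubset[] = {-1.0f, 0.25f, 0.5f, 1.0f, 2.0f, 4.0f};
  for (float v : kSubset) {
    if (float_bits(v) == float_bits(imm)) {
      return true;
    }
  }
  return false;
}

// Hypothetical stand-in for can_fp_imm_load(): answers the broader
// question "can we avoid a memory load for this imm?".
static bool sketch_can_fp_imm_load(float imm) {
  // Case 1: all bits zero, so the value can be moved from the zero register.
  if (float_bits(imm) == 0) {
    return true;
  }
  // Case 2: the imm maps to a zli pattern.
  return sketch_can_zli_float(imm);
}
```

Comparing bit patterns rather than values keeps -0.0 out of the all-bits-zero case, since its sign bit is set.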
Many applications are memory bound, and a load may miss L1.
This means we want to do plain materializing unless we can more or less guarantee an L1 hit and that the load does not stall other loads.
Therefore we should add more cases to the above, for example:
lui + slli + fmv
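As a rough illustration of what such an extra case could look like for doubles, the sketch below assumes one particular reading of the sequence: lui fills bits 31:12 from a 20-bit immediate, slli shifts left by 32, and the fmv moves the integer register into an FP register. Under that assumption only the top 20 bits of the double (sign, exponent, and 8 mantissa bits) can be non-zero. The function names are hypothetical and the shift amount is my assumption, not something stated in the PR.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: raw IEEE-754 bit pattern of a double.
static uint64_t double_bits(double d) {
  uint64_t bits;
  std::memcpy(&bits, &d, sizeof(bits));
  return bits;
}

// Assumed check: the double can be built by lui + slli 32 + fmv
// if its low 44 bits are all zero.
static bool sketch_can_lui_slli_fmv(double imm) {
  const uint64_t low44 = (uint64_t(1) << 44) - 1;
  return (double_bits(imm) & low44) == 0;
}

// Emulates the integer part of the sequence on RV64:
//   lui  rd, imm20     -> rd = sign_extend(imm20 << 12)
//   slli rd, rd, 32    -> rd = rd << 32
// The result is the bit pattern the fmv would move to the FP register.
static uint64_t sketch_emulate_lui_slli(uint32_t imm20) {
  uint64_t rd = uint64_t(int64_t(int32_t(imm20 << 12)));
  return rd << 32;
}
```

For a value that passes the check, the 20-bit lui immediate is simply the top 20 bits of the pattern, i.e. `double_bits(imm) >> 44`.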
The callsite is asking: do I need to load from an address using fld/flw, or can I materialize?
So I prefer the callsite to be unaware that in one of the cases we map an imm to a bit pattern in Rs.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23171#issuecomment-2603987527