RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F [v3]
Galder Zamarreño
galder at openjdk.org
Fri Aug 22 11:40:10 UTC 2025
On Wed, 20 Aug 2025 06:52:47 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Galder Zamarreño has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Check at the very least that auto vectorization is supported
>
> Had a quick look again and found a few more suggestions in the tests/benchmarks.
> But I think the VM changes are solid :)
@eme64 I've refactored the benchmarks to `TypeVectorOperations`. These are the before/after throughput numbers on darwin/aarch64:
Benchmark                                                             (COUNT)  (seed)  Mode   Cnt       Base      Patch  Units    Diff
TypeVectorOperations.TypeVectorOperationsSuperWord.convertD2LBits         512       0  thrpt    8   4993.941   5127.876  ops/ms    +3%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertD2LBits        2048       0  thrpt    8   1169.952   1179.016  ops/ms    +1%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertD2LBitsRaw      512       0  thrpt    8  15394.034  27658.958  ops/ms   +80%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertD2LBitsRaw     2048       0  thrpt    8   4007.795   7347.348  ops/ms   +83%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertF2IBits         512       0  thrpt    8   5140.632   5214.131  ops/ms    +1%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertF2IBits        2048       0  thrpt    8   1187.033   1130.995  ops/ms    -5%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertF2IBitsRaw      512       0  thrpt    8  15874.272  54196.086  ops/ms  +241%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertF2IBitsRaw     2048       0  thrpt    8   4020.074  15020.595  ops/ms  +274%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertIBits2F         512       0  thrpt    8  12008.101  53389.533  ops/ms  +345%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertIBits2F        2048       0  thrpt    8   3010.701  15001.785  ops/ms  +398%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertLBits2D         512       0  thrpt    8  11947.581  28216.125  ops/ms  +136%
TypeVectorOperations.TypeVectorOperationsSuperWord.convertLBits2D        2048       0  thrpt    8   2992.392   7354.876  ops/ms  +146%
I've also added more positive IR node checks to `TestCompatibleUseDefTypeSize`.
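
For readers unfamiliar with the benchmark names, here is a rough sketch of the kind of loops they presumably measure. This is not the actual benchmark source: the class name `MoveBitsSketch` and the method bodies are assumptions inferred from the names in the table. The raw conversions (`Float.floatToRawIntBits`, `Float.intBitsToFloat`) are plain bit reinterpretations, while `Float.floatToIntBits` additionally canonicalizes NaN, which may be why the non-`Raw` rows stay roughly flat.

```java
// Hypothetical sketch; names and bodies are assumed, not taken from the PR.
class MoveBitsSketch {

    // Raw reinterpretation of each float as an int (no NaN canonicalization).
    static void convertF2IBitsRaw(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToRawIntBits(in[i]);
        }
    }

    // Same conversion, but every NaN input collapses to the canonical 0x7fc00000.
    static void convertF2IBits(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.floatToIntBits(in[i]);
        }
    }

    // Reverse direction: reinterpret each int bit pattern as a float.
    static void convertIBits2F(int[] in, float[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Float.intBitsToFloat(in[i]);
        }
    }

    public static void main(String[] args) {
        float[] in = {1.0f, -2.5f, 0.0f, Float.NaN};
        int[] bits = new int[in.length];
        convertF2IBitsRaw(in, bits);
        for (int b : bits) {
            System.out.println(Integer.toHexString(b));
        }
    }
}
```

The `double`/`long` variants would follow the same shape with `Double.doubleToRawLongBits`, `Double.doubleToLongBits` and `Double.longBitsToDouble`.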
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26457#issuecomment-3214062842