RFR: 8329077: C2 SuperWord: Add MoveD2L, MoveL2D, MoveF2I, MoveI2F [v5]
Galder Zamarreño
galder at openjdk.org
Mon Sep 1 08:51:51 UTC 2025
On Mon, 25 Aug 2025 07:13:43 GMT, Galder Zamarreño <galder at openjdk.org> wrote:
>> I've added support to vectorize `MoveD2L`, `MoveL2D`, `MoveF2I` and `MoveI2F` nodes. The implementation follows a similar pattern to what is done with conversion (`Conv*`) nodes. The tests in `TestCompatibleUseDefTypeSize` have been updated with the new expectations.
>>
>> Also added a JMH benchmark which measures throughput (the higher the number the better) for methods that exercise these nodes. On darwin/aarch64 it shows:
>>
>>
>> Benchmark                                    (seed)  (size)   Mode  Cnt      Base      Patch   Units   Diff
>> VectorBitConversion.doubleToLongBits              0    2048  thrpt    8  1168.782   1157.717  ops/ms    -1%
>> VectorBitConversion.doubleToRawLongBits           0    2048  thrpt    8  3999.387   7353.936  ops/ms   +83%
>> VectorBitConversion.floatToIntBits                0    2048  thrpt    8  1200.338   1188.206  ops/ms    -1%
>> VectorBitConversion.floatToRawIntBits             0    2048  thrpt    8  4058.248  14792.474  ops/ms  +264%
>> VectorBitConversion.intBitsToFloat                0    2048  thrpt    8  3050.313  14984.246  ops/ms  +391%
>> VectorBitConversion.longBitsToDouble              0    2048  thrpt    8  3022.691   7379.360  ops/ms  +144%
>>
>>
>> The improvements observed are a result of vectorization. `doubleToLongBits` and `floatToIntBits` do not vectorize because of control flow, so their throughput is essentially unchanged (-1%), which shows that these changes do not affect their performance.
>>
>> I've run the tier1-3 tests on linux/aarch64 and didn't observe any regressions.
>
> Galder Zamarreño has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 22 additional commits since the last revision:
>
> - Merge branch 'master' into topic.fp-bits-vector
> - Add more IR node positive assertions
> - Fix source of data for benchmarks
> - Refactor benchmarks to TypeVectorOperations
> - Check at the very least that auto vectorization is supported
> - Avoid VectorReinterpret::implemented
> - Refactor and add copyright header
> - Rephrase comment
> - Removed unnecessary assert methods
> - Adjust IR test after adding Move* vector support
> - ... and 12 more: https://git.openjdk.org/jdk/compare/54d7c4b3...e7e4d801

One correction to my suggested fix above. This rule would work with `UseAVX=1` but would fail with other `UseAVX` values:

    @IR(counts = {IRNode.LOAD_VECTOR_F, IRNode.VECTOR_SIZE_4, "> 0",

To work in all cases, it would need to be something like this:

    @IR(counts = {IRNode.LOAD_VECTOR_F, IRNode.VECTOR_SIZE_ANY, "> 0",
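
For anyone reading this without the PR open, here is a minimal sketch of the two loop shapes contrasted in the quoted description. The class and method names are hypothetical and are not taken from the PR or its benchmark. As far as I understand it, the raw variant is a plain bit reinterpretation per element, while the non-raw variant also canonicalizes NaN values, which is the control flow that keeps it from vectorizing.

```java
// Hypothetical sketch: names are illustrative, not from the PR.
final class BitConversionSketch {

    // Raw conversion: reinterprets the bits of each double as a long,
    // with no per-element branch. This is the loop shape that benefits.
    static void rawBits(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Double.doubleToRawLongBits(in[i]);
        }
    }

    // Canonicalizing conversion: every NaN input is mapped to the single
    // canonical NaN pattern 0x7ff8000000000000L, which needs a NaN check
    // per element.
    static void canonicalBits(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Double.doubleToLongBits(in[i]);
        }
    }
}
```

The two methods only differ in their output for NaN inputs that carry a non-canonical payload; for every other value they return the same bits.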
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26457#issuecomment-3241465858