RFR: 8366333: AArch64: Enhance SVE subword type implementation of vector compress
Emanuel Peter
epeter at openjdk.org
Thu Sep 18 12:58:55 UTC 2025
On Mon, 15 Sep 2025 09:58:19 GMT, erifan <duke at openjdk.org> wrote:
>> Would it make sense to additionally run the relevant benchmarks on other popular aarch64 platforms such as Graviton, to make sure the improvements are seen there as well?
>
> @galderz Yeah, absolutely. These are the test results on an **AWS Graviton3 V1 machine**; we can see a similar performance gain.
>
> Benchmark | Units | Before | Error | After | Error | Uplift
> -- | -- | -- | -- | -- | -- | --
> Byte128Vector.compress | ops/ms | 2405.511 | 0.763 | 6116.85 | 17.699 | 2.54284848
> Byte64Vector.compress | ops/ms | 1151.662 | 11.262 | 5278.924 | 6.74 | 4.58374419
> Double128Vector.compress | ops/ms | 4919.017 | 4.909 | 4940.232 | 20.143 | 1.00431285
> Double64Vector.compress | ops/ms | 37.071 | 0.778 | 37.109 | 0.945 | 1.00102506
> Float128Vector.compress | ops/ms | 9580.312 | 48.341 | 9586.499 | 74.934 | 1.0006458
> Float64Vector.compress | ops/ms | 4943.728 | 7.361 | 4941.917 | 5.871 | 0.99963368
> Int128Vector.compress | ops/ms | 9496.991 | 34.972 | 9515.122 | 29.204 | 1.00190913
> Int64Vector.compress | ops/ms | 4940.23 | 7.141 | 4941.815 | 5.077 | 1.00032084
> Long128Vector.compress | ops/ms | 4918.142 | 14.835 | 4917.148 | 9.05 | 0.99979789
> Long64Vector.compress | ops/ms | 36.58 | 0.426 | 36.574 | 0.431 | 0.99983598
> Short128Vector.compress | ops/ms | 3343.878 | 0.898 | 6813.421 | 4.143 | 2.03758062
> Short64Vector.compress | ops/ms | 1595.358 | 3.37 | 3390.959 | 3.55 | 2.12551603
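For anyone re-checking the numbers: the Uplift column appears to be the After score divided by the Before score (higher is better for ops/ms). A minimal sketch of that arithmetic follows; the class and method names are hypothetical and not part of the PR or its benchmarks.

```java
// Hypothetical helper, not from the PR: recomputes the Uplift column
// under the assumption that Uplift = After / Before (throughput in ops/ms).
public class UpliftSketch {

    // Returns the throughput ratio after/before; values > 1 mean a speedup.
    static double uplift(double before, double after) {
        if (before <= 0.0) {
            throw new IllegalArgumentException("before must be positive");
        }
        return after / before;
    }

    public static void main(String[] args) {
        // Two rows copied from the table above.
        System.out.printf("Byte128Vector.compress: %.8f%n", uplift(2405.511, 6116.85));
        System.out.printf("Short64Vector.compress: %.8f%n", uplift(1595.358, 3390.959));
    }
}
```

Under that assumption, the subword rows (Byte and Short) show the gain and the remaining rows stay within noise of 1.0, which matches the table.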
@erifan I'm going to be out of the office for 3 weeks, so feel free to ask others for reviews :)
-------------
PR Comment: https://git.openjdk.org/jdk/pull/27188#issuecomment-3307304775