RFR: 8366588: VectorAPI: Re-intrinsify VectorMask.laneIsSet where the input index is a variable

Emanuel Peter epeter at openjdk.org
Tue Sep 9 08:51:21 UTC 2025


On Fri, 5 Sep 2025 08:13:28 GMT, erifan <duke at openjdk.org> wrote:

> Intrinsic support for `VectorMask.laneIsSet` with a **variable** input index was introduced in PR #14200, but was inadvertently broken by PR #25673. This PR restores the intrinsic functionality and adds some JTReg tests.
> 
> Benchmarks on Nvidia Grace machine with 128-bit SVE:
> 
> Benchmark                        Unit    Before Score  Before Error  After Score  After Error  Uplift
> microMaskLaneIsSetByte128_var    ops/ms  21702.14415   91.902159     103472.9391  36.057447    4.767867
> microMaskLaneIsSetByte64_var     ops/ms  21468.51868   107.94177     103365.6561  69.47736     4.814754
> microMaskLaneIsSetDouble128_var  ops/ms  77489.32791   153.242699    413499.4127  311.854079   5.336211
> microMaskLaneIsSetFloat128_var   ops/ms  41034.95204   399.421823    206840.0988  74.702234    5.040583
> microMaskLaneIsSetFloat64_var    ops/ms  77607.40268   175.938921    413745.3001  149.716794   5.33126
> microMaskLaneIsSetInt128_var     ops/ms  41452.48893   76.143208     206845.9754  59.371129    4.989953
> microMaskLaneIsSetInt64_var      ops/ms  77726.2542    173.180518    413427.8838  363.575023   5.319024
> microMaskLaneIsSetLong128_var    ops/ms  77646.11218   177.496587    413403.4404  236.609314   5.3242
> microMaskLaneIsSetShort128_var   ops/ms  21374.93265   48.13101      103417.4618  34.827021    4.838259
> microMaskLaneIsSetShort64_var    ops/ms  41066.19395   353.320621    206801.109   106.408938   5.035799
> 
> 
> Benchmarks on Intel 6444y machine with 512-bit avx3:
> 
> Benchmark                        Unit    Before Score  Before Error  After Score  After Error  Uplift
> microMaskLaneIsSetByte128_var    ops/ms  57658.45497   240.209309    211643.8406  29.214532    3.670647
> microMaskLaneIsSetByte256_var    ops/ms  57451.68169   116.994128    211609.4652  160.48513    3.683259
> microMaskLaneIsSetByte512_var    ops/ms  57530.22411   311.63868     199802.8084  408.144015   3.473005
> microMaskLaneIsSetByte64_var     ops/ms  57642.2672    161.406221    205252.4464  196.86852    3.560797
> microMaskLaneIsSetDouble256_var  ops/ms  114401.3789   231.797375    361400.344   565.593984   3.159055
> microMaskLaneIsSetDouble512_var  ops/ms  57379.27882   159.699503    211476.1138  136.980026   3.685583
> microMaskLaneIsSetFloat128_var   ops/ms  113943.9512   141.062663    360855.3915  494.471996   3.166955
> microMaskLaneIsSetFloat256_var   ops/ms  57682.78182   138.142053    211659.5098  30.167972    3.66937
> microMaskLaneIsSetFloat512_var   ops/ms  57617.66405   301.748599    211246.8588  597.18949    3.666355
> microMaskLaneIsSetInt128_var     ops/ms  113914.5062   118.681382    360856.4465  555.097397   3.167783
> microMaskLaneIsSetInt256_var     ops/ms  57681.79883   112.391639    211555.6742  217.556981   3.667633
> microMaskLaneIsSetInt512_var     ops/ms  57350.20346   206.146723    211657.7207  68.461571    3.690618
> microMaskLane...

The patch looks reasonable, thanks for fixing this and writing an IR test!
I'm launching some internal testing now, should hopefully not take much more than 24h.
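
For readers without the PR open, here is a rough scalar model of what `VectorMask.laneIsSet(i)` computes when `i` is only known at run time. It is a sketch under assumptions, not the JDK implementation and not the code in this patch: the class name `LaneIsSetSketch`, the packing of lanes into a `long` (lane 0 in bit 0), and the explicit `laneCount` parameter are all hypothetical and chosen only for illustration.

```java
public final class LaneIsSetSketch {
    private LaneIsSetSketch() {}

    /**
     * Hypothetical scalar model of a mask lane test.
     * maskBits holds one bit per lane, lane 0 in the lowest bit (an assumption).
     */
    public static boolean laneIsSet(long maskBits, int laneCount, int i) {
        // A variable index has to be range-checked at run time.
        if (i < 0 || i >= laneCount) {
            throw new IndexOutOfBoundsException(
                "lane index " + i + " out of range for " + laneCount + " lanes");
        }
        // Shift the requested lane into bit 0 and test it.
        return ((maskBits >>> i) & 1L) != 0;
    }

    public static void main(String[] args) {
        long mask = 0b0101L; // lanes 0 and 2 set, 4 lanes in total
        for (int i = 0; i < 4; i++) {
            System.out.println("lane " + i + ": " + laneIsSet(mask, 4, i));
        }
    }
}
```

The `_var` benchmarks quoted above exercise the case where the index is a variable rather than a compile-time constant.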

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27113#issuecomment-3269585613

