RFR: 8373722: [TESTBUG] compiler/vectorapi/TestVectorOperationsWithPartialSize.java fails intermittently

Wed Dec 24 09:23:59 UTC 2025

On Wed, 24 Dec 2025 07:22:50 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> The test fails intermittently with the following error:
>> 
>> 
>> Caused by: java.lang.RuntimeException: assertEqualsWithTolerance: expected 0.0 but was 1.1754945E-38 (tolerance: 1.4E-44, diff: 1.1754945E-38)
>> at compiler.vectorapi.TestVectorOperationsWithPartialSize.verifyAddReductionFloat(TestVectorOperationsWithPartialSize.java:231)
>> at compiler.vectorapi.TestVectorOperationsWithPartialSize.testAddReductionFloat(TestVectorOperationsWithPartialSize.java:260)
>> 
>> 
>> The root cause is that the Vector API `reduceLanes()` does not guarantee a specific calculation order for floating-point reduction operations [1]. When the array contains extreme values, this can produce results outside the tolerance range compared to sequential scalar addition.
>> 
>> For example, given array elements:
>> 
>> [0.0f, Float.MIN_NORMAL, Float.MAX_VALUE, -Float.MAX_VALUE]
>> 
>> 
>> Sequential scalar addition produces:
>> 
>> 0.0f + Float.MIN_NORMAL + Float.MAX_VALUE - Float.MAX_VALUE = 0.0f
>> 
>> 
>> However, `reduceLanes()` might compute:
>> 
>> (0.0f + Float.MIN_NORMAL) + (Float.MAX_VALUE - Float.MAX_VALUE) = Float.MIN_NORMAL
>> 
>> 
>> The difference between the two results is `Float.MIN_NORMAL` (1.1754945E-38), which exceeds the tolerance of `Math.ulp(0.0f) * 10.0f = 1.4E-44`. Even with a 10x rounding error factor, the tolerance is insufficient for such edge cases.
>> 
>> Since `reduceLanes()` does not require a specific calculation order, differences from scalar results can be significantly larger when special or extreme maximum/minimum values are present. Using a fixed tolerance is inappropriate for such corner cases.
>> 
>> This patch fixes the issue by initializing the float array in the test with random normal values within a specified range, ensuring the result gap stays within the defined tolerance.
>> 
>> Tested locally on my AArch64 and X86_64 machines 500 times, and I didn't observe the failure again.
>> 
>> [1] https://docs.oracle.com/en/java/javase/25/docs/api/jdk.incubator.vector/jdk/incubator/vector/FloatVector.html#reduceLanes(jdk.incubator.vector.VectorOperators.Associative)
>
> test/hotspot/jtreg/compiler/vectorapi/TestVectorOperationsWithPartialSize.java line 80:
> 
>> 78:         random.fill(random.longs(), la);
>> 79:         random.fill(random.uniformFloats(1.0f, 5.0f), fa);
>> 80:         random.fill(random.uniformDoubles(1.0, 5.0), da);
> 
> Ideally our tolerance window should be narrow, and increasing the tolerance range to accommodate outliers, as you mentioned in your issue description, may defeat the purpose.
> 
> Unlike auto-vectorization, which adheres to the strict ordering semantics of the JLS, the Vector API relaxes the reduction order to give backends leeway to use a parallel reduction that does not strictly follow the sequential order.
> 
> There are multiple considerations involved: the fallback implementation performs the reduction sequentially, the inline expander always relaxes the strict ordering, and intrinsification of Add/Mul reductions is only supported on AArch64, x86 and RISC-V.
> 
> Computing the expected value using a parallel reduction could be another alternative, but then we may face similar problems on targets which do not intrinsify unordered reductions.
> 
> Tolerance modeling is a complex topic that involves both relative and absolute error, and the current 10 ULP absolute limit is not generic enough to handle the entire spectrum of values. What you have enforced now is a range-based tolerance. Did you try widening the input value range to confirm whether the 10 ULP tolerance limit is sufficient?

Yeah, I'm trying to extend the value range to `1~3000`. The tests are still running... Since the result largely depends on the random values, I am running this test `500` times on each of the SVE/NEON/X86 machines (**1500** runs in total), and have not observed any failure so far. Is that fine with you? I will update the test once all tests pass. Thanks for looking at this change!
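
For reference, the ordering effect from the issue description can be reproduced in plain scalar Java, without the Vector API. This is only a minimal sketch: the class name, the helper names, the array length of 16 and the seed are made up for illustration, and `pairwiseSum` is just one order an unordered reduction might use, not what `reduceLanes()` actually does on any backend. The tolerance check mirrors the `Math.ulp(expected) * 10.0f` form seen in the failure message.

```java
import java.util.Random;

public class ReductionOrderSketch {
    // Strict left-to-right sum, like a scalar reference loop.
    public static float sequentialSum(float[] a) {
        float sum = 0.0f;
        for (float v : a) {
            sum += v;
        }
        return sum;
    }

    // Pairwise (tree) sum over a[from, to): a hypothetical stand-in
    // for one possible order of an unordered reduction.
    public static float pairwiseSum(float[] a, int from, int to) {
        int n = to - from;
        if (n == 0) return 0.0f;
        if (n == 1) return a[from];
        int mid = from + n / 2;
        return pairwiseSum(a, from, mid) + pairwiseSum(a, mid, to);
    }

    // Absolute tolerance of 10 ULPs of the expected value.
    public static boolean withinTenUlp(float expected, float actual) {
        return Math.abs(expected - actual) <= Math.ulp(expected) * 10.0f;
    }

    public static void main(String[] args) {
        // The extreme input from the issue description.
        float[] extreme = {0.0f, Float.MIN_NORMAL, Float.MAX_VALUE, -Float.MAX_VALUE};
        float seq = sequentialSum(extreme);
        float par = pairwiseSum(extreme, 0, extreme.length);
        System.out.println("extreme: sequential=" + seq + " pairwise=" + par
                + " within10ulp=" + withinTenUlp(seq, par));

        // Random inputs in [1, 3000); length and seed are arbitrary choices.
        Random rnd = new Random(42);
        int failures = 0;
        for (int trial = 0; trial < 1000; trial++) {
            float[] a = new float[16];
            for (int i = 0; i < a.length; i++) {
                a[i] = 1.0f + rnd.nextFloat() * 2999.0f;
            }
            if (!withinTenUlp(sequentialSum(a), pairwiseSum(a, 0, a.length))) {
                failures++;
            }
        }
        System.out.println("range [1, 3000): trials outside tolerance = " + failures);
    }
}
```

With the extreme input, the sequential sum absorbs `Float.MIN_NORMAL` into `Float.MAX_VALUE` and ends at `0.0f`, while the pairwise order keeps it, so the two results differ by far more than 10 ULPs of zero. For the bounded range the sketch only counts mismatches; whether the count stays at zero depends on the array length and the actual reduction order, which is why the real test runs are still needed.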

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28960#discussion_r2645230189