Potential FloatVector.fromArray miscompilation when using index map on AVX-512 machine

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Thu Oct 5 03:07:40 UTC 2023


> I wonder if this might be relevant:
> 
> https://bugs.openjdk.org/browse/JDK-8317121

No, these bugs are unrelated. What Joel describes is an issue with 
instruction encoding (wrong register ends up encoded in the vgatherdps), 
while JDK-8317121 happens during C2 IR transformations where a vector 
load may erroneously bypass a dominating array copy.
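
To make the encoding issue concrete, here is a minimal sketch of the 
bit split Joel describes in the quoted message. It is an illustrative 
model only, not HotSpot code: the class and method names are 
hypothetical, and EVEX.X and EVEX.V' are modeled as logical values (the 
real prefix stores them inverted). It assumes, as the report suggests, 
that the existing check sets EVEX.X whenever the index encoding is 8 or 
greater.

```java
// Illustrative model only; all names here are hypothetical, not HotSpot code.
public class GatherIndexEncoding {

    // Rule as described in the report: EVEX.X is set whenever enc >= 8.
    static boolean needsXBuggy(int enc) {
        return enc >= 8;
    }

    // Rule consistent with the report's analysis: EVEX.X carries bit 3 only.
    static boolean needsXFixed(int enc) {
        return (enc & 8) != 0;
    }

    // Reassemble the index register number a decoder would see from the
    // three fields (logical values; the real prefix stores X and V' inverted).
    static int decodedIndex(int enc, boolean x) {
        int sibIndex = enc & 7;   // low three bits go to SIB.index
        int bit3 = x ? 8 : 0;     // logical EVEX.X
        int bit4 = enc & 16;      // logical EVEX.V'
        return bit4 | bit3 | sibIndex;
    }

    public static void main(String[] args) {
        for (int enc : new int[] {3, 11, 19, 27}) {
            System.out.printf("zmm%d -> buggy zmm%d, fixed zmm%d%n",
                    enc,
                    decodedIndex(enc, needsXBuggy(enc)),
                    decodedIndex(enc, needsXFixed(enc)));
        }
    }
}
```

Under this model, an intended index of zmm19 decodes as zmm27 with the 
`>= 8` rule and as zmm19 with the bit-3 rule, which matches the 
off-by-8 register seen in the crash dump. Registers 0-15 and 24-31 
decode identically under both rules.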

Best regards,
Vladimir Ivanov

> On 5/10/2023 5:07 am, Joel Knighton wrote:
>> Hello,
>> I have a very repeatable JVM crash when using the Panama Vector API 
>> that appears to be due to improper assembly emitted when compiling 
>> FloatVector.fromArray in certain circumstances. This crash occurs on 
>> (at least) OpenJDK 20 (OpenJDK Runtime Environment Temurin-20.0.2+9 
>> (build 20.0.2+9)), OpenJDK 21 (OpenJDK Runtime Environment 
>> Temurin-21+35-202309042127 (build 21-beta+35-ea)), and OpenJDK 
>> 22-internal-adhoc built from the latest in source 
>> control. I haven't yet been able to produce the compilation conditions 
>> necessary in a neatly isolated JMH benchmark, but I hope to 
>> sufficiently illustrate the conditions such that inspection of the 
>> relevant code can confirm or deny whether I'm diagnosing the problem 
>> correctly.
>>
>> The crash can occur when using 
>> `FloatVector.fromArray(VectorSpecies<Float> species, float[] a, int 
>> offset, int[] indexMap, int mapOffset)` on a machine with AVX-512 
>> extensions. hsdis on the crash dump shows the crash occurs at an 
>> instruction of the form `vgatherdps (%rsi,%zmm27,4),%zmm0{%k7}`. In 
>> particular, when using VSIB addressing, I noticed the index register 
>> was used nowhere else in the disassembly and seemed incorrect, as 
>> earlier assembly preparing the indexMap used a different destination 
>> register. Preceding the referenced `vgatherdps ...` instruction, the 
>> preparation of the indexMap was using destination %zmm19. 
>> Instrumenting assembler_x86 confirms that the code generating the 
>> `vgatherdps` intended the index register encoding to be 19. The crash 
>> never occurs when `vgatherdps` is emitted using index registers 
>> available prior to AVX-512.
>>
>> With the addressing used here, I expect the index register encoding to 
>> be split across EVEX.V’, EVEX.X, and SIB.index. Since the register 
>> encoding is always off by 8 when the crash occurs, I suspect EVEX.X, 
>> which carries the fourth bit (bit 3) of the index. Inspection of 
>> `Assembler::evgatherdps` shows that `Assembler::vex_prefix` is 
>> responsible for choosing the value of EVEX.X. `vex_prefix` defers this 
>> responsibility to `adr.xmmindex_needs_rex()` for the source address 
>> `adr`, which checks that the encoding of xmmindex is greater than or 
>> equal to 8. This check appears insufficient for AVX-512 extensions, 
>> where we wouldn't expect the fourth bit (EVEX.X) to be set for 
>> registers 16-23.
>>
>> On an OpenJDK build in which I patched `xmmindex_needs_rex` so that 
>> it does not set EVEX.X for index registers zmm16-23, I was able to run 
>> the code without crashing. I'm not sure if this is a correct or 
>> comprehensive fix, but it seems to further confirm my understanding of 
>> the cause of the crash.
>>
>> Here are gists of the patch I used 
>> (https://gist.github.com/jkni/d81262034917c4039c6f3a217dc5c04b) and 
>> an excerpt of the crash dump 
>> (https://gist.github.com/jkni/5854501cce6ce23e15d93eae1f7c2da2).
>>
>> Please let me know whether additional information would be helpful in 
>> diagnosing this issue. I would appreciate any insight.
>>
>> Thanks,
>> Joel

