Potential FloatVector.fromArray miscompilation when using index map on AVX-512 machine

David Holmes david.holmes at oracle.com
Thu Oct 5 03:00:26 UTC 2023


I wonder if this might be relevant:

https://bugs.openjdk.org/browse/JDK-8317121

?

Cheers,
David

On 5/10/2023 5:07 am, Joel Knighton wrote:
> Hello,
> I have a very repeatable JVM crash when using the Panama Vector API that 
> appears to be due to improper assembly emitted when compiling 
> FloatVector.fromArray in certain circumstances. This crash occurs on (at 
> least) OpenJDK 20 (OpenJDK Runtime Environment Temurin-20.0.2+9 (build 
> 20.0.2+9)), OpenJDK 21 (OpenJDK Runtime Environment 
> Temurin-21+35-202309042127 (build 21-beta+35-ea)), and OpenJDK 
> 22-internal-adhoc built from the latest in source 
> control. I haven't yet been able to produce the compilation conditions 
> necessary in a neatly isolated JMH benchmark, but I hope to sufficiently 
> illustrate the conditions such that inspection of the relevant code can 
> confirm or deny whether I'm diagnosing the problem correctly.
> 
> The crash can occur when using 
> `FloatVector.fromArray(VectorSpecies<Float> species, float[] a, int 
> offset, int[] indexMap, int mapOffset)` on a machine with AVX-512 
> extensions. hsdis on the crash dump shows the crash occurs at an 
> instruction of the form `vgatherdps (%rsi,%zmm27,4),%zmm0{%k7}`. In 
> particular, when using VSIB addressing, I noticed the index register was 
> used nowhere else in the disassembly and seemed incorrect, as earlier 
> assembly preparing the indexMap used a different destination register. 
> Preceding the referenced `vgatherdps ...` instruction, the preparation 
> of the indexMap was using destination %zmm19. Instrumenting 
> assembler_x86 confirms that the code generating the `vgatherdps` 
> intended the index register encoding to be 19. The crash never occurs 
> when `vgatherdps` is emitted using index registers available prior to 
> AVX-512.
> 
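
For readers unfamiliar with the index-map overload named above, here is a minimal scalar sketch of what a gather load computes, following the documented semantics that lane N reads `a[offset + indexMap[mapOffset + N]]`. The class name `GatherReference`, the `lanes` parameter (standing in for the species length), and the sample data are illustrative assumptions; this is plain Java, not the Vector API and not the code that crashed.

```java
import java.util.Arrays;

public class GatherReference {
    // Scalar model of a gather load (assumed helper, not a JDK method):
    // lane n of the result reads a[offset + indexMap[mapOffset + n]].
    static float[] gather(int lanes, float[] a, int offset, int[] indexMap, int mapOffset) {
        float[] result = new float[lanes];
        for (int n = 0; n < lanes; n++) {
            result[n] = a[offset + indexMap[mapOffset + n]];
        }
        return result;
    }

    public static void main(String[] args) {
        float[] a = {10f, 11f, 12f, 13f, 14f, 15f, 16f, 17f};
        int[] indexMap = {3, 0, 2, 1};
        // Four lanes, base offset 4, map read from position 0.
        System.out.println(Arrays.toString(gather(4, a, 4, indexMap, 0)));
        // prints [17.0, 14.0, 16.0, 15.0]
    }
}
```

On AVX-512 hardware the JIT can lower this per-lane indexing to a single `vgatherdps`, with the index map held in a vector register; that register is the one whose encoding is in question here.
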
> With the addressing used here, I expect the index register encoding to 
> be split across EVEX.V', EVEX.X, and SIB.index. Since the register 
> encoding is always off by 8 when the crash occurs, I suspect EVEX.X, 
> which carries the fourth bit (bit 3) of the encoding. Inspection of 
> `Assembler::evgatherdps` shows that 
> `Assembler::vex_prefix` is responsible for choosing the value of EVEX.X. 
> vex_prefix defers this responsibility to `adr.xmmindex_needs_rex()` for 
> the source address `adr`, which checks that the encoding of xmmindex is 
> greater than or equal to 8. This check appears insufficient for AVX-512 
> extensions, where we wouldn't expect the fourth bit (EVEX.X) to be set 
> for registers 16-23.
> 
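
The arithmetic behind the "off by 8" observation can be sketched as a small Java model. It assumes the split described above: the low three bits go to SIB.index, bit 3 to EVEX.X, and bit 4 to EVEX.V'. It works with logical bit values and ignores the fact that the prefix stores some of these bits inverted. All names (`GatherIndexBits`, `needsRexByRange`, `needsRexByBit`, `decode`) are hypothetical; the real HotSpot code is C++, and `needsRexByBit` is only a guess at the shape of a fix, not the patch in the gist.

```java
public class GatherIndexBits {
    // Low three bits of the register encoding, carried in SIB.index.
    static int sibIndex(int enc) { return enc & 7; }

    // Bit 4 of the encoding, carried (logically) in EVEX.V'.
    static int bit4(int enc) { return (enc >> 4) & 1; }

    // Model of the check described in the report: encoding >= 8.
    static boolean needsRexByRange(int enc) { return enc >= 8; }

    // Assumed alternative: look only at bit 3 of the encoding.
    static boolean needsRexByBit(int enc) { return (enc & 8) != 0; }

    // Register number the CPU would reconstruct from the three fields.
    static int decode(int low3, boolean xSet, int high) {
        return (high << 4) | ((xSet ? 1 : 0) << 3) | low3;
    }

    public static void main(String[] args) {
        int enc = 19; // zmm19, the intended index register
        int byRange = decode(sibIndex(enc), needsRexByRange(enc), bit4(enc));
        int byBit = decode(sibIndex(enc), needsRexByBit(enc), bit4(enc));
        System.out.println("range check -> zmm" + byRange); // prints range check -> zmm27
        System.out.println("bit check   -> zmm" + byBit);   // prints bit check   -> zmm19
    }
}
```

Under this model the two checks agree for encodings 0-15 and 24-31 and differ only for 16-23, which is consistent with the crash never appearing for index registers available before AVX-512.
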
> On an OpenJDK build I produced with patched `xmmindex_needs_rex` to not 
> set EVEX.X for index registers zmm16-23, I was able to run the code 
> without crashing. I'm not sure if this is a correct or comprehensive 
> fix, but it seems to further confirm my understanding of the cause of 
> the crash.
> 
> Here are gists of the patch I used 
> (https://gist.github.com/jkni/d81262034917c4039c6f3a217dc5c04b) and an 
> excerpt of the crash dump 
> (https://gist.github.com/jkni/5854501cce6ce23e15d93eae1f7c2da2).
> 
> Please let me know whether additional information would be helpful in 
> diagnosing this issue. I would appreciate any insight.
> 
> Thanks,
> Joel

