Potential FloatVector.fromArray miscompilation when using index map on AVX-512 machine
David Holmes
david.holmes at oracle.com
Thu Oct 5 03:00:26 UTC 2023
I wonder whether this might be relevant:
https://bugs.openjdk.org/browse/JDK-8317121
Cheers,
David
On 5/10/2023 5:07 am, Joel Knighton wrote:
> Hello,
> I have a very repeatable JVM crash when using the Panama Vector API that
> appears to be due to improper assembly emitted when compiling
> FloatVector.fromArray in certain circumstances. This crash occurs on (at
> least) OpenJDK 20 (OpenJDK Runtime Environment Temurin-20.0.2+9 (build
> 20.0.2+9)), OpenJDK 21 (OpenJDK Runtime Environment
> Temurin-21+35-202309042127 (build 21-beta+35-ea)), and OpenJDK
> 22-internal-adhoc built from the latest in source control. I haven't yet
> been able to reproduce the necessary compilation conditions in a neatly
> isolated JMH benchmark, but I hope to sufficiently
> illustrate the conditions such that inspection of the relevant code can
> confirm or deny whether I'm diagnosing the problem correctly.
>
> The crash can occur when using
> `FloatVector.fromArray(VectorSpecies<Float> species, float[] a, int
> offset, int[] indexMap, int mapOffset)` on a machine with AVX-512
> extensions. hsdis on the crash dump shows the crash occurs at an
> instruction of the form `vgatherdps (%rsi,%zmm27,4),%zmm0{%k7}`. In
> particular, when using VSIB addressing, I noticed the index register was
> used nowhere else in the disassembly and seemed incorrect, as earlier
> assembly preparing the indexMap used a different destination register.
> Preceding the referenced `vgatherdps ...` instruction, the preparation
> of the indexMap was using destination %zmm19. Instrumenting
> assembler_x86 confirms that the code generating the `vgatherdps`
> intended the index register encoding to be 19. The crash never occurs
> when `vgatherdps` is emitted using index registers available prior to
> AVX-512.
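
For readers unfamiliar with the index-map overload, here is a scalar sketch of what the gather is expected to compute, assuming the documented lane mapping `a[offset + indexMap[mapOffset + N]]` for lane N. The class and method names are hypothetical and this is plain Java, not the Vector API itself; it only illustrates the semantics that the `vgatherdps` instruction implements in hardware.

```java
import java.util.Arrays;

// Hypothetical scalar model of the index-map gather; not part of the JDK.
final class ScalarGather {
    // Lane n receives a[offset + indexMap[mapOffset + n]].
    static float[] gather(float[] a, int offset, int[] indexMap, int mapOffset, int lanes) {
        float[] out = new float[lanes];
        for (int n = 0; n < lanes; n++) {
            out[n] = a[offset + indexMap[mapOffset + n]];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {10f, 11f, 12f, 13f, 14f, 15f};
        int[] indexMap = {9, 3, 0, 2, 1}; // entry 0 is skipped because mapOffset is 1
        System.out.println(Arrays.toString(gather(a, 1, indexMap, 1, 4)));
    }
}
```

A wrong index register at the instruction level means the hardware reads its per-lane indices from unrelated data, which would be consistent with a crash rather than merely wrong results.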
>
> With the addressing used here, I expect the index register encoding to
> be split across EVEX.V’, EVEX.X, and SIB.index. Since the register
> encoding is always off by 8 when the crash occurs, I suspect that
> EVEX.X, which carries the fourth bit (value 8) of the encoding, is being
> set incorrectly. Inspection of `Assembler::evgatherdps` shows that
> `Assembler::vex_prefix` is responsible for choosing the value of EVEX.X.
> vex_prefix defers this responsibility to `adr.xmmindex_needs_rex()` for
> the source address `adr`, which checks that the encoding of xmmindex is
> greater than or equal to 8. This check appears insufficient for AVX-512
> extensions, where we wouldn't expect the fourth bit (EVEX.X) to be set
> for registers 16-23.
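
The arithmetic in the paragraph above can be checked with a small sketch. It assumes the split described there (SIB.index holds bits 0-2, EVEX.X bit 3, EVEX.V' bit 4) and models the `>= 8` check as described in the report; it works with logical bit values and ignores that the EVEX prefix stores these bits inverted. All names are hypothetical and this is not HotSpot code.

```java
// Hypothetical model of how a 5-bit vector index register number is split.
final class EvexIndexBits {
    static int sibIndex(int reg) { return reg & 0b111; }    // bits 0-2
    static int extX(int reg)     { return (reg >> 3) & 1; } // bit 3
    static int extV(int reg)     { return (reg >> 4) & 1; } // bit 4

    // Reassemble the register number the hardware would see.
    static int decode(int v, int x, int sibIndex) {
        return (v << 4) | (x << 3) | sibIndex;
    }

    // The check as described in the report: bit 3 is set whenever encoding >= 8.
    static int suspectedX(int reg) { return reg >= 8 ? 1 : 0; }

    public static void main(String[] args) {
        int reg = 19; // the index register the assembler intended
        int correct = decode(extV(reg), extX(reg), sibIndex(reg));
        int faulty = decode(extV(reg), suspectedX(reg), sibIndex(reg));
        System.out.println("intended zmm" + reg + ", correct zmm" + correct
                + ", with >= 8 check zmm" + faulty);
    }
}
```

Under these assumptions an intended index of 19 comes out as 27, matching the `%zmm27` seen in the crash dump, and only registers 16-23 are affected, since for 8-15 and 24-31 bit 3 is genuinely set.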
>
> On an OpenJDK build I produced with `xmmindex_needs_rex` patched so that
> it does not set EVEX.X for index registers zmm16-23, I was able to run the code
> without crashing. I'm not sure if this is a correct or comprehensive
> fix, but it seems to further confirm my understanding of the cause of
> the crash.
>
> Here are gists of the patch I used
> (https://gist.github.com/jkni/d81262034917c4039c6f3a217dc5c04b) and an
> excerpt of the crash dump
> (https://gist.github.com/jkni/5854501cce6ce23e15d93eae1f7c2da2).
>
> Please let me know whether additional information would be helpful in
> diagnosing this issue. I would appreciate any insight.
>
> Thanks,
> Joel