Potential FloatVector.fromArray miscompilation when using index map on AVX-512 machine
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Thu Oct 5 03:07:40 UTC 2023
> I wonder if this might be relevant:
>
> https://bugs.openjdk.org/browse/JDK-8317121
No, these bugs are unrelated. What Joel describes is an issue with
instruction encoding (wrong register ends up encoded in the vgatherdps),
while JDK-8317121 happens during C2 IR transformations where a vector
load may erroneously bypass a dominating array copy.
Best regards,
Vladimir Ivanov
> On 5/10/2023 5:07 am, Joel Knighton wrote:
>> Hello,
>> I have a very repeatable JVM crash when using the Panama Vector API
>> that appears to be due to improper assembly emitted when compiling
>> FloatVector.fromArray in certain circumstances. This crash occurs on
>> (at least) OpenJDK 20 (OpenJDK Runtime Environment Temurin-20.0.2+9
>> (build 20.0.2+9)), OpenJDK 21 (OpenJDK Runtime Environment
>> Temurin-21+35-202309042127 (build 21-beta+35-ea)), and OpenJDK
>> 22-internal-adhoc built from the latest in source
>> control. I haven't yet been able to produce the compilation conditions
>> necessary in a neatly isolated JMH benchmark, but I hope to
>> sufficiently illustrate the conditions such that inspection of the
>> relevant code can confirm or deny whether I'm diagnosing the problem
>> correctly.
>>
>> The crash can occur when using
>> `FloatVector.fromArray(VectorSpecies<Float> species, float[] a, int
>> offset, int[] indexMap, int mapOffset)` on a machine with AVX-512
>> extensions. hsdis on the crash dump shows the crash occurs at an
>> instruction of the form `vgatherdps (%rsi,%zmm27,4),%zmm0{%k7}`. In
>> particular, when using VSIB addressing, I noticed the index register
>> was used nowhere else in the disassembly and seemed incorrect, as
>> earlier assembly preparing the indexMap used a different destination
>> register. Preceding the referenced `vgatherdps ...` instruction, the
>> preparation of the indexMap was using destination %zmm19.
>> Instrumenting assembler_x86 confirms that the code generating the
>> `vgatherdps` intended the index register encoding to be 19. The crash
>> never occurs when `vgatherdps` is emitted using index registers
>> available prior to AVX-512.
>>
>> With the addressing used here, I expect the index register encoding to
>> be split across EVEX.V’, EVEX.X, and SIB.index. Since the register
>> encoding is always off by 8 when the crash occurs, I suspect EVEX.X,
>> the fourth bit. Inspection of `Assembler::evgatherdps` shows that
>> `Assembler::vex_prefix` is responsible for choosing the value of
>> EVEX.X. vex_prefix defers this responsibility to
>> `adr.xmmindex_needs_rex()` for the source address `adr`, which checks
>> that the encoding of xmmindex is greater than or equal to 8. This
>> check appears insufficient for AVX-512 extensions, where we wouldn't
>> expect the fourth bit (EVEX.X) to be set for registers 16-23.
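The off-by-8 reasoning above can be checked with a small model. This is only a sketch, not HotSpot code: the class and method names are hypothetical, the bits are shown in logical (non-inverted) form although the real EVEX prefix stores X and V' inverted, and the `>= 8` test is assumed to behave as the message describes.

```java
// Illustrative model only: not HotSpot code. Bits are shown in logical
// (non-inverted) form; the real EVEX prefix stores X and V' inverted.
class VsibIndexEncoding {

    // Reassemble the 5-bit index register number the way a decoder would:
    // V' supplies bit 4, X supplies bit 3, SIB.index supplies bits 2..0.
    static int decode(int vPrime, int x, int sibIndex) {
        return (vPrime << 4) | (x << 3) | sibIndex;
    }

    // Hypothetical stand-in for the check described in the message:
    // X is set whenever the register encoding is >= 8.
    static int registerSeenWithGeq8Check(int reg) {
        int x = (reg >= 8) ? 1 : 0;
        return decode((reg >> 4) & 1, x, reg & 7);
    }

    // Alternative for comparison: X is taken directly from bit 3.
    static int registerSeenWithBit3Check(int reg) {
        int x = (reg >> 3) & 1;
        return decode((reg >> 4) & 1, x, reg & 7);
    }

    public static void main(String[] args) {
        // Print every register for which the ">= 8" check changes
        // the register number that ends up encoded.
        for (int reg = 0; reg < 32; reg++) {
            int seen = registerSeenWithGeq8Check(reg);
            if (seen != reg) {
                System.out.printf(
                    "zmm%d is encoded as zmm%d (bit-3 check gives zmm%d)%n",
                    reg, seen, registerSeenWithBit3Check(reg));
            }
        }
    }
}
```

In this model only zmm16 through zmm23 are affected, each shifted up by 8, and zmm19 comes out as zmm27, which matches the index register seen in the crash dump.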
>>
>> On an OpenJDK build I produced with patched `xmmindex_needs_rex` to
>> not set EVEX.X for index registers zmm16-23, I was able to run the
>> code without crashing. I'm not sure if this is a correct or
>> comprehensive fix, but it seems to further confirm my understanding of
>> the cause of the crash.
>>
>> Here are gists of the patch I used
>> (https://gist.github.com/jkni/d81262034917c4039c6f3a217dc5c04b) and
>> an excerpt of the crash dump
>> (https://gist.github.com/jkni/5854501cce6ce23e15d93eae1f7c2da2).
>>
>> Please let me know whether additional information would be helpful in
>> diagnosing this issue. I would appreciate any insight.
>>
>> Thanks,
>> Joel