Potential FloatVector.fromArray miscompilation when using index map on AVX-512 machine
Joel Knighton
joel.knighton at datastax.com
Tue Oct 10 23:02:48 UTC 2023
You're welcome -- thanks all for taking the time to reply. I want to ensure
that this issue isn't lost in the shuffle and can be triaged/tracked
appropriately. Is this the right place for the mentioned folks to comment,
or might I be better off taking this to somewhere like hotspot-compiler-dev?
This is my first time attempting to report an issue directly to OpenJDK and
would value any advice.
Best,
Joel
On Wed, Oct 4, 2023 at 3:44 PM Paul Sandoz <paul.sandoz at oracle.com> wrote:
> Hi Joel,
>
> Thank you for the detailed report, it's very helpful. I’ll let Intel folks
> more knowledgeable than I on HotSpot+AVX-512 comment on the specifics. I am
> presuming a simpler reproducing test case needs to induce just enough
> register pressure, which may be tricky to do reliably.
>
> Paul.
>
> > On Oct 4, 2023, at 12:07 PM, Joel Knighton <joel.knighton at datastax.com>
> wrote:
> >
> > Hello,
> > I have a very repeatable JVM crash when using the Panama Vector API that
> appears to be due to improper assembly emitted when compiling
> FloatVector.fromArray in certain circumstances. This crash occurs on (at
> least) OpenJDK 20 (OpenJDK Runtime Environment Temurin-20.0.2+9 (build
> 20.0.2+9)), OpenJDK 21 (OpenJDK Runtime Environment
> Temurin-21+35-202309042127 (build 21-beta+35-ea)), and OpenJDK
> 22-internal-adhoc built from the latest in source
> control. I haven't yet been able to produce the compilation conditions
> necessary in a neatly isolated JMH benchmark, but I hope to sufficiently
> illustrate the conditions such that inspection of the relevant code can
> confirm or deny whether I'm diagnosing the problem correctly.
> >
> > The crash can occur when using
> `FloatVector.fromArray(VectorSpecies<Float> species, float[] a, int offset,
> int[] indexMap, int mapOffset)` on a machine with AVX-512 extensions. hsdis
> on the crash dump shows the crash occurs at an instruction of the form
> `vgatherdps (%rsi,%zmm27,4),%zmm0{%k7}`. In particular, when using VSIB
> addressing, I noticed the index register was used nowhere else in the
> disassembly and seemed incorrect, as earlier assembly preparing the
> indexMap used a different destination register. Preceding the referenced
> `vgatherdps ...` instruction, the preparation of the indexMap was using
> destination %zmm19. Instrumenting assembler_x86 confirms that the code
> generating the `vgatherdps` intended the index register encoding to be 19.
> The crash never occurs when `vgatherdps` is emitted using index registers
> available prior to AVX-512.
> >
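For reference, the lane mapping that this `fromArray` overload is documented to perform can be written as a plain scalar loop. This is only a sketch of the expected result, not the code HotSpot emits; the class and method names (`GatherReference`, `gather`) and the sample data are made up for illustration.

```java
import java.util.Arrays;

// Scalar reference model of an indexed gather (hypothetical helper, not JDK code).
final class GatherReference {

    // Lane n is loaded from a[offset + indexMap[mapOffset + n]], which is the
    // mapping documented for FloatVector.fromArray with an index map.
    static float[] gather(int lanes, float[] a, int offset, int[] indexMap, int mapOffset) {
        float[] out = new float[lanes];
        for (int n = 0; n < lanes; n++) {
            out[n] = a[offset + indexMap[mapOffset + n]];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {10f, 11f, 12f, 13f, 14f, 15f, 16f, 17f};
        int[] indexMap = {3, 0, 2, 1};
        // With offset 4, the lanes come from a[7], a[4], a[6], a[5].
        System.out.println(Arrays.toString(gather(4, a, 4, indexMap, 0)));
    }
}
```

On AVX-512 hardware the compiled intrinsic is expected to produce the same lanes with a single `vgatherdps`, so a loop like this can serve as a cross-check when the vectorized path is suspect.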
> > With the addressing used here, I expect the index register encoding to
> be split across EVEX.V’, EVEX.X, and SIB.index. Since the register encoding
> is always off by 8 when the crash occurs, I suspect EVEX.X, the fourth bit.
> Inspection of `Assembler::evgatherdps` shows that `Assembler::vex_prefix`
> is responsible for choosing the value of EVEX.X. vex_prefix defers this
> responsibility to `adr.xmmindex_needs_rex()` for the source address `adr`,
> which checks that the encoding of xmmindex is greater than or equal to 8.
> This check appears insufficient for AVX-512 extensions, where we wouldn't
> expect the fourth bit (EVEX.X) to be set for registers 16-23.
> >
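The off-by-8 arithmetic can be checked with a small model of how the 5-bit index register number is split. This is a sketch under simplifying assumptions: it treats EVEX.V' as bit 4, EVEX.X as bit 3, and SIB.index as bits 2..0 of the logical register number, and it ignores the fact that real EVEX prefixes store these bits inverted. The names `EvexIndexModel`, `needsRexAsDescribed`, and `decodeIndex` are hypothetical and are not HotSpot code.

```java
// Toy model of the index register split across EVEX.V', EVEX.X and SIB.index.
final class EvexIndexModel {

    // The check as described in this report: X is set whenever encoding >= 8.
    static boolean needsRexAsDescribed(int enc) {
        return enc >= 8;
    }

    // Reassemble the register number the CPU would see, given the X bit chosen
    // by the assembler. V' and SIB.index are taken from the intended encoding.
    static int decodeIndex(int enc, boolean xBit) {
        int vPrime = (enc >> 4) & 1;   // bit 4 of the register number
        int x = xBit ? 1 : 0;          // bit 3, chosen by the predicate
        int sibIndex = enc & 7;        // bits 2..0
        return (vPrime << 4) | (x << 3) | sibIndex;
    }

    public static void main(String[] args) {
        int intended = 19; // the index register the code generator asked for
        int seen = decodeIndex(intended, needsRexAsDescribed(intended));
        System.out.println("intended zmm" + intended + ", encoded zmm" + seen);
    }
}
```

Under this model an intended index of zmm19 comes out as zmm27, which matches the register in the crashing `vgatherdps` instruction.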
> > On an OpenJDK build I produced with patched `xmmindex_needs_rex` to not
> set EVEX.X for index registers zmm16-23, I was able to run the code without
> crashing. I'm not sure if this is a correct or comprehensive fix, but it
> seems to further confirm my understanding of the cause of the crash.
> >
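One way to see which registers such a change affects is to compare the described `>= 8` check against a test of bit 3 alone over all 32 register numbers. The bit test is an assumption made here for illustration and is not a copy of the patch in the gist; `IndexRexCheck`, `asDescribed`, and `bitTest` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Compares two candidate predicates for setting EVEX.X on the index register.
final class IndexRexCheck {

    // The check as described in this report.
    static boolean asDescribed(int enc) {
        return enc >= 8;
    }

    // Assumed alternative: set X only when bit 3 of the encoding is set.
    static boolean bitTest(int enc) {
        return (enc & 8) != 0;
    }

    // Register numbers (0..31) on which the two predicates disagree.
    static List<Integer> disagreements() {
        List<Integer> out = new ArrayList<>();
        for (int enc = 0; enc < 32; enc++) {
            if (asDescribed(enc) != bitTest(enc)) {
                out.add(enc);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("predicates differ for: " + disagreements());
    }
}
```

The two predicates differ only for encodings 16 through 23, which is consistent with the crash never appearing when the index register is one that existed before AVX-512.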
> > Here are gists of the patch I used (
> https://gist.github.com/jkni/d81262034917c4039c6f3a217dc5c04b) and an
> excerpt of the crash dump (
> https://gist.github.com/jkni/5854501cce6ce23e15d93eae1f7c2da2).
> >
> > Please let me know whether additional information would be helpful in
> diagnosing this issue. I would appreciate any insight.
> >
> > Thanks,
> > Joel
>
>