RFR: 8327652: S390x: Implements SLP support [v9]

Andrew Haley aph at openjdk.org
Mon Aug 26 13:42:07 UTC 2024


On Mon, 26 Aug 2024 11:05:08 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

> Please don't revert it. The old code wasn't good. The problem should be understood and fixed. Otherwise, this PR should not get integrated. It's a high risk to integrate buggy vector code. We had tons of crashes on PPC64 with the initial version. Also see https://bugs.openjdk.org/browse/JDK-8188802.

Mmm, but C2 is very much designed around 32-bit slots, and the register allocator assumes that every vec is some multiple of 32-bit slots. Here:


  // SlotsPerLong is 2, since slots are 32 bits and longs are 64 bits.
  // Also, consider the maximum alignment size for a normally allocated
  // value.  Since we allocate register pairs but not register quads (at
  // present), this alignment is SlotsPerLong (== 2).  A normally
  // aligned allocated register is either a single register, or a pair
  // of adjacent registers, the lower-numbered being even.
  // See also is_aligned_Pairs() below, and the padding added before
  // Matcher::_new_SP to keep allocated pairs aligned properly.
  // If we ever go to quad-word allocations, SlotsPerQuad will become
  // the controlling alignment constraint.  Note that this alignment
  // requirement is internal to the allocator, and independent of any
  // particular platform.
  enum { SlotsPerLong = 2,
         SlotsPerVecA = 4,
         SlotsPerVecS = 1,
         SlotsPerVecD = 2,
         SlotsPerVecX = 4,
         SlotsPerVecY = 8,
         SlotsPerVecZ = 16,
         SlotsPerRegVectMask = X86_ONLY(2) NOT_X86(1)
         };


This change that removes the subwords from the `reg_def` is wrong: https://github.com/openjdk/jdk/pull/18162/commits/3caa470c0f89be306e5b43c5da4ca9e625abfe6b . It is, in essence, lying to the register allocator. Sure, you can get away with this as long as you never have to refer directly to individual registers, which is what Amit's patch is doing.

We must ensure that every vector register, as declared, is a multiple of the correct number of slots in order for all this to work correctly.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2310243935


More information about the hotspot-dev mailing list