RFR: 8327652: S390x: Implements SLP support [v9]

Andrew Haley aph at openjdk.org
Mon Aug 26 16:57:06 UTC 2024


On Mon, 26 Aug 2024 13:39:35 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Please don't revert it. The old code wasn't good. The problem should be understood and fixed. Otherwise, this PR should not get integrated. It's a high risk to integrate buggy vector code. We had tons of crashes on PPC64 with the initial version. Also see https://bugs.openjdk.org/browse/JDK-8188802. Does the CPU you were building on support the vector instructions (SuperwordUseVX and UseSFPV)?
>
>> Please don't revert it. The old code wasn't good. The problem should be understood and fixed. Otherwise, this PR should not get integrated. It's a high risk to integrate buggy vector code. We had tons of crashes on PPC64 with the initial version. Also see https://bugs.openjdk.org/browse/JDK-8188802.
> 
> Mmm, but C2 is very much designed around 32-bit slots, and the register allocator assumes that every vec is some multiple of 32-bit slots. Here:
> 
> 
>   // SlotsPerLong is 2, since slots are 32 bits and longs are 64 bits.
>   // Also, consider the maximum alignment size for a normally allocated
>   // value.  Since we allocate register pairs but not register quads (at
>   // present), this alignment is SlotsPerLong (== 2).  A normally
>   // aligned allocated register is either a single register, or a pair
>   // of adjacent registers, the lower-numbered being even.
>   // See also is_aligned_Pairs() below, and the padding added before
>   // Matcher::_new_SP to keep allocated pairs aligned properly.
>   // If we ever go to quad-word allocations, SlotsPerQuad will become
>   // the controlling alignment constraint.  Note that this alignment
>   // requirement is internal to the allocator, and independent of any
>   // particular platform.
>   enum { SlotsPerLong = 2,
>          SlotsPerVecA = 4,
>          SlotsPerVecS = 1,
>          SlotsPerVecD = 2,
>          SlotsPerVecX = 4,
>          SlotsPerVecY = 8,
>          SlotsPerVecZ = 16,
>          SlotsPerRegVectMask = X86_ONLY(2) NOT_X86(1)
>          };
> 
> 
> This change that removes the subwords from the `reg_def` is wrong: https://github.com/openjdk/jdk/pull/18162/commits/3caa470c0f89be306e5b43c5da4ca9e625abfe6b . It is, in essence, lying to the register allocator. Sure, you can get away with this as long as you never have to refer directly to individual registers, which is what Amit's patch is doing.
> 
> We must ensure that every vector register, as declared, is a multiple of the correct number of slots in order for all this to work correctly.

> @theRealAph: I didn't get the "lying to the register allocator" part. `SlotsPerVecX` is defined to be 4 slots and a slot has 4 Bytes. 16 Bytes is the correct vector width. Note that VecX is also used on PPC64 for example. Do you mean that we need to multiply the encoding() by 4? Can't that be fixed without reverting most of the new code? (only `max_vr = max_fpr + VectorRegister::number_of_registers * VectorRegister::max_slots_per_register` and `(encoding() * VectorRegister::max_slots_per_register) + ConcreteRegisterImpl::max_fpr`)

No, I mean that we need this:


  reg_def Z_VR16   ( SOC, SOC, Op_RegF, 16, Z_V16->as_VMReg()          );
  reg_def Z_VR16_H ( SOC, SOC, Op_RegF, 16, Z_V16->as_VMReg()->next()  );
  reg_def Z_VR16_J ( SOC, SOC, Op_RegF, 16, Z_V16->as_VMReg()->next(2) );
  reg_def Z_VR16_K ( SOC, SOC, Op_RegF, 16, Z_V16->as_VMReg()->next(3) );

and

reg_class z_v_reg(
  // Attention: Only these ones are saved & restored at safepoint by RegisterSaver.
  //1st 16 VRs overlaps with 1st 16 FPRs.
  Z_VR16, Z_VR16_H, Z_VR16_J, Z_VR16_K,


Because without the extra dummy register declarations, `Z_VR16` is only allocated a single `VMReg`, but it needs to be allocated 4. I have no intention of touching anything else in this commit, just restoring the vector register declarations, which were working just fine.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2310647831


More information about the hotspot-dev mailing list