RFR: 8327652: S390x: Implements SLP support [v7]

Martin Doerr mdoerr at openjdk.org
Mon Apr 1 09:04:36 UTC 2024


On Wed, 27 Mar 2024 13:24:39 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:

>>> I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly.
>> 
>> Agreed. My proposed circumvention is a too dirty hack. 
>> 
>> I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then. 
>> 
>> Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. The vector code is well encapsulated and could be removed by deleting the 
>> 
>>   if (VM_Version::has_VectorFacility()) {
>>   } 
>> 
>> block. I would not like that, though.
>
>> > I think we shouldn't allow `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)` to use vector registers without specifying this effect. That can be solved by adding a KILL effect for all vector registers which are killed. Alternatively, we could revert to the old implementation before [d5adf1d](https://github.com/openjdk/jdk/commit/d5adf1df921e5ecb8ff4c7e4349a12660069ed28) which doesn't use vector registers. The benefit was not huge if I remember correctly.
>> 
>> Agreed. My proposed circumvention is a too dirty hack.
>> 
>> I would prefer to add KILL effects to the match rules. I believe the vector implementation had a substantial performance effect. Unfortunately, I can't find any records of performance results from back then.
>> 
>> Reverting the commit @TheRealMDoerr mentioned is not possible. It contains many additions that may have been used by unrelated code. The vector code is well encapsulated and could be removed by deleting the
>> 
>> ```
>>   if (VM_Version::has_VectorFacility()) {
>>   } 
>> ```
>> 
>> block. I would not like that, though.
> 
> I didn't mean to back out the whole commit. Only the implementation of string_compress and string_expand. The benefit of the vector version certainly depends on what kind of strings are used. (Effect may also be negative in some cases.) I think that classical benchmarks didn't show a significant performance impact, but I don't remember exactly, either. I'll leave the s390 maintainers free to decide if they want to adapt the vector version or go for the short and simple implementation.

> @TheRealMDoerr and @RealLucy Just for my understanding why GPR and FPR doesn't get affected in intrinsic code as they are also allocated outside of register allocator? why only vector registers usage in intrinsic code get affected or am I missing anything here?

GPRs and FPRs already have an `effect` specified in the match rules. (If a GPR or FPR is used by a `MachNode` without proper specification, it is a critical bug.) Before this PR, it is legal to use VRs without `effect` because they are not used by register allocation. This is exploited in `MacroAssembler::string_compress(...)` and `MacroAssembler::string_expand(...)`.

With your change, these 2 intrinsics may overwrite live values! Unfortunately, they use many VRs. So, specifying a `KILL` effect for all of them may cause the register allocation to insert much spill code. May impact performance and code size.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/18162#issuecomment-2029440755


More information about the hotspot-dev mailing list