RFR: 8287697: Limit auto vectorization to 32-byte vector on Cascade Lake [v2]

Jatin Bhateja jbhateja at openjdk.java.net
Thu Jun 2 05:50:31 UTC 2022


On Thu, 2 Jun 2022 05:24:51 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> I think we missed the test with setting `MaxVectorSize` to 32 (vs 64) on Cascade Lake CPU. We should do that.
>> 
>> That may be preferable "simple fix" vs suggested changes for "short term solution".
>> 
>> The objection was that user may still want to use wide 64 bytes vectors for Vector API. But I agree with Jatin argument about that.
>> Limiting `MaxVectorSize` **will** affect our intrinsics/stubs code and may affect performance. That is why we need to test it. I will ask Eric.
>> 
>> BTW, `SuperWordMaxVectorSize` should be diagnostic or experimental since it is temporary solution.
>
>> @vnkozlov I have made SuperWordMaxVectorSize as a develop option as you suggested. As far as I know, the only intrinsics/stubs that uses MaxVectorSize are for clear/copy. This is done in conjunction with AVX3Threshold so we are ok there for Cascade Lake.
> 
> Thank you for checking stubs code.
> 
> We still have to run performance testing with this patch. We need only additional run with `MaxVectorSize=32` to compare results.
> 
> And I want @jatin-bhateja to approve this change too. Or give better suggestion.

> @vnkozlov Your review comments are resolved. @jatin-bhateja This is a simple fix for the problem in the short time frame that we have for the upcoming feature freeze. A more complex fix to enhance auto-vectorizer is a good thought.

Hi @sviswa7. This looks reasonable, since stubs and some macro-assembly routines already operate under thresholds anyway and do not strictly comply with the max vector size.
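For context, the kind of A/B run discussed above compares a SuperWord-vectorizable loop with the default vector width against `-XX:MaxVectorSize=32` on an AVX-512 machine (Cascade Lake). The sketch below is illustrative only and not taken from the PR; the class and method names are hypothetical, and real measurements would use JMH rather than a bare loop:

```java
// Hypothetical illustration: a simple loop shape that C2's SuperWord pass
// can auto-vectorize. Run once with default flags and once with
// -XX:MaxVectorSize=32 (e.g. `java -XX:MaxVectorSize=32 VecSum`) to compare
// 64-byte vs 32-byte vector code on an AVX-512 CPU.
public class VecSum {
    // Element-wise add: a candidate for auto-vectorization by C2.
    static void add(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 1024;
        float[] a = new float[n], b = new float[n], c = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        // Warm up so the loop reaches C2 and gets vectorized.
        for (int iter = 0; iter < 20_000; iter++) {
            add(a, b, c);
        }
        System.out.println(c[10]); // prints 30.0
    }
}
```

Inspecting the generated code with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly` (requires hsdis) would show whether `zmm` or `ymm` registers are used for the loop body.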

-------------

PR: https://git.openjdk.java.net/jdk/pull/8877


More information about the hotspot-compiler-dev mailing list