RFR: 8354473: Incorrect results for compress/expand tests with -XX:+EnableX86ECoreOpts

Jatin Bhateja jbhateja at openjdk.org
Thu May 1 06:23:45 UTC 2025


On Wed, 23 Apr 2025 16:21:51 GMT, Volodymyr Paprotski <vpaprotski at openjdk.org> wrote:

> > @vpaprotsk Can you please give a little more details about what exactly went wrong here, and why your change is correct?
> 
> @eme64 Thanks for looking. Point form in attempt to be concise:
> 
> * Jatin brought this to my attention, we weren't sure whose code was at fault (i.e. I wrote the blend emulation, he wrote the compress_expand) and I got to the investigation first (i.e. see https://bugs.openjdk.org/browse/JDK-8354473)
> * The mask for vblendvps instruction; actual instruction only cares about the MSB but for emulation we must have the mask to be either `FFF..FF` or `000..00`. In many places blend is used, this is already the case, so no need to recompute the mask. That's why the flag is provided (i.e. optimization).
> * (Without fully understanding the entirety of compress_expand), it appears to me that in this function the mask in `permv` _must_ be computed explicitly. That's why the flag is changed.

Hi @vpaprotsk , @eme64,

Just to fill in the missing details about compress/expand handling on AVX2, we maintain an in-memory lookup table of permutation indices corresponding to a mask value. Each row of lookup table either holds a valid permute index, which is a positive index value less than the vector lane count OR a -1 index.

Since blend emulation always expects to operate over a blend mask vector whose lanes either hold a -1 or a 0 value hence there is a need to re-compose the desired blend mask by signed extending the MSB bits to fill the entire lane. Your fix to recompute the mask looks good to me. 


Best Regards,
Jatin

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24645#issuecomment-2844181540


More information about the hotspot-compiler-dev mailing list