[vectorapi] Shape of the preferred species with UseAVX=0

Paul Sandoz paul.sandoz at oracle.com
Wed May 24 17:36:01 UTC 2023


Hi Chris,

getMaxLaneCount will return -1 *if* HotSpot is compiled with C2 disabled, a special case. It is not currently affected when C2 is disabled at runtime; we could check that too, but we need to think through the implications.

The Vector API degrades with functional equivalence but does not currently generate the same code as if it were scalar code. Unfortunately the fallback implementation in pure Java is currently slow since it operates on instances of Vector etc as ordinary classes. There is no C1 support to optimize this case.

You should not rely on the UseAVX/UseSSE flags to restrict the vector size (they are really there to emulate other hardware). For testing purposes I currently recommend setting MaxVectorSize to limit it, and checking the preferred species or the value of VectorShape.S_Max_BIT.vectorBitSize(), or otherwise explicitly detecting that C2 is disabled (it is not clear how easy that is).
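
As a rough illustration of the check on the preferred species, the sketch below reads the bit size of IntVector.SPECIES_PREFERRED. It uses reflection only so that it still compiles and runs when the jdk.incubator.vector module has not been added. The class PreferredBitSizeProbe and its method are hypothetical names chosen for this example, not part of the JDK.

```java
import java.lang.reflect.Method;
import java.util.OptionalInt;

// Hypothetical helper for illustration only; not a JDK API.
class PreferredBitSizeProbe {

    // Returns the bit size of IntVector.SPECIES_PREFERRED, or empty if the
    // incubator module is not resolved (e.g. --add-modules was not passed).
    static OptionalInt preferredIntBitSize() {
        try {
            Class<?> intVector = Class.forName("jdk.incubator.vector.IntVector");
            Object species = intVector.getField("SPECIES_PREFERRED").get(null);
            // Look the method up on the public interface rather than on the
            // non-public implementation class of the species object.
            Class<?> speciesType = Class.forName("jdk.incubator.vector.VectorSpecies");
            Method vectorBitSize = speciesType.getMethod("vectorBitSize");
            return OptionalInt.of((Integer) vectorBitSize.invoke(species));
        } catch (ReflectiveOperationException | LinkageError e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt bits = preferredIntBitSize();
        System.out.println(bits.isPresent()
                ? "Preferred int species: " + bits.getAsInt() + " bits"
                : "jdk.incubator.vector is not available");
    }
}
```

A caller could compare the returned value against whatever minimum width it considers worthwhile before choosing a vectorized code path; that threshold is an application decision, not something the API prescribes.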

Paul.

On May 24, 2023, at 5:20 AM, Chris Hegarty <chegar999 at gmail.com> wrote:

Hi,

Thanks for your reply. Let me try to clarify the actual source of the issue we see; UseAVX=0 might be related, but it could also be a red herring.

Some Lucene tests run with C2 effectively "disabled", with -XX:TieredStopAtLevel=1, and we observe horribly slow performance - much worse than the scalar equivalent. We also see that the preferred species is still wider than expected.

$ java --add-modules jdk.incubator.vector -XX:TieredStopAtLevel=1 PrintPreferredVectorSize
WARNING: Using incubator modules: jdk.incubator.vector
Species[int, 16, S_512_BIT]

But (and again this could be a red herring), VectorShape::getMaxVectorBitSize indicates that this should not be the case, e.g.

// VectorSupport.getMaxLaneCount may return -1 if C2 is not enabled,
// or a value smaller than the S_64_BIT.vectorBitSize / elementSizeInBits if MaxVectorSize < 16
// If so default to S_64_BIT

I had assumed that this is because the code will not benefit from the C2 intrinsics.

Basically, how do we determine whether the environment should use the Vector API or not (fall back to a scalar implementation)? I would assume this is a question that we should not have to answer - the vectorized implementation should degrade gracefully.

-Chris.
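
One possible way to approach the question above is to detect the -XX:TieredStopAtLevel case explicitly at runtime. The sketch below reads the flag through HotSpotDiagnosticMXBean and treats any stop level below 4 (the C2 tier in tiered compilation) as "C2 disabled". The class C2Probe and its methods are hypothetical names for this example. The check is best-effort and assumes a HotSpot VM; it does not cover other ways of ending up without C2.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration only; not a JDK or Lucene API.
class C2Probe {

    // C2 is tier 4 in HotSpot's tiered compilation, so a lower stop level
    // means compilation never reaches C2.
    static boolean stopsBeforeC2(int tieredStopAtLevel) {
        return tieredStopAtLevel < 4;
    }

    // Best-effort check: returns false whenever the flag cannot be read
    // (for example on a non-HotSpot VM or if the flag does not exist).
    static boolean c2LooksDisabled() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return false;
            }
            String value = bean.getVMOption("TieredStopAtLevel").getValue();
            return stopsBeforeC2(Integer.parseInt(value));
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("C2 looks disabled: " + c2LooksDisabled());
    }
}
```

An application could use such a check to select its scalar implementation when C2 is not reachable, since the pure Java fallback of the Vector API is slow in that configuration.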

On 24/05/2023 12:47, Quân Anh Mai wrote:
Hi,
On x86 there are 2 vector extension generations, SSE and AVX, and the flags controlling these are UseSSE and UseAVX, respectively. By setting UseAVX to 0, you are emulating an SSE4 machine, which has 16-byte vector registers. A 64-bit VM requires SSE2, so you cannot set a value lower than 2 for UseSSE, and for all such values there are always 16-byte vector registers available. As a result, the minimum preferred vector size on x86_64 is 16 bytes.
Regards,
Quan Anh
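
To connect the register sizes above to the species names printed in this thread, here is a small worked sketch of the arithmetic: the lane count is the vector width in bits divided by the element width in bits. The class LaneCount is a hypothetical name for this example.

```java
// Hypothetical helper for illustration only.
class LaneCount {

    // Number of lanes for a vector of the given byte width and element bit width.
    static int lanes(int vectorBytes, int elementBits) {
        return vectorBytes * Byte.SIZE / elementBits;
    }

    public static void main(String[] args) {
        // Vector widths in bytes, as accepted by -XX:MaxVectorSize in the thread.
        for (int bytes : new int[] {8, 16, 32, 64}) {
            System.out.println(bytes + " bytes = " + (bytes * Byte.SIZE)
                    + " bits = " + lanes(bytes, Integer.SIZE) + " int lanes");
        }
    }
}
```

A 16-byte register therefore holds 4 int lanes, which corresponds to Species[int, 4, S_128_BIT].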
On Wed, 24 May 2023 at 19:39, Chris Hegarty <chegar999 at gmail.com> wrote:
   Hi,
   Over in Lucene-land we're experimenting with the Vector API, and ran
   into an issue when testing against different environments. We're
   effectively simulating different environments with HotSpot's command
   line flags, and see an issue with `-XX:UseAVX=0`. With this option we
   expect the shape of the preferred species to fall back to 64 bits,
   but it is actually 128.
   Trivial reproducer that just prints the preferred int species:
   $ java -version
   openjdk version "21-ea" 2023-09-19
   OpenJDK Runtime Environment (build 21-ea+23-1988)
   OpenJDK 64-Bit Server VM (build 21-ea+23-1988, mixed mode, sharing)
   $ cat PrintPreferredVectorSize.java
   import jdk.incubator.vector.*;
   public class PrintPreferredVectorSize {
         public static void main(String... args) {
             System.out.println(IntVector.SPECIES_PREFERRED);
         }
   }
   // Start with the default - no flags, then downsize with MaxVectorSize.
   // All looks good.
   $ java --add-modules jdk.incubator.vector PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 16, S_512_BIT]
   $ java --add-modules jdk.incubator.vector -XX:MaxVectorSize=64 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 16, S_512_BIT]
   $ java --add-modules jdk.incubator.vector -XX:MaxVectorSize=32 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 8, S_256_BIT]
   $ java --add-modules jdk.incubator.vector -XX:MaxVectorSize=16 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 4, S_128_BIT]
   $ java --add-modules jdk.incubator.vector -XX:MaxVectorSize=8 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 2, S_64_BIT]
   ^^^ this all is as expected ^^^
   Now do similar(ish) with UseAVX
   $ java --add-modules jdk.incubator.vector -XX:UseAVX=3 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 16, S_512_BIT]
   $ java --add-modules jdk.incubator.vector -XX:UseAVX=2 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 8, S_256_BIT]
   $ java --add-modules jdk.incubator.vector -XX:UseAVX=1 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 4, S_128_BIT]
   $ java --add-modules jdk.incubator.vector -XX:UseAVX=0 PrintPreferredVectorSize
   WARNING: Using incubator modules: jdk.incubator.vector
   Species[int, 4, S_128_BIT]
   It is this last one that is surprising to us. We expected a result
   similar to -XX:MaxVectorSize=8, which is Species[int, 2, S_64_BIT].
   Firstly, is this a bug? Is our expectation correct? If not, then I'm
   failing to understand something.
   If I'm not mistaken, then I can see that the
   Matcher::vector_width_in_bytes assumes a minimum of 16 regardless of
   whether UseAVX is < 1. Trivially, and as a hack, I just retrofitted
   vector_width_in_bytes to return 0 if UseAVX < 1. And we get
   Species[int, 2, S_64_BIT].
   The layering of these flags is not straightforward, so I won't pretend
   that my hack is the right way to fix this, but I just wanted to ensure
   that I was looking in the correct general area.
   Thanks,
   -Chris.
