On generalizing vector intrinsics

Fri Nov 17 18:54:35 UTC 2017

On Nov 16, 2017, at 1:17 PM, Vladimir Ivanov <vladimir.x.ivanov at oracle.com> wrote:
> 
> Andrew,
> 
>>> FYI I did a quick experiment with more generic vector intrinsics and
>>> wanted to share the first results. The motivation was to explore
>>> possible reduction in the number of intrinsics needed to support Vector
>>> API in the JVM.
>> Out of interest ... has anyone looked at the suitability of the
>> intrinsics for non-Intel architectures?  Obviously I'm concerned about
>> the possibility of ending up with a bunch of C2 patterns that don't map
>> onto, say, my favourite architecture.
> 
> I haven't heard about any experiments with Vector API on non-x86 architectures, but if C2 knows how to lower vector ideal nodes on your favorite architecture, then Vector API intrinsics "just work".

That's what we are aiming for.  It might seem risky to base the design only
on x86 experiments, but x86 is really 3 or 4 different vector architectures
combined.  So Java's WORA (write once, run anywhere) value proposition is
already valuable just as a tactic for tracking x86 generations.

So the design as written should work for any fixed-size vector architecture.
That *probably* includes SVE, since the runtime sense of vector size can be
folded into the vector-selection factories.

The design is *intended* to work for variable-sized vector architectures *also*,
but so far we have only been thinking about that, not experimenting.

An intermediate step will be a "partial vector with mask" abstraction which
will layer over the AVX (and SVE) vectors in order to support loop edges
(startup and cleanup, pre- and post-loop).  Making sure that works is a step
towards vectors with data-dependent sizes and shapes.
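
To make the "partial vector with mask" idea concrete, here is a rough
plain-Java sketch of how a masked partial vector could cover the cleanup
(post-loop) edge.  It is only an emulation with scalar loops, not the Vector
API itself: the class name, the method names, and the fixed lane count of 8
are all assumptions made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: a fixed lane count plus a mask, emulated with scalar code. */
final class MaskedTailSketch {
    // Assumed lane count; a real implementation would take this from the
    // vector shape selected for the hardware at run time.
    static final int LANES = 8;

    /** Builds a mask with lanes [0, active) set and the remaining lanes clear. */
    static boolean[] prefixMask(int active) {
        boolean[] mask = new boolean[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            mask[lane] = lane < active;
        }
        return mask;
    }

    /** One "vector" step: adds lanes whose mask bit is set; clear lanes touch no memory. */
    static void maskedAdd(float[] a, float[] b, float[] c, int offset, boolean[] mask) {
        for (int lane = 0; lane < LANES; lane++) {
            if (mask[lane]) {
                c[offset + lane] = a[offset + lane] + b[offset + lane];
            }
        }
    }

    /** Element-wise a + b: full-mask steps for the main loop, one partial step for the tail. */
    static float[] add(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        float[] c = new float[a.length];
        boolean[] full = prefixMask(LANES);
        int bound = a.length - (a.length % LANES);  // largest multiple of LANES <= length
        int i = 0;
        for (; i < bound; i += LANES) {
            maskedAdd(a, b, c, i, full);            // main loop: all lanes active
        }
        if (i < a.length) {
            // cleanup edge: same operation, but only the remaining lanes are active
            maskedAdd(a, b, c, i, prefixMask(a.length - i));
        }
        return c;
    }

    public static void main(String[] args) {
        float[] a = new float[11];
        float[] b = new float[11];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        System.out.println(Arrays.toString(add(a, b)));
    }
}
```

The point of the sketch is that the main loop and the tail share one code
shape and differ only in the mask, which is what would let the same
abstraction layer over both AVX-style and SVE-style vectors.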

— John