[aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend support

Thu Aug 27 12:54:23 UTC 2020

Hi Andrew,

> So, if I can summarize (please correct me if I misunderstand):
> 
>    You are as concerned about existing complexity in vector handling as
> much as complexity added by this patch, whether the latter is to AArch64
> code or shared code.
> 
>    The goal you would like to achieve is a single set of rules for a
> single kind of vector register whose size is parameterized, the
> appropriate value being derived from each specific vector operation.
> 
>    Your main concern about this patch is that it adds yet another
> additional vector kind to the current 'wrong' multi-kind vector model
> and, what is worse, one with a different behaviour, taking us further
> from your desired goal.

Yes, correct.

>    Your other concern is that this design does not allow for the AArch64
> ISA predication or, indeed, for what you treat uniformly as the
> 'implicit' predication imposed on a 'logical' max vector size (2048
> bits) by the specific AVX/SVE/NEON hardware vector size.

No, I'm not concerned about that. I mentioned SVE implicit predication 
to illustrate that there's a higher-level abstraction in the JVM above 
ISA level which hides some of the functionality ISA exposes. And I'm 
perfectly fine with that.

>> But you should definitely prefer 1-slot design for vector registers then
>> ;-)
> 
> Indeed I do :-]
> 
> So, let me respond to the above summary points, assuming I have them
> down right.
> 
> I agree that your end goal is highly desirable. However, we are not
> there yet and since your attempts to do so have not succeeded so far I
> don't think that means we are compelled to drop the current patch. As
> you say this could (and, if it is adopted, should) be regarded as a
> useful stop-gap until we come up with a unified, parameterized vector
> implementation that makes it redundant.

Unfortunately, there was simply not enough motivation on x86 (and hence 
resources spent) to address it there. Vector API support for x86 
stretched the implementation in a different direction: combinatorial 
explosion of AD instructions needed to cover all useful cases. It 
required switching to full-width vectors in x86.ad file which left RA 
concerns waiting next opportunity.

> That said, I'm not pushing hard to keep the patch if the consequence is
> generating significant work later to undo it. The number of users who
> might benefit from using SVE vectors from Java now or in the near future
> does not look like it is going to be very large (if you are not making a
> lot of use of SVE registers then that is a lot of wasted silicon and I
> suspect it's going to be the rare case that someone codes an app in Java
> that needs to make continuous use of SVE -- mind you, by the same token
> I guess that also applies for AVX on Intel).

I don't consider RA part of the patch as the show-stopper issue for 
initial SVE support. As I said to Ningsheng, I'm fine with the patch as 
it is now if we agree it's a stop-the-gap solution and there's a 
commitment to invest into the proper support.

I initially put options #1/#2 (which don't require any changes in RA 
shared code) as possible alternatives way to temporarily address the 
problem. Both require additional simplifying assumptions and hence I 
didn't insist they should be chosen.

> I'm not sure pushing this now will add a lot more work later. It seems
> to me that this code is actually moving in the right direction for the
> sort of solution you want. The AArch64 VecA register /is/
> size-parameterized, albeit by a size fixed at startup rather than per
> operation. So, that's one reason why I don't know if this implies a lot
> more rework to move towards your desired goal. Surely, if we do arrive
> at a unifying vector model that can replace the existing multi-kind
> vectors then it ought to be able to subsume this code - unless of course
> it replaces it wholesale.
> 
> Are you concerned that adding this patch will result in more cases to
> pick through and correct?
> 
> Are you worried that we might have to withdraw some of the support this
> patch enables to arrive at the final goal?
> 
> Also, Ningsheng and his colleagues have laid some foundations for
> implementing predicated operations with this patch and have that work in
> the pipeline. Once again this is moving towards the desired goal even if
> it might end up doign so in a slightly sideways fashion. Perhaps we
> could continue this stop-gap experiment as an experimental option in
> order to learn from the experience?

I definitely don't want to hinder/block the impressive work Ningsheng 
and others at Arm are doing for SVE support.

Frankly speaking, my main concern is that the implementation can stay 
that way forever ;-) That's why I'm trying to get enough ground covered 
in the discussion and some agreements/commitments to be made before it 
is integrated.

I don't have any strong objections to the patch which could justify 
blocking its integration, but on a higher-level I do voice my concerns 
about where it pushes the implementation longer-term.

Unfortunately, as it is shaped now, I don't see how x86 can benefit from 
it. So, I'm afraid this particular route with vecA and _is_scalable bit 
will stay purely AArch64-specific exercise.

Leaving RA part aside, I have one suggestion which should help in the 
future: let's try to consistently follow full-width vector abstraction. 
In AD file, vecA operand is way too similar to vecX et al which makes a 
wrong impression it's yet another vector flavor. So, choosing a better 
name will help when representation changes. For example, x86 moved away 
from vecX/... operands to a single generic one (called "vec") and you 
can take a loot at x86.ad to see the result.

Best regards,
Vladimir Ivanov