[aarch64-port-dev ] RFR(L): 8231441: AArch64: Initial SVE backend support

Fri Aug 28 05:56:56 UTC 2020

Hi Vladimir,

Thanks a lot for helping clarifying your concerns which will benefit 
future direction.

On 8/27/20 8:54 PM, Vladimir Ivanov wrote:
> Hi Andrew,
> 
>> So, if I can summarize (please correct me if I misunderstand):
>>
>>     You are as concerned about existing complexity in vector handling as
>> much as complexity added by this patch, whether the latter is to AArch64
>> code or shared code.
>>
>>     The goal you would like to achieve is a single set of rules for a
>> single kind of vector register whose size is parameterized, the
>> appropriate value being derived from each specific vector operation.
>>
>>     Your main concern about this patch is that it adds yet another
>> additional vector kind to the current 'wrong' multi-kind vector model
>> and, what is worse, one with a different behaviour, taking us further
>> from your desired goal.
> 
> Yes, correct.
> 
>>     Your other concern is that this design does not allow for the AArch64
>> ISA predication or, indeed, for what you treat uniformly as the
>> 'implicit' predication imposed on a 'logical' max vector size (2048
>> bits) by the specific AVX/SVE/NEON hardware vector size.
> 
> No, I'm not concerned about that. I mentioned SVE implicit predication
> to illustrate that there's a higher-level abstraction in the JVM above
> ISA level which hides some of the functionality ISA exposes. And I'm
> perfectly fine with that.
> 
>>> But you should definitely prefer 1-slot design for vector registers then
>>> ;-)
>>
>> Indeed I do :-]
>>
>> So, let me respond to the above summary points, assuming I have them
>> down right.
>>
>> I agree that your end goal is highly desirable. However, we are not
>> there yet and since your attempts to do so have not succeeded so far I
>> don't think that means we are compelled to drop the current patch. As
>> you say this could (and, if it is adopted, should) be regarded as a
>> useful stop-gap until we come up with a unified, parameterized vector
>> implementation that makes it redundant.
> 
> Unfortunately, there was simply not enough motivation on x86 (and hence
> resources spent) to address it there. Vector API support for x86
> stretched the implementation in a different direction: combinatorial
> explosion of AD instructions needed to cover all useful cases. It
> required switching to full-width vectors in x86.ad file which left RA
> concerns waiting next opportunity.
> 
>> That said, I'm not pushing hard to keep the patch if the consequence is
>> generating significant work later to undo it. The number of users who
>> might benefit from using SVE vectors from Java now or in the near future
>> does not look like it is going to be very large (if you are not making a
>> lot of use of SVE registers then that is a lot of wasted silicon and I
>> suspect it's going to be the rare case that someone codes an app in Java
>> that needs to make continuous use of SVE -- mind you, by the same token
>> I guess that also applies for AVX on Intel).
> 
> I don't consider RA part of the patch as the show-stopper issue for
> initial SVE support. As I said to Ningsheng, I'm fine with the patch as
> it is now if we agree it's a stop-the-gap solution and there's a
> commitment to invest into the proper support.
> 
> I initially put options #1/#2 (which don't require any changes in RA
> shared code) as possible alternatives way to temporarily address the
> problem. Both require additional simplifying assumptions and hence I
> didn't insist they should be chosen.
> 
>> I'm not sure pushing this now will add a lot more work later. It seems
>> to me that this code is actually moving in the right direction for the
>> sort of solution you want. The AArch64 VecA register /is/
>> size-parameterized, albeit by a size fixed at startup rather than per
>> operation. So, that's one reason why I don't know if this implies a lot
>> more rework to move towards your desired goal. Surely, if we do arrive
>> at a unifying vector model that can replace the existing multi-kind
>> vectors then it ought to be able to subsume this code - unless of course
>> it replaces it wholesale.
>>
>> Are you concerned that adding this patch will result in more cases to
>> pick through and correct?
>>
>> Are you worried that we might have to withdraw some of the support this
>> patch enables to arrive at the final goal?
>>
>> Also, Ningsheng and his colleagues have laid some foundations for
>> implementing predicated operations with this patch and have that work in
>> the pipeline. Once again this is moving towards the desired goal even if
>> it might end up doign so in a slightly sideways fashion. Perhaps we
>> could continue this stop-gap experiment as an experimental option in
>> order to learn from the experience?
> 
> I definitely don't want to hinder/block the impressive work Ningsheng
> and others at Arm are doing for SVE support.
> 
> Frankly speaking, my main concern is that the implementation can stay
> that way forever ;-) That's why I'm trying to get enough ground covered
> in the discussion and some agreements/commitments to be made before it
> is integrated.
> 
> I don't have any strong objections to the patch which could justify
> blocking its integration, but on a higher-level I do voice my concerns
> about where it pushes the implementation longer-term.
> 
> Unfortunately, as it is shaped now, I don't see how x86 can benefit from
> it. So, I'm afraid this particular route with vecA and _is_scalable bit
> will stay purely AArch64-specific exercise.
> 
> Leaving RA part aside, I have one suggestion which should help in the
> future: let's try to consistently follow full-width vector abstraction.
> In AD file, vecA operand is way too similar to vecX et al which makes a
> wrong impression it's yet another vector flavor. So, choosing a better
> name will help when representation changes. For example, x86 moved away
> from vecX/... operands to a single generic one (called "vec") and you
> can take a loot at x86.ad to see the result.
> 

Thanks for the suggestion. In current implementation vecA does not 
include vecD/vecX for NEON - so actually it's regarded as another vector 
flavor. We try to keep the SVE implementation separated from original 
NEON code (and a new ad file is also introduced), to make the code 
better maintainable and reviewable. What do you think about this naming, 
Andrew?

Thanks,
Ningsheng