[15] RFR (L): 8235824: C2: Merge AD instructions for AddReductionV and MulReductionV nodes
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Fri Dec 13 08:38:25 UTC 2019
Thanks for the reviews, Vladimir & John.
> As I mentioned in the 8235756 thread, a good way to factor the
> implementation of (associative) reductions would be to reformulate
> them as the repeated composition of 2N-to-N-lane reductions.
>
> For non-associative reductions (floating point), the 2N-to-N
> pattern is acceptable, *if* the reduction is specified to happen in
> that order. To get that permission into the contract will require
> a distinction between reduceSequential and reduceParallel
> operations in the Vector API.
>
> That sequence of 16 vaddss operations is certainly an eyesore,
> but it’s not clear how to improve on it, algorithmically. Perhaps
> it could be factored into a sequential accumulation operation,
> to be repeated N times instead of lg N times.
Yes, I agree that reduction nodes look too high-level for matching
purposes: having a node per reduction step is much more suitable (at
least, on x86). I think if the IR is shaped that way (nested reduction
steps which reduce a vector to a scalar), there's a way to introduce a
single shared IR node which represents a reduction step (2N => N) across
all vector shapes.
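
To make the 2N => N idea concrete, here is a minimal sketch in plain Java.
It is not HotSpot IR or AD code, and the names ReductionSteps, halve,
reduceParallel and reduceSequential are made up for illustration (the
last two only borrow the naming John suggested; they are not existing
Vector API methods). It assumes a power-of-two lane count. The single
halve() step plays the role of the shared reduction-step node, and the
float input shows why the two orders have to be distinguished in the
contract.

```java
class ReductionSteps {
    // One 2N => N reduction step: add the upper half of the lanes onto the lower half.
    static float[] halve(float[] v) {
        int n = v.length / 2;
        float[] r = new float[n];
        for (int i = 0; i < n; i++) {
            r[i] = v[i] + v[i + n];
        }
        return r;
    }

    // Repeated composition of the same step, lg N times.
    // Assumption: v.length is a power of two and at least 1.
    static float reduceParallel(float[] v) {
        while (v.length > 1) {
            v = halve(v);
        }
        return v[0];
    }

    // Strict left-to-right accumulation, N scalar adds.
    static float reduceSequential(float[] v) {
        float acc = 0.0f;
        for (float x : v) {
            acc += x;
        }
        return acc;
    }

    public static void main(String[] args) {
        // 1e8f + 1f rounds back to 1e8f in float, so the order of adds matters.
        float[] v = {1e8f, 1f, -1e8f, 1f};
        System.out.println(reduceParallel(v));   // prints 2.0
        System.out.println(reduceSequential(v)); // prints 1.0
    }
}
```

For int lanes both orders give the same answer, so the halving form is
always legal there; for float and double it is only legal if the spec
permits that order.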
Best regards,
Vladimir Ivanov
More information about the hotspot-compiler-dev mailing list