[15] RFR (L): 8235824: C2: Merge AD instructions for AddReductionV and MulReductionV nodes
Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Dec 13 08:38:25 UTC 2019

Thanks for the reviews, Vladimir & John.
> As I mentioned in the 8235756 thread, a good way to factor the
> implementation of (associative) reductions would be to reformulate
> them as the repeated composition of 2N-to-N-lane reductions.
> 
> For non-associative reductions (floating point), the 2N-to-N
> pattern is acceptable, *if* the reduction is specified to happen in
> that order.  To get that permission into the contract will require
> a distinction between reduceSequential and reduceParallel
> operations in the Vector API.
> 
> That sequence of 16 vaddss operations is certainly an eyesore,
> but it’s not clear how to improve on it, algorithmically.  Perhaps
> it could be factored into a sequential accumulation operation,
> to be repeated N times instead of lg N times.
Yes, I agree that reduction nodes look too high-level for matching
purposes: having a node per reduction step would be much more suitable
(at least on x86). If the IR is shaped that way (nested reduction steps
which together reduce a vector to a scalar), I think there's a way to
introduce a single shared IR node which represents one reduction step
(2N => N) across all vector shapes.
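
To illustrate the 2N => N idea and the ordering caveat for floating
point, here is a minimal scalar sketch in plain Java. It is not C2 or
Vector API code: the class ReductionSketch, the FloatOp interface, and
the methods reduceStep, reduceParallel and reduceSequential are
hypothetical names made up for this example, and the lane pairing
(lane i with lane i + N) is just one possible choice.

```java
public class ReductionSketch {
    /** Hypothetical lane-wise binary operation (not a JDK or HotSpot type). */
    public interface FloatOp { float apply(float a, float b); }

    /** One 2N => N step: result lane i combines input lanes i and i + N. */
    public static float[] reduceStep(float[] lanes, FloatOp op) {
        int n = lanes.length / 2;
        float[] out = new float[n];
        for (int i = 0; i < n; i++) {
            out[i] = op.apply(lanes[i], lanes[i + n]);
        }
        return out;
    }

    /** lg N nested steps; assumes lanes.length is a power of two. */
    public static float reduceParallel(float[] lanes, FloatOp op) {
        float[] cur = lanes;
        while (cur.length > 1) {
            cur = reduceStep(cur, op);
        }
        return cur[0];
    }

    /** Strict left-to-right accumulation, one lane at a time. */
    public static float reduceSequential(float[] lanes, FloatOp op) {
        float acc = lanes[0];
        for (int i = 1; i < lanes.length; i++) {
            acc = op.apply(acc, lanes[i]);
        }
        return acc;
    }

    public static void main(String[] args) {
        // 1e8f + 1f rounds back to 1e8f in float, so the order matters.
        float[] v = {1e8f, 1f, -1e8f, 1f};
        System.out.println(reduceSequential(v, Float::sum)); // prints 1.0
        System.out.println(reduceParallel(v, Float::sum));   // prints 2.0
    }
}
```

For an associative operation (e.g. integer add) both orders give the
same answer, so the nested 2N => N form is always safe; for float add
the two results above differ, which is why the order has to be part of
the contract.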
Best regards,
Vladimir Ivanov
    
    
More information about the hotspot-compiler-dev
mailing list