RFR: 8257806: Optimize x86 allTrue and anyTrue vector mask operations of Vector API
Paul Sandoz
psandoz at openjdk.java.net
Mon Dec 7 18:58:11 UTC 2020
On Mon, 7 Dec 2020 02:09:56 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
> The allTrue and anyTrue operations are implemented using the ptest/vptest instruction.
> Two optimizations are possible:
>
> 1) The minimum operand size of the ptest instruction is 128 bits.
> Operations on vectors smaller than 128 bits can be implemented by first broadcasting (duplicating) the input to 128 bits.
> The two inputs to these operations are:
> a) Vector mask being tested
> b) All ones
> For the allTrue operation, both inputs need to be broadcast.
> For the anyTrue operation, only the first input (the vector mask) needs to be broadcast.
>
> 2) The anyTrue operation followed by comparison with zero can use the zero flag generated by ptest/vptest directly.
Verified locally that the generated code looks good. This PR likely needs to be rebased to ensure correct test execution, because recent configuration changes have left the open and internal configs out of sync.
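
For readers following the quoted description, here is a minimal scalar sketch in Java of the flag logic it relies on. It is not the HotSpot code from the PR. It only models the documented ptest flags (ZF is set when dst AND src is zero, CF is set when NOT dst AND src is zero) on a 128-bit register held as two longs. The class and method names (PtestModel, broadcast64, allTrue64, anyTrue64) and the assumption that the upper 64 bits of a sub-128-bit operand hold stale data are illustrative only.

```java
// Scalar model of ptest-based mask tests for a 64-bit mask.
// Hypothetical names; an illustration, not the PR's implementation.
public final class PtestModel {
    // A 128-bit register is modelled as long[2]: index 0 = low lane, 1 = high lane.

    // ZF after "ptest dst, src": set when (dst AND src) == 0.
    public static boolean zf(long[] dst, long[] src) {
        return ((dst[0] & src[0]) | (dst[1] & src[1])) == 0;
    }

    // CF after "ptest dst, src": set when ((NOT dst) AND src) == 0.
    public static boolean cf(long[] dst, long[] src) {
        return ((~dst[0] & src[0]) | (~dst[1] & src[1])) == 0;
    }

    // Duplicate the low 64 bits into both lanes, discarding the high lane.
    public static long[] broadcast64(long[] r) {
        return new long[] { r[0], r[0] };
    }

    // allTrue: both inputs are broadcast, then CF is read.
    public static boolean allTrue64(long[] mask, long[] ones) {
        return cf(broadcast64(mask), broadcast64(ones));
    }

    // anyTrue: only the mask is broadcast, then "not ZF" is read.
    public static boolean anyTrue64(long[] mask, long[] ones) {
        return !zf(broadcast64(mask), ones);
    }

    public static void main(String[] args) {
        long junk = 0x123456789ABCDEF0L;  // assumed stale upper-lane contents
        long[] ones = { -1L, junk };      // all-ones in the low 64 bits only
        long[] full = { -1L, junk };
        long[] partial = { 0xFFL, junk };
        long[] empty = { 0L, junk };

        System.out.println("allTrue(full)    = " + allTrue64(full, ones));
        System.out.println("allTrue(partial) = " + allTrue64(partial, ones));
        System.out.println("anyTrue(partial) = " + anyTrue64(partial, ones));
        System.out.println("anyTrue(empty)   = " + anyTrue64(empty, ones));
        // Without the broadcast, the stale upper lane leaks into the result:
        System.out.println("anyTrue(empty), no broadcast = " + !zf(empty, ones));
    }
}
```

In this model the result of anyTrue64 is simply the negation of ZF, which is why a following comparison with zero can branch on the flag directly rather than materializing a boolean first.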
-------------
PR: https://git.openjdk.java.net/jdk/pull/1656