RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v3]
Paul Sandoz
psandoz at openjdk.java.net
Wed May 11 15:07:57 UTC 2022
On Wed, 11 May 2022 03:23:13 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> I modified the code of this PR to avoid the conversion of `boolean` to `int`, so a constant integer value is passed all the way through, and the masked load is made intrinsic from the method at which the constants are passed as arguments i.e. the public `fromArray` mask accepting method.
>
> Hi @PaulSandoz , thanks for the patch for the constant int parameter. I think the main change is:
>
> - ByteVector fromArray0Template(Class<M> maskClass, C base, long offset, int index, M m, boolean offsetInRange,
> + ByteVector fromArray0Template(Class<M> maskClass, C base, long offset, int index, M m, int offsetInRange,
> VectorSupport.LoadVectorMaskedOperation<C, ByteVector, ByteSpecies, M> defaultImpl) {
> m.check(species());
> ByteSpecies vsp = vspecies();
> - if (offsetInRange) {
> - return VectorSupport.loadMasked(
> - vsp.vectorType(), maskClass, vsp.elementType(), vsp.laneCount(),
> - base, offset, m, /* offsetInRange */ 1,
> - base, index, vsp, defaultImpl);
> - } else {
> - return VectorSupport.loadMasked(
> - vsp.vectorType(), maskClass, vsp.elementType(), vsp.laneCount(),
> - base, offset, m, /* offsetInRange */ 0,
> - base, index, vsp, defaultImpl);
> - }
> + return VectorSupport.loadMasked(
> + vsp.vectorType(), maskClass, vsp.elementType(), vsp.laneCount(),
> + base, offset, m, offsetInRange == 1 ? 1 : 0,
> + base, index, vsp, defaultImpl);
> }
>
> which uses `offsetInRange == 1 ? 1 : 0`. Unfortunately this could not always make sure the `offsetInRange` a constant a the compiler time. Again, this change could also make the assertion fail randomly:
>
> --- a/src/hotspot/share/opto/vectorIntrinsics.cpp
> +++ b/src/hotspot/share/opto/vectorIntrinsics.cpp
> @@ -1236,6 +1236,7 @@ bool LibraryCallKit::inline_vector_mem_masked_operation(bool is_store) {
> } else {
> // Masked vector load with IOOBE always uses the predicated load.
> const TypeInt* offset_in_range = gvn().type(argument(8))->isa_int();
> + assert(offset_in_range->is_con(), "must be a constant");
> if (!offset_in_range->is_con()) {
> if (C->print_intrinsics()) {
> tty->print_cr(" ** missing constant: offsetInRange=%s",
>
> Sometimes, the compiler can parse it a constant. I think this depends on the compiler OSR and speculative optimization. Did you try an example with IOOBE on a non predicated hardware?
>
> Here is the main code of my unittest to reproduce the issue:
>
> static final VectorSpecies<Integer> I_SPECIES = IntVector.SPECIES_128;
> static final int LENGTH = 1026;
> public static int[] ia;
> public static int[] ib;
>
> private static void init() {
> for (int i = 0; i < LENGTH; i++) {
> ia[i] = i;
> ib[i] = 0;
> }
>
> for (int i = 0; i < 2; i++) {
> m[i] = i % 2 == 0;
> }
> }
>
> private static void func() {
> VectorMask<Integer> mask = VectorMask.fromArray(I_SPECIES, m, 0);
> for (int i = 0; i < LENGTH; i += vl) {
> IntVector av = IntVector.fromArray(I_SPECIES, ia, i, mask);
> av.lanewise(VectorOperators.ABS).intoArray(ic, i, mask);
> }
> }
>
> public static void main(String[] args) {
> init();
> for (int i = 0; i < 10000; i++) {
> func();
> }
> }
@XiaohongGong Doh! The ternary was an experiment, and I forgot to re-run the code gen script before sending your the patch. See `IntVector`, which does not have that. I presume when the offset is not in range and the other code path is taken then it might be problematic unless all code paths are inlined. I will experiment further with tests.
-------------
PR: https://git.openjdk.java.net/jdk/pull/8035
More information about the hotspot-compiler-dev
mailing list