[vector] Mask as a loop variable

Paul Sandoz paul.sandoz at oracle.com
Wed Mar 28 16:41:41 UTC 2018


Here is a preliminary experiment that uses a mask as a loop variable. I’ll send a more polished patch later, but I wanted to share what I have now:

  http://cr.openjdk.java.net/~psandoz/panama/mask-bounds/webrev/

And here is an example:

static <S extends Vector.Shape> int mismatch(byte[] a, byte[] b, ByteVector.ByteSpecies<S> species) {
    int length = Math.min(a.length, b.length);
    if (a == b)
        return -1;

    Vector.Mask<Byte, S> loadMask = species.maskFromBounds(0, length);
    for (int i = 0;
         i < length;
         i += species.length(), loadMask = species.maskFromBounds(i, length)) {
        Vector<Byte, S> va = species.fromArray(a, i, loadMask);
        Vector<Byte, S> vb = species.fromArray(b, i, loadMask);
        Vector.Mask<Byte, S> mneq = va.notEqual(vb); // requires mask-accepting versions
        if (mneq.anyTrue()) {
            return i + mneq.leadingFalseCount(); // new method, as shown in the prior email
        }
    }

    return (a.length != b.length) ? length : -1;
}


I believe that it's possible to support this kind of pattern even when mask registers are not supported, by peeling the tail of the loop. The main part of the loop then either need not use masks at all, or can use a constant mask of all true values. As long as the mask loop variable does not escape and can be tracked, it should be possible to optimize.
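As a rough illustration of the peeled form described above, here is a scalar model (not the Vector API itself). The class name PeeledMismatch, the constant LANES (standing in for species.length(), assumed to be a power of two), and the helpers firstDiffFull and firstDiffMasked are all hypothetical; each inner per-lane loop stands in for a single unmasked or masked vector compare.

```java
final class PeeledMismatch {
    // Hypothetical stand-in for species.length(); assumed to be a power of two.
    static final int LANES = 8;

    // Models an unmasked compare of one full block of LANES elements starting at i.
    // Returns the first differing lane, or -1 if the whole block matches.
    static int firstDiffFull(byte[] a, byte[] b, int i) {
        for (int lane = 0; lane < LANES; lane++) {
            if (a[i + lane] != b[i + lane]) return lane;
        }
        return -1;
    }

    // Models a masked compare: only lanes with i + lane < length are active.
    static int firstDiffMasked(byte[] a, byte[] b, int i, int length) {
        for (int lane = 0; lane < LANES && i + lane < length; lane++) {
            if (a[i + lane] != b[i + lane]) return lane;
        }
        return -1;
    }

    static int mismatch(byte[] a, byte[] b) {
        int length = Math.min(a.length, b.length);
        if (a == b) return -1;

        // Upper bound rounded down to a multiple of LANES.
        int bound = length & ~(LANES - 1);

        // Main loop: every block is full, so no mask is needed.
        int i = 0;
        for (; i < bound; i += LANES) {
            int lane = firstDiffFull(a, b, i);
            if (lane >= 0) return i + lane;
        }

        // Peeled tail: at most one partial block, handled with a bounds mask.
        if (i < length) {
            int lane = firstDiffMasked(a, b, i, length);
            if (lane >= 0) return i + lane;
        }

        return (a.length != b.length) ? length : -1;
    }
}
```

The intent is that a compiler could derive this shape from the single masked loop, since the mask is all true whenever i + LANES <= length.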


Some observations:

- supporting mask-accepting methods by using blend does not work for cases where there are side effects, such as loading or storing vectors where the indexes associated with masked-off lanes are out of bounds. For these cases I think we need an alternative short-term optimization strategy. I am unsure whether this also applies to division by zero for floating-point vectors.

- do the comparison functions, such as notEqual, need mask-accepting variants?

- we need a method on species to clip an upper bound, e.g. species.clip(int l), which (assuming species.length() is a power of two) is equivalent to
    (l & ~(species.length() - 1))

- I need to review the hash code method to see what is applicable. This is a little trickier because of the casting to ints, and because the polynomial changes when processing the tail.
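To make the first observation above concrete, here is a scalar model of why emulating a masked load with blend can fail at the end of an array. All names (BlendLoadSketch, LANES, loadThenBlend, maskedLoad) are hypothetical, and plain byte[] and boolean[] values stand in for vectors and masks; in this model the out-of-bounds access surfaces as an ArrayIndexOutOfBoundsException.

```java
final class BlendLoadSketch {
    // Hypothetical lane count standing in for species.length().
    static final int LANES = 4;

    // Emulates a mask-accepting load with blend: load every lane
    // unconditionally, then zero the masked-off lanes.
    // The unconditional load touches a[i + lane] even for masked-off lanes.
    static byte[] loadThenBlend(byte[] a, int i, boolean[] mask) {
        byte[] v = new byte[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            v[lane] = a[i + lane]; // may be out of bounds near the end of a
        }
        for (int lane = 0; lane < LANES; lane++) {
            if (!mask[lane]) v[lane] = 0;
        }
        return v;
    }

    // A true masked load: masked-off lanes are never read and stay zero.
    static byte[] maskedLoad(byte[] a, int i, boolean[] mask) {
        byte[] v = new byte[LANES];
        for (int lane = 0; lane < LANES; lane++) {
            if (mask[lane]) v[lane] = a[i + lane];
        }
        return v;
    }
}
```

With a six-element array, offset 4 and only the first two lanes active, maskedLoad succeeds while loadThenBlend reads past the end of the array. The two agree whenever the whole block is in bounds.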

Paul.

