RFR: 8293198: [vectorapi] Improve the implementation of VectorMask.indexInRange()

Xiaohong Gong xgong at openjdk.org
Thu Feb 2 02:53:22 UTC 2023


On Thu, 2 Feb 2023 01:59:09 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> Your patch re-organized the code to take care of following cases :-
>> 1.   offset < 0                         : Still falling over to old code.
>> 2.   offset >= limit                  : Special case optimized 
>> 3.   (limit - offset) >= length  :  Most general case.
>> 4.   Partial in range check       :  New intrinsic, existing java side implementation in checkIndex0 is also composed of intrinsified calls,  infrequently taken paths will anyways lead to uncommon traps.
>> 
>> Have you tried introducing just case 3 (first) and then case 2 in existing indexInRange implementation.
>
> Thanks for the review @jatin-bhateja !
> 
>> Have you tried introducing just case 3 (first) and then case 2 in existing indexInRange implementation.
> 
> Yes, I tried with this way as well, together with the performance testing compared with current version. The result is:
> 1. For cases that the array size is aligned with the vector size, the performance is almost the same with each other.
> 2. For cases that the array size is not aligned (which needs the masked operation), the performance with current version can improve about 5%~6% compared with the way as you suggested. But the improvement is limited to the cases with smaller vector size, whose tail loop size is bigger.
> 
> So, adding the new intrinsic has no side effect for me, except for the more complex java side code and one additional intrinsic.

> Can you move this else if check at the top, this is the most general case appearing in the loop and hence two extra uncommon trap jumps before it for special cases may penalize this.

The generated code does not have any difference by moving the most general case at the top. It still will generate the uncommon trap jumps before it. Besides, moving it at the top may make the code in java side harder to understand. The code may seem like:

public VectorMask<E> indexInRange(int offset, int limit) {
        if (offset > 0 && limit - offset >= length()) {
            return this;
        } else if (offset > 0 && offset < limit) {
            return this.and(indexInRange0(offset, limit));
        } else if (offset >= limit) {
            return vectorSpecies().maskAll(false);
        } else {
           return this.and(indexInRange0Helper(offset, limit));
        }
}
``` 

But as what you suggested above, if we remove the new intrinsic and keep using the "checkIndex0" to handle the tail loop, the code may be easier, such as:

public VectorMask<E> indexInRange(int offset, int limit) {
        if (offset > 0 && limit - offset >= length()) {
            return this;
        } else if (offset >= limit) {
            return vectorSpecies().maskAll(false);
        } else {
           return this.and(indexInRange0Helper(offset, limit));
        }
}

-------------

PR: https://git.openjdk.org/jdk/pull/12064


More information about the hotspot-compiler-dev mailing list