VectorMask.firstTrue()/lastTrue()
forax at univ-mlv.fr
Thu Apr 15 11:33:42 UTC 2021
----- Original Message -----
> From: "Vladimir Ivanov" <vladimir.x.ivanov at oracle.com>
> To: "Paul Sandoz" <paul.sandoz at oracle.com>, "Remi Forax" <forax at univ-mlv.fr>
> Cc: "panama-dev at openjdk.java.net" <panama-dev at openjdk.java.net>
> Sent: Thursday, April 15, 2021 13:21:04
> Subject: Re: VectorMask.firstTrue()/lastTrue()
>> I would like to add intrinsic operations for counting leading and trailing zeros
>> (in addition to popcount), then use those operations for implementing
>> first/lastTrue.
>>
>> firstTrue == count-leading-zeros
>>
>> lastTrue == length - count-trailing-zeros - 1
>>
>> The above should help answer your question about the return value. But it does
>> make me wonder if the result for "not found" is non-obvious and we should at
>> least clarify in the Javadoc. The method names likely do not help in this
>> regard.
>
> In order to rely on leading/trailing zeroes, you have to convert the
> mask to canonical form (one bit per element) first.
Yes, using something like PEXT.
>
> There's already VectorMask.toLong(), which has to be intrinsified
> first. Then you can rely on the existing intrinsics
> Long.numberOfLeadingZeros()/numberOfTrailingZeros().
>
> (Also, VectorMask.toLong() is not SVE-friendly, since a predicate
> register there can hold a value up to 256 bits in size.)
And PEXT is microcoded on all AMD CPUs before Zen 3 :(
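
To make the toLong() route concrete, here is a minimal sketch in plain Java. It assumes the mask has already been packed into a long with lane 0 in bit 0, which is the layout toLong() is documented to produce; the class MaskBits and its two methods are hypothetical names for illustration, not part of the Vector API. With that layout the first set lane is found with numberOfTrailingZeros and the last one with numberOfLeadingZeros.

```java
public final class MaskBits {
    // 'bits' stands in for the result of VectorMask.toLong() (lane 0 in bit 0);
    // 'length' is the number of lanes and is assumed to be at most 64.
    public static int firstTrue(long bits, int length) {
        // numberOfTrailingZeros returns 64 for 0, so clamp to the lane count.
        return Math.min(Long.numberOfTrailingZeros(bits), length);
    }

    public static int lastTrue(long bits) {
        // numberOfLeadingZeros returns 64 for 0, which yields -1 here.
        return 63 - Long.numberOfLeadingZeros(bits);
    }

    public static void main(String[] args) {
        long bits = 0b0010_0100L; // lanes 2 and 5 set
        System.out.println(firstTrue(bits, 8)); // prints 2
        System.out.println(lastTrue(bits));     // prints 5
        System.out.println(firstTrue(0L, 8));   // prints 8 (the lane count)
        System.out.println(lastTrue(0L));       // prints -1
    }
}
```

In this sketch the "not found" results fall out of the bit counts: the lane count for firstTrue and -1 for lastTrue.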
>
> Best regards,
> Vladimir Ivanov
Rémi
>
>>> On Apr 14, 2021, at 5:35 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>>>
>>> There is a way to implement firstTrue and lastTrue using only vector operations.
>>>
>>> The idea is to mask iota with the mask and then do a reduceLanes with MIN
>>> or MAX.
>>> For example, for lastTrue(), which returns -1 if the mask is empty,
>>> first create iotaPlusOne, blend it over zero, then reduceLanes with MAX, then
>>> subtract 1. (For firstTrue() with MIN, the background must be a large value
>>> rather than zero, otherwise any unset lane wins the reduction.)
>>>
>>> VectorMask<Byte> mask = ...
>>> var iotaPlusOne = ByteVector.broadcast(SPECIES, 1).addIndex(1);
>>> int lastTrue = ByteVector.zero(SPECIES).blend(iotaPlusOne, mask)
>>>     .reduceLanes(VectorOperators.MAX) - 1;
>>>
>>> BTW, I wonder why firstTrue returns the mask length instead of -1.
>>>
>>> regards,
>>> Rémi
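
For readers who want to check the blend-then-reduce idea without the incubator module, here is a scalar model using plain arrays in place of vectors. It is only a sketch: the class MaskReduce and its methods are hypothetical, and each loop stands in for a blend followed by reduceLanes. The choice of background value matters: zero works with MAX, while MIN needs a background larger than any lane index.

```java
public final class MaskReduce {
    // Scalar stand-in for blending iota over a MAX_VALUE background
    // and reducing with MIN; an empty mask is mapped to -1 at the end.
    public static int firstTrue(boolean[] mask) {
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < mask.length; i++) {
            int lane = mask[i] ? i : Integer.MAX_VALUE; // blend
            min = Math.min(min, lane);                  // reduce with MIN
        }
        return min == Integer.MAX_VALUE ? -1 : min;
    }

    // Scalar stand-in for blending iotaPlusOne over zero,
    // reducing with MAX, then subtracting 1.
    public static int lastTrue(boolean[] mask) {
        int max = 0;
        for (int i = 0; i < mask.length; i++) {
            int lane = mask[i] ? i + 1 : 0; // blend
            max = Math.max(max, lane);      // reduce with MAX
        }
        return max - 1; // 0 - 1 == -1 when no lane is set
    }
}
```

Both methods return -1 for an empty mask, which is the behaviour asked about in the quoted message.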