RFR: 8318509: x86 count_positives intrinsic broken for -XX:AVX3Threshold=0 [v2]
Claes Redestad
redestad at openjdk.org
Tue Oct 24 09:19:34 UTC 2023
On Tue, 24 Oct 2023 09:03:01 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Claes Redestad has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Remove redundant 0 + tail
>> - Randomize array segment sizes and the amount of negatives. Add some tests that limit negatives to a tail.
>
> test/hotspot/jtreg/compiler/intrinsics/string/TestHasNegatives.java line 118:
>
>> 116: initialize(off, len, maxNegatives);
>> 117: boolean r = Helper.StringCodingHasNegatives(bytes, off, len);
>> 118: if (r ^ ((maxNegatives == 0) ? false : true)) {
>
> This condition looks very obscure. Is it not equivalent to: `r ^ maxNegatives != 0`
> Or even `r == (maxNegatives == 0)`?
The `countPositives` intrinsics were adapted from the pre-existing `hasNegatives` intrinsic, and to ease porting over all platforms - including those we don't maintain - we specified `countPositives` so that if there's a negative value at index `idx` then it's acceptable for the intrinsics to return any value from `0` to `idx - 1`. This way intrinsics on all platforms could quickly be adapted over (return either `0` for `hasNegatives == true` or `length` for `hasNegatives == false`) while respective maintainers worked on porting over properly.
This also allows the intrinsic to do a quick approximation: E.g. if a vectorized loop has checked 128 bytes but then detects a negative in the next block then the intrinsic can opt to return 128 rather than whatever the exact count. The surrounding java code will then do a scalar loop that is roughly as fast as an intrinsic tail routine would be, and with less complexity. I believe the `aarch64` intrinsic is implemented like this.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16267#discussion_r1369859057
More information about the hotspot-compiler-dev
mailing list