[vector] AVX2 ByteVector.shiftR performance and semantics
Richard Startin
richard at openkappa.co.uk
Tue Jan 29 21:28:23 UTC 2019
I noticed that in the current implementation ByteVector.shiftR is much slower than the equivalent logic using IntVector for AVX2. I don't fully understand the compiled code in the ByteVector case but most of the time is spent inside VectorIntrinsics::broadcastInt (see attachment):
@Benchmark
public int shiftRByte() {
return ((ByteVector)YMM_LONG.fromArray(data, 0)
.rebracket(YMM_BYTE)).shiftR(4).and((byte)0x0F)
.get(0);
}
@Benchmark
public int shiftRInt() {
return ((ByteVector)((IntVector)YMM_LONG.fromArray(data, 0)
.rebracket(YMM_INT)).shiftR(4).and(0x0F0F0F0F)
.rebracket(YMM_BYTE))
.get(0);
}
Benchmark (size) Mode Cnt Score Error Units
PopCount.shiftRByte 1024 thrpt 5 29.310 ± 0.680 ops/us
PopCount.shiftRInt 1024 thrpt 5 257.261 ± 17.210 ops/us
It's not clear whether shiftR is a signed or unsigned shift. If it is an unsigned shift, I think it can be implemented efficiently in terms of ints, which already works very well.
Currently, ByteVector.shiftR seems to preserve sign, at least in the interpreter. Printing
println(((ByteVector)YMM_LONG.fromArray(data, 0).rebracket(YMM_BYTE)));
println(((ByteVector)YMM_LONG.fromArray(data, 0).rebracket(YMM_BYTE)).shiftR(4));
println(((ByteVector)((IntVector)YMM_LONG.fromArray(data, 0).rebracket(YMM_INT)).shiftR(4).and(0x0F0F0F0F)
.rebracket(YMM_BYTE)));
printed this:
[9, 15, -41, 21, 85, 116, -18, 41, 30, -8, -40, -96, 119, -75, 119, -85, -48, -28, -55, 84, 118, 7, -46, 15, -103, -37, 48, -52, 14, -32, -105, -84]
[0, 0, -3, 1, 5, 7, -2, 2, 1, -1, -3, -6, 7, -5, 7, -6, -3, -2, -4, 5, 7, 0, -3, 0, -7, -3, 3, -4, 0, -2, -7, -6]
[0, 0, 13, 1, 5, 7, 14, 2, 1, 15, 13, 10, 7, 11, 7, 10, 13, 14, 12, 5, 7, 0, 13, 0, 9, 13, 3, 12, 0, 14, 9, 10]
On the other hand, I saw that the shr instruction was used in the JIT compiled code.
Thanks,
Richard
<http://0x80.pl/articles/sse-popcount.html>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: perfasm.txt
URL: <https://mail.openjdk.java.net/pipermail/panama-dev/attachments/20190129/50764e70/perfasm-0001.txt>
More information about the panama-dev
mailing list