String encodeASCII
Brett Okken
brett.okken.os at gmail.com
Sat Feb 18 18:36:13 UTC 2023
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/String.java#L976-L981
For String.encodeASCII, with the LATIN1 coder is there any interest in
exploring the performance impacts of utilizing a
byteArrayViewVarHandle to read/write as longs and utilize a bitmask to
identify if negative values are present?
A simple jmh benchmark covering either 0 or 1 non-ascii (negative)
values shows times cut in half (or more) for most scenarios with
strings ranging in length from 3 - ~2000.
VM version: JDK 17.0.6, OpenJDK 64-Bit Server VM, 17.0.6+10
Windows 10 Intel(R) Core(TM) i7-9850H
Hand unrolling the loops shows noted improvement, but does make for
less aesthetically pleasing code.
Benchmark (nonascii) (size) Mode
Cnt Score Error Units
AsciiEncodeBenchmark.jdk none 3 avgt
4 15.531 ± 1.122 ns/op
AsciiEncodeBenchmark.jdk none 10 avgt
4 17.350 ± 0.473 ns/op
AsciiEncodeBenchmark.jdk none 16 avgt
4 18.277 ± 0.421 ns/op
AsciiEncodeBenchmark.jdk none 23 avgt
4 20.139 ± 0.350 ns/op
AsciiEncodeBenchmark.jdk none 33 avgt
4 22.008 ± 0.656 ns/op
AsciiEncodeBenchmark.jdk none 42 avgt
4 24.393 ± 1.155 ns/op
AsciiEncodeBenchmark.jdk none 201 avgt
4 55.884 ± 0.645 ns/op
AsciiEncodeBenchmark.jdk none 511 avgt
4 120.817 ± 2.917 ns/op
AsciiEncodeBenchmark.jdk none 2087 avgt
4 471.039 ± 13.329 ns/op
AsciiEncodeBenchmark.jdk first 3 avgt
4 15.794 ± 1.494 ns/op
AsciiEncodeBenchmark.jdk first 10 avgt
4 18.446 ± 0.780 ns/op
AsciiEncodeBenchmark.jdk first 16 avgt
4 20.458 ± 0.394 ns/op
AsciiEncodeBenchmark.jdk first 23 avgt
4 22.934 ± 0.422 ns/op
AsciiEncodeBenchmark.jdk first 33 avgt
4 25.367 ± 0.178 ns/op
AsciiEncodeBenchmark.jdk first 42 avgt
4 28.620 ± 0.678 ns/op
AsciiEncodeBenchmark.jdk first 201 avgt
4 80.250 ± 4.376 ns/op
AsciiEncodeBenchmark.jdk first 511 avgt
4 185.518 ± 6.370 ns/op
AsciiEncodeBenchmark.jdk first 2087 avgt
4 713.213 ± 13.488 ns/op
AsciiEncodeBenchmark.jdk last 3 avgt
4 14.991 ± 0.190 ns/op
AsciiEncodeBenchmark.jdk last 10 avgt
4 18.284 ± 0.317 ns/op
AsciiEncodeBenchmark.jdk last 16 avgt
4 20.591 ± 1.002 ns/op
AsciiEncodeBenchmark.jdk last 23 avgt
4 22.560 ± 0.963 ns/op
AsciiEncodeBenchmark.jdk last 33 avgt
4 25.521 ± 0.554 ns/op
AsciiEncodeBenchmark.jdk last 42 avgt
4 28.484 ± 0.446 ns/op
AsciiEncodeBenchmark.jdk last 201 avgt
4 79.434 ± 2.256 ns/op
AsciiEncodeBenchmark.jdk last 511 avgt
4 186.639 ± 4.258 ns/op
AsciiEncodeBenchmark.jdk last 2087 avgt
4 725.196 ± 149.416 ns/op
AsciiEncodeBenchmark.longCheckCopy none 3 avgt
4 7.222 ± 0.428 ns/op
AsciiEncodeBenchmark.longCheckCopy none 10 avgt
4 8.070 ± 0.171 ns/op
AsciiEncodeBenchmark.longCheckCopy none 16 avgt
4 6.711 ± 0.409 ns/op
AsciiEncodeBenchmark.longCheckCopy none 23 avgt
4 12.906 ± 3.633 ns/op
AsciiEncodeBenchmark.longCheckCopy none 33 avgt
4 10.414 ± 0.447 ns/op
AsciiEncodeBenchmark.longCheckCopy none 42 avgt
4 11.935 ± 1.235 ns/op
AsciiEncodeBenchmark.longCheckCopy none 201 avgt
4 29.538 ± 3.265 ns/op
AsciiEncodeBenchmark.longCheckCopy none 511 avgt
4 106.228 ± 68.475 ns/op
AsciiEncodeBenchmark.longCheckCopy none 2087 avgt
4 494.845 ± 890.985 ns/op
AsciiEncodeBenchmark.longCheckCopy first 3 avgt
4 7.775 ± 0.278 ns/op
AsciiEncodeBenchmark.longCheckCopy first 10 avgt
4 13.396 ± 2.072 ns/op
AsciiEncodeBenchmark.longCheckCopy first 16 avgt
4 13.528 ± 0.702 ns/op
AsciiEncodeBenchmark.longCheckCopy first 23 avgt
4 17.376 ± 0.360 ns/op
AsciiEncodeBenchmark.longCheckCopy first 33 avgt
4 16.251 ± 0.203 ns/op
AsciiEncodeBenchmark.longCheckCopy first 42 avgt
4 17.932 ± 1.773 ns/op
AsciiEncodeBenchmark.longCheckCopy first 201 avgt
4 39.028 ± 4.699 ns/op
AsciiEncodeBenchmark.longCheckCopy first 511 avgt
4 92.599 ± 11.078 ns/op
AsciiEncodeBenchmark.longCheckCopy first 2087 avgt
4 347.728 ± 7.837 ns/op
AsciiEncodeBenchmark.longCheckCopy last 3 avgt
4 7.472 ± 0.078 ns/op
AsciiEncodeBenchmark.longCheckCopy last 10 avgt
4 8.371 ± 0.815 ns/op
AsciiEncodeBenchmark.longCheckCopy last 16 avgt
4 6.766 ± 0.253 ns/op
AsciiEncodeBenchmark.longCheckCopy last 23 avgt
4 12.879 ± 0.454 ns/op
AsciiEncodeBenchmark.longCheckCopy last 33 avgt
4 10.491 ± 0.811 ns/op
AsciiEncodeBenchmark.longCheckCopy last 42 avgt
4 12.435 ± 1.212 ns/op
AsciiEncodeBenchmark.longCheckCopy last 201 avgt
4 28.507 ± 1.058 ns/op
AsciiEncodeBenchmark.longCheckCopy last 511 avgt
4 85.763 ± 1.941 ns/op
AsciiEncodeBenchmark.longCheckCopy last 2087 avgt
4 411.555 ± 3.595 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 3 avgt
4 5.858 ± 0.637 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 10 avgt
4 7.031 ± 0.274 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 16 avgt
4 6.768 ± 0.222 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 23 avgt
4 10.084 ± 0.102 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 33 avgt
4 9.876 ± 0.240 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 42 avgt
4 11.061 ± 0.590 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 201 avgt
4 29.264 ± 1.690 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 511 avgt
4 61.920 ± 5.482 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll none 2087 avgt
4 309.183 ± 42.354 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 3 avgt
4 5.687 ± 0.249 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 10 avgt
4 9.537 ± 0.337 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 16 avgt
4 9.928 ± 0.329 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 23 avgt
4 12.510 ± 0.519 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 33 avgt
4 13.028 ± 0.335 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 42 avgt
4 13.640 ± 0.219 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 201 avgt
4 31.046 ± 0.647 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 511 avgt
4 82.998 ± 5.611 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll first 2087 avgt
4 360.294 ± 8.419 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 3 avgt
4 5.657 ± 0.197 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 10 avgt
4 6.997 ± 0.081 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 16 avgt
4 6.890 ± 1.319 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 23 avgt
4 10.154 ± 0.389 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 33 avgt
4 9.986 ± 0.592 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 42 avgt
4 11.481 ± 0.375 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 201 avgt
4 29.286 ± 0.723 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 511 avgt
4 61.056 ± 0.977 ns/op
AsciiEncodeBenchmark.longCheckCopy_unroll last 2087 avgt
4 303.415 ± 17.326 ns/op
@Benchmark
public byte[] jdk() {
final byte[] val = this.data;
byte[] dst = Arrays.copyOf(val, val.length);
for (int i = 0; i < dst.length; i++) {
if (dst[i] < 0) {
dst[i] = '?';
}
}
return dst;
}
@Benchmark
public byte[] longCheckCopy() {
final byte[] val = this.data;
byte[] dst = new byte[val.length];
int i = 0;
long word;
for (int j=dst.length - 7; i < j; i+=8) {
word = (long)LONG_BYTES.get(val, i);
LONG_BYTES.set(dst, i, word);
if ((word & LONG_NEG_MASK) != 0) {
for (int x=i, y=i+8; x<y; x++) {
if (dst[x] < 0) {
dst[x] = '?';
}
}
}
}
byte b;
for (; i < dst.length; i++) {
b = val[i];
dst[i] = b < 0 ? (byte) '?' : b;
}
return dst;
}
@Benchmark
public byte[] longCheckCopy_unroll() {
final byte[] val = this.data;
byte[] dst = new byte[val.length];
int i = 0;
long word;
for (int j=dst.length - 7; i < j; i+=8) {
word = (long)LONG_BYTES.get(val, i);
LONG_BYTES.set(dst, i, word);
if ((word & LONG_NEG_MASK) != 0) {
if (dst[i] < 0) {
dst[i] = '?';
}
if (dst[i + 1] < 0) {
dst[i + 1] = '?';
}
if (dst[i + 2] < 0) {
dst[i + 2] = '?';
}
if (dst[i + 3] < 0) {
dst[i + 3] = '?';
}
if (dst[i + 4] < 0) {
dst[i + 4] = '?';
}
if (dst[i + 5] < 0) {
dst[i + 5] = '?';
}
if (dst[i + 6] < 0) {
dst[i + 6] = '?';
}
if (dst[i + 7] < 0) {
dst[i + 7] = '?';
}
}
}
byte b;
switch (dst.length & 0x7) {
case 7:
b = val[i + 6];
dst[i + 6] = b < 0 ? (byte) '?' : b;
case 6:
b = val[i + 5];
dst[i + 5] = b < 0 ? (byte) '?' : b;
case 5:
b = val[i + 4];
dst[i + 4] = b < 0 ? (byte) '?' : b;
case 4:
b = val[i + 3];
dst[i + 3] = b < 0 ? (byte) '?' : b;
case 3:
b = val[i + 2];
dst[i + 2] = b < 0 ? (byte) '?' : b;
case 2:
b = val[i + 1];
dst[i + 1] = b < 0 ? (byte) '?' : b;
case 1:
b = val[i];
dst[i] = b < 0 ? (byte) '?' : b;
}
return dst;
}
Thanks,
Brett
More information about the core-libs-dev
mailing list