large longs to string

Wed Apr 24 19:39:27 UTC 2024

Hi,

IIUC this optimization leans on 4 long divs being slower than 1 long div + 4 int divs, which might not be true on all platforms, nor stay true in the future. Long values will in practice likely be biased towards lower values, so it’s important that any optimization for larger values doesn’t regress inputs in the int range. Since it’s all guarded by a test that is already there, there shouldn’t be much room for a difference, but adding code can cause interesting issues, so it’s always worth measuring to make sure. Have you run any benchmark for inputs smaller than the threshold? And for a healthy mix of values?

Thanks!
/Claes

On 24 Apr 2024, at 21:08, Brett Okken <brett.okken.os at gmail.com> wrote:

Is there interest in optimizing StringLatin1.getChars(long, int, byte[]) for large (larger than int) long values[1]?
We can change this to work with 8 digits at a time, which reduces the amount of 64 bit arithmetic required.

    if (i <= -1_000_000_000) {
        long q = i / 100_000_000;
        charPos -= 8;
        write4DigitPairs(buf, charPos, (int) ((q * 100_000_000) - i));
        i = q;
        if (i <= -1_000_000_000) {
            q = i / 100_000_000;
            charPos -= 8;
            write4DigitPairs(buf, charPos, (int) ((q * 100_000_000) - i));
            i = q;
        }
    }

A simple implementation of write4DigitPairs would just call the existing writeDigitPair method 4 times:

    private static void write4DigitPairs(byte[] buf, int idx, int value) {
        int v = value;
        int v2 = v / 100;
        writeDigitPair(buf, idx + 6, v - (v2 * 100));
        v = v2;

        v2 = v / 100;
        writeDigitPair(buf, idx + 4, v - (v2 * 100));
        v = v2;

        v2 = v / 100;
        writeDigitPair(buf, idx + 2, v - (v2 * 100));
        v = v2;

        v2 = v / 100;
        writeDigitPair(buf, idx, v - (v2 * 100));
    }
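
For illustration, here is a minimal self-contained sketch of how the two snippets above could fit together. The class name LongDigitsSketch, the toString helper and the body of writeDigitPair are hypothetical stand-ins, not the JDK code, and the handling of the remaining digits with plain int arithmetic is an assumption about the surrounding method rather than a copy of it.

```java
final class LongDigitsSketch {

    // Hypothetical stand-in: writes the two decimal digits of value (0..99).
    static void writeDigitPair(byte[] buf, int idx, int value) {
        buf[idx] = (byte) ('0' + value / 10);
        buf[idx + 1] = (byte) ('0' + value % 10);
    }

    // Writes exactly 8 digits (zero padded) of value (0..99_999_999) at idx.
    static void write4DigitPairs(byte[] buf, int idx, int value) {
        int v = value;
        for (int off = 6; off >= 0; off -= 2) {
            int q = v / 100;
            writeDigitPair(buf, idx + off, v - (q * 100));
            v = q;
        }
    }

    // Fills buf backwards from index; returns the position of the first char.
    // Works on the negated value so that Long.MIN_VALUE needs no special case.
    static int getChars(long i, int index, byte[] buf) {
        int charPos = index;
        boolean negative = i < 0;
        if (!negative) {
            i = -i;
        }

        // Proposed step: peel off 8 digits per long division, at most twice.
        if (i <= -1_000_000_000) {
            long q = i / 100_000_000;
            charPos -= 8;
            write4DigitPairs(buf, charPos, (int) ((q * 100_000_000) - i));
            i = q;
            if (i <= -1_000_000_000) {
                q = i / 100_000_000;
                charPos -= 8;
                write4DigitPairs(buf, charPos, (int) ((q * 100_000_000) - i));
                i = q;
            }
        }

        // At most 9 digits remain here, so int arithmetic is sufficient.
        int i2 = (int) i;
        while (i2 <= -100) {
            int q2 = i2 / 100;
            charPos -= 2;
            writeDigitPair(buf, charPos, (q2 * 100) - i2);
            i2 = q2;
        }
        if (i2 < -9) {
            charPos -= 2;
            writeDigitPair(buf, charPos, -i2);
        } else {
            buf[--charPos] = (byte) ('0' - i2);
        }
        if (negative) {
            buf[--charPos] = (byte) '-';
        }
        return charPos;
    }

    // Convenience wrapper for checking results against Long.toString.
    static String toString(long value) {
        byte[] buf = new byte[20]; // enough for "-9223372036854775808"
        int start = getChars(value, buf.length, buf);
        return new String(buf, start, buf.length - start,
                java.nio.charset.StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(toString(Long.MIN_VALUE));
        System.out.println(toString(1_000_000_000L));
    }
}
```

Comparing toString against Long.toString for values on both sides of the -1_000_000_000 threshold is a cheap correctness check before any benchmarking.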

There is the option to OR the 4 short values together into a long and leverage a ByteArrayLittleEndian.setLong call, but I see that the previous usage of ByteArrayLittleEndian.setShort was removed[2].
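
As a rough sketch of that option: the four digit pairs can be packed into one long and stored with a single little-endian write. ByteArrayLittleEndian is JDK-internal, so this example substitutes the public MethodHandles.byteArrayViewVarHandle API; the class name PackedDigitsSketch and the packPair helper are hypothetical, and whether this is faster than four writeDigitPair calls would need measuring.

```java
final class PackedDigitsSketch {

    // Public-API stand-in for a little-endian long store into a byte[].
    private static final java.lang.invoke.VarHandle LONG_LE =
            java.lang.invoke.MethodHandles.byteArrayViewVarHandle(
                    long[].class, java.nio.ByteOrder.LITTLE_ENDIAN);

    // Packs value (0..99) as two ASCII digits: tens digit in the low byte,
    // ones digit in the next byte, so a little-endian store emits "tens, ones".
    static long packPair(int value) {
        return ('0' + value / 10) | (('0' + value % 10) << 8);
    }

    // Writes exactly 8 digits (zero padded) of value (0..99_999_999) at idx.
    static void write4DigitPairs(byte[] buf, int idx, int value) {
        int hi = value / 10_000;        // upper four digits
        int lo = value - hi * 10_000;   // lower four digits
        int p0 = hi / 100;
        int p1 = hi - p0 * 100;
        int p2 = lo / 100;
        int p3 = lo - p2 * 100;
        // The most significant pair goes in the lowest 16 bits because the
        // little-endian store places the lowest byte at buf[idx].
        long packed = packPair(p0)
                | (packPair(p1) << 16)
                | (packPair(p2) << 32)
                | (packPair(p3) << 48);
        LONG_LE.set(buf, idx, packed);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[8];
        write4DigitPairs(buf, 0, 1234567);
        System.out.println(new String(buf, java.nio.charset.StandardCharsets.ISO_8859_1));
    }
}
```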

A small benchmark of longs which would qualify shows up to 20% improvement.

Presumably a similar change could make sense for StringUTF16, but I have not spent any time benchmarking it.

Brett

[1] - https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/StringLatin1.java#L163-L168
[2] - https://github.com/openjdk/jdk/commit/913e43fea995b746fb9e1b25587d254396c7c3c9
