RFR: 8315585: Optimization for decimal to string [v7]
Shaojin Wen
duke at openjdk.org
Sat Oct 21 11:59:40 UTC 2023
On Thu, 19 Oct 2023 11:58:26 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> @cl4es
>>
>>> Good, narrows it down to what's going on in `prepend(long, byte[], String)`. Might boil down to `System.arraycopy`. This method might not be optimized for tiny arrays on all platforms. Specializing for single-char case:
>>>
>>> ```java
>>> diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java
>>> index 9b19d7e2ac1..6eb70925dab 100644
>>> --- a/src/java.base/share/classes/java/lang/String.java
>>> +++ b/src/java.base/share/classes/java/lang/String.java
>>> @@ -4723,7 +4723,11 @@ static void repeatCopyRest(byte[] buffer, int offset, int limit, int copied) {
>>> */
>>> void getBytes(byte[] dst, int dstBegin, byte coder) {
>>> if (coder() == coder) {
>>> - System.arraycopy(value, 0, dst, dstBegin << coder, value.length);
>>> + if (value.length == 1) {
>>> + dst[(dstBegin << coder)] = value[0];
>>> + } else {
>>> + System.arraycopy(value, 0, dst, dstBegin << coder, value.length);
>>> + }
>>> } else { // this.coder == LATIN && coder == UTF16
>>> StringLatin1.inflate(value, 0, dst, dstBegin, value.length);
>>> }
>>> ```
>>>
>>> .. seem to help the JIT do the right thing consistently, too:
>>>
>>> ```
>>> Benchmark Mode Cnt Score Error Units
>>> BigDecimals.testSmallToEngineeringString avgt 50 11,757 ± 0,480 ns/op
>>> ```
>>
>> In addition to #16244, will you submit a PR for this?
>
>> In addition to #16244, will you submit a PR for this?
>
> Once both #16244 and this has been integrated I want to revisit this. The effect of changing `getBytes(byte[], int, byte)` might have disappeared with #16244 since it better guarantees the JIT will constant fold the prefixes thoroughly. We've observed related issues with System.arraycopy, however, see https://bugs.openjdk.org/browse/JDK-8295496 - so I want to evaluate a few different options here, time allowing.
@cl4es I found that StringConcatFactory.makeConcatWithConstants will become slower when using recipe("\1.\1", long.class, long.class). If we treat the prefix of length 1 as char, it will become faster.
In this branch https://github.com/wenshao/jdk/tree/optim_decimal_to_string_x2 , if we remove the code that treats the prefix of length 1 as char in the preparer(String prefix, Class<?> cl) method, it will become slower.
package java.lang.invoke;
class StringConcatFactory {
static MethodHandle prepender(String prefix, Class<?> cl) {
if (prefix == null || prefix.isEmpty()) {
return noPrefixPrepender(cl);
} else {
// remove this will slower
if (prefix.length() == 1
&& (cl == char.class || cl == int.class || cl == long.class)) {
char ch = prefix.charAt(0);
MethodHandle prepend = JLA.stringConcatHelper(
"prepend",
methodType(long.class, long.class, byte[].class, cl, char.class)).rebind();
return MethodHandles.insertArguments(prepend, 3, ch);
}
return MethodHandles.insertArguments(
prepender(cl), 3, prefix);
}
}
}
Here are the performance numbers running on the MaxBook M1 Max:
-Benchmark Mode Cnt Score Error Units (String prefix)
-BigDecimals.smallScale3ToPlainString avgt 15 28.898 ? 0.527 ns/op
+Benchmark Mode Cnt Score Error Units (char prefix)
+BigDecimals.smallScale3ToPlainString avgt 15 23.182 ? 0.477 ns/op
Maybe you should submit another PR to improve StringConcatFactory.makeConcatWithConstants
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16006#issuecomment-1773768539
More information about the core-libs-dev
mailing list