<i18n dev> RFR: 8333396: Performance regression of new DecimalFormat and DecimalFormat.format [v2]
lingjun-cg
duke at openjdk.org
Tue Jun 4 01:39:12 UTC 2024
On Mon, 3 Jun 2024 06:13:35 GMT, lingjun-cg <duke at openjdk.org> wrote:
>> ### Performance regression of DecimalFormat.format
>> From the output of perf, we can see the hottest regions contain atomic instructions. But when run with JDK 11, there is no such problem. The reason is the removed biased locking.
>> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself contains many synchronized methods.
>> So I added support for some new methods that accept StringBuilder which is lock-free.
>>
>> ### Performance regression of new DecimalFormat
>> After comparing the flame graph between current jdk and jdk 11, the method java.text.DecimalFormatSymbols#findNonFormatChar takes a significant time. The performance becomes as good as jdk11 after replacing it with a simple loop implementation.
>>
>>
>>
>> ### Test result
>>
>> @BenchmarkMode(Mode.AverageTime)
>> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class JmhDecimalFormat {
>>
>> private DecimalFormat format;
>>
>> @Setup(Level.Trial)
>> public void setup() {
>> format = new DecimalFormat("#0.00000");
>> }
>>
>> @Benchmark
>> public void testNewAndFormat() throws InterruptedException {
>> new DecimalFormat("#0.00000").format(9524234.1236457);
>> }
>>
>> @Benchmark
>> public void testNewOnly() throws InterruptedException {
>> new DecimalFormat("#0.00000");
>> }
>>
>> @Benchmark
>> public void testFormatOnly() throws InterruptedException {
>> format.format(9524234.1236457);
>> }
>> }
>>
>> #### Current JDK before optimize
>>
>> Benchmark Mode Cnt Score Error Units
>> JmhDecimalFormat.testFormatOnly avgt 50 642.099 ? 1.253 ns/op
>> JmhDecimalFormat.testNewAndFormat avgt 50 989.307 ? 3.676 ns/op
>> JmhDecimalFormat.testNewOnly avgt 50 303.381 ? 5.252 ns/op
>>
>>
>>
>> #### Current JDK after optimize
>>
>> Benchmark Mode Cnt Score Error Units
>> JmhDecimalFormat.testFormatOnly avgt 50 351.499 ? 0.761 ns/op
>> JmhDecimalFormat.testNewAndFormat avgt 50 615.145 ? 2.478 ns/op
>> JmhDecimalFormat.testNewOnly avgt 50 209.874 ? 9.951 ns/op
>>
>>
>> ### JDK 11
>>
>> Benchmark Mode Cnt Score Error Units
>> JmhDecimalFormat.testFormatOnly avgt 50 364.214 ? 1.191 ns/op
>> JmhDecimalFormat.testNewAndForma...
>
> lingjun-cg has updated the pull request incrementally with one additional commit since the last revision:
>
> 8333396: Performance regression of new DecimalFormat and DecimalFormat.format
Hi naotaoj,
What I mean "performance regression" is compare to JDK 11. We have an server side application that use DecimalFormat.format API seriously. When migrate it from JDK 11 to JDK 21, we found a performance degradation.
So I write the JMH test case "JmhDecimalFormat". It show that there a performance regression since JDK 21.
These are the perfasm output for running JMH on both JDK 11 and JDK21. There are some hot regions around the atomic instructions in JDK 21, but no such problem in JDK 11.
[jdk11.txt](https://github.com/user-attachments/files/15541278/jdk11.txt)
[jdk21.txt](https://github.com/user-attachments/files/15541279/jdk21.txt)
Maybe the [JEP 374: Deprecate and Disable Biased Locking](https://openjdk.org/jeps/374) is the reason?
So I run the benchmark on JDK 11 again but with option '-XX:-UseBiasedLocking', there only a minor gap between jdk11 and jdk 21.
OK, return to my patch. java.text use StringBuffer internally, but nearly all methods in StringBuffer are synchronized:
@Override
public synchronized StringBuffer append(Object obj) {
....
}
@Override
@IntrinsicCandidate
public synchronized StringBuffer append(String str) {
...
}
>From the above analysis, the atomic instructions slow down DecimalFormat.format, and StringBuffer's synchronized methods generate there atomic instructions. So If remove these synchronized methods, it will get a better performance.
So I replace StringBuffer with StringBuilder in java.text.NumberFormat.
public final String format(double number) {
// Use fast-path for double result if that works
String result = fastFormat(number);
if (result != null)
return result;
-return format(number, new StringBuffer(),
+return format(number, new StringBuilder(),
DontCareFieldPosition.INSTANCE).toString();
}
@@ -367,7 +367,7 @@ public final String format(double number) {
* @see java.text.Format#format
*/
public final String format(long number) {
- return format(number, new StringBuffer(),
+ return format(number, new StringBuilder(),
DontCareFieldPosition.INSTANCE).toString();
}
> Separately, please split this PR into two, as combining two different issues into a single JBS issue/PR is not right. The second issue is likely due to loading stream classes for the first time at JVM startup.
This is the flame graph compare to JDK 11 and JDK 21:
[flame-graph-jdk11-jdk21.zip](https://github.com/user-attachments/files/15541397/flame-graph-jdk11-jdk21.zip)
OK. I will create a separate issue.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19513#issuecomment-2146401596
More information about the i18n-dev
mailing list