<i18n dev> RFR: 8333396: Performance regression of DecimalFormat.format [v8]

lingjun-cg duke at openjdk.org
Wed Jun 19 02:08:16 UTC 2024


On Tue, 18 Jun 2024 10:00:33 GMT, lingjun-cg <duke at openjdk.org> wrote:

>> ### Performance regression of DecimalFormat.format
>> From the output of perf, we can see the hottest regions contain atomic instructions.  But when run with JDK 11, there is no such problem. The reason is the removed biased locking.  
>> The DecimalFormat uses StringBuffer everywhere, and StringBuffer itself contains many synchronized methods.
>> So I added support for some new methods that accept StringBuilder which is lock-free.
>> 
>> ### Benchmark testcase
>> 
>> @BenchmarkMode(Mode.AverageTime)
>> @Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
>> @State(Scope.Thread)
>> @OutputTimeUnit(TimeUnit.NANOSECONDS)
>> public class JmhDecimalFormat {
>> 
>>     private DecimalFormat format;
>> 
>>     @Setup(Level.Trial)
>>     public void setup() {
>>         format = new DecimalFormat("#0.00000");
>>     }
>> 
>>     @Benchmark
>>     public void testNewAndFormat() throws InterruptedException {
>>         new DecimalFormat("#0.00000").format(9524234.1236457);
>>     }
>> 
>>     @Benchmark
>>     public void testNewOnly() throws InterruptedException {
>>         new DecimalFormat("#0.00000");
>>     }
>> 
>>     @Benchmark
>>     public void testFormatOnly() throws InterruptedException {
>>         format.format(9524234.1236457);
>>     }
>> }
>> 
>> 
>> ### Test result
>> #### Current JDK before optimize
>> 
>>  Benchmark                             Mode  Cnt    Score   Error  Units
>> JmhDecimalFormat.testFormatOnly       avgt   50  642.099 ? 1.253  ns/op
>> JmhDecimalFormat.testNewAndFormat     avgt   50  989.307 ? 3.676  ns/op
>> JmhDecimalFormat.testNewOnly          avgt   50  303.381 ? 5.252  ns/op
>> 
>> 
>> 
>> #### Current JDK after optimize
>> 
>> Benchmark                          Mode  Cnt    Score   Error  Units
>> JmhDecimalFormat.testFormatOnly    avgt   50  351.499 ? 0.761  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  615.145 ? 2.478  ns/op
>> JmhDecimalFormat.testNewOnly       avgt   50  209.874 ? 9.951  ns/op
>> 
>> 
>> ### JDK 11 
>> 
>> Benchmark                          Mode  Cnt    Score   Error  Units
>> JmhDecimalFormat.testFormatOnly    avgt   50  364.214 ? 1.191  ns/op
>> JmhDecimalFormat.testNewAndFormat  avgt   50  658.699 ? 2.311  ns/op
>> JmhDecimalFormat.testNewOnly       avgt   50  248.300 ? 5.158  ns/op
>
> lingjun-cg has updated the pull request incrementally with one additional commit since the last revision:
> 
>   8333396: Performance regression of DecimalFormat.format

Using `<T extends Appendable & CharSequence> ` instead of  `StringBuf` seems a good idea, but it has a disadvantage. The `Appendable` defines only 3 `append` methods which cannot cover some cases.  For example, this method `SimpleDateFormat#format(Date, StringBuffer, FieldDelegate)` contains the following code:


            case TAG_QUOTE_CHARS:
                toAppendTo.append(compiledPattern, i, count);
                i += count;
                break;


It requires `append(char[], int, int)`, but the `Appendable` has no such method.
So it's difficult to use  `<T extends Appendable & CharSequence> ` in methods `String DateFormat.format(Date date), String ListFormat.format(List<String> input) and, String MessageFormat.format(Object obj)`.

The method in `java.text.MessageFormat#subformat` contains the following code:

            int argumentNumber = argumentNumbers[i];
            if (arguments == null || argumentNumber >= arguments.length) {
                result.append('{').append(argumentNumber).append('}');
                continue;
            }

It requires `append(int)`, but the `Appendable` has no such method.

To sum up
1. `<T extends Appendable & CharSequence> ` is clear, but lack of extensibility
2. StringBuf is not so clear, but it's more extensibility. We can add `append(char[], int, int)` to `StringBuf` to support `SimpleDateFormat`.
3.  New interface StringBuf no need to handle IOException, but use `<T extends Appendable & CharSequence> ` need to add a lot of try...catch statements around the append method.
4. 
So it seems `StringBuf` is better if we want to fix the problem in `DateFormat,ListFormat,MessageFormat` in unique way. 
I look forward to your reply. @naotoj @liach @justin-curtis-lu

Using `<T extends Appendable & CharSequence> ` instead of  `StringBuf` seems a good idea, but it has a disadvantage. The `Appendable` defines only 3 `append` methods which cannot cover some cases.  For example, this method `SimpleDateFormat#format(Date, StringBuffer, FieldDelegate)` contains the following code:


            case TAG_QUOTE_CHARS:
                toAppendTo.append(compiledPattern, i, count);
                i += count;
                break;


It requires `append(char[], int, int)`, but the `Appendable` has no such method.
So it's difficult to use  `<T extends Appendable & CharSequence> ` in methods `String DateFormat.format(Date date), String ListFormat.format(List<String> input) and, String MessageFormat.format(Object obj)`.

The method in `java.text.MessageFormat#subformat` contains the following code:

            int argumentNumber = argumentNumbers[i];
            if (arguments == null || argumentNumber >= arguments.length) {
                result.append('{').append(argumentNumber).append('}');
                continue;
            }

It requires `append(int)`, but the `Appendable` has no such method.

To sum up
1. `<T extends Appendable & CharSequence> ` is clear, but lack of extensibility
2. StringBuf is not so clear, but it's more extensibility. We can add `append(char[], int, int)` to `StringBuf` to support `SimpleDateFormat`.
3.  New interface StringBuf no need to handle IOException, but use `<T extends Appendable & CharSequence> ` need to add a lot of try...catch statements around the append method.


So it seems `StringBuf` is better if we want to fix the problem in `DateFormat,ListFormat,MessageFormat` in unique way. 
I look forward to your reply. @naotoj @liach @justin-curtis-lu

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19513#issuecomment-2177382329
PR Comment: https://git.openjdk.org/jdk/pull/19513#issuecomment-2177383927


More information about the i18n-dev mailing list