RFR: 8333893: Optimization for StringBuilder append boolean & null [v9]

Shaojin Wen duke at openjdk.org
Fri Jun 14 08:55:16 UTC 2024


On Fri, 14 Jun 2024 01:17:29 GMT, Shaojin Wen <duke at openjdk.org> wrote:

>> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into primitive arrays by combining values into larger stores.
>> 
>> This PR rewrites the appendNull and append(boolean) methods so that C2 can apply this optimization to them.
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   optimization for x64
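
To make "combining values into larger stores" concrete, here is a minimal sketch of the effect, not of how C2 implements it. The class and method names are hypothetical. It only shows that four adjacent one-byte stores of "null" leave the same bytes in the array as a single little-endian 32-bit store, which is the kind of wider store the JIT may emit instead.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo class; it illustrates the result of store merging,
// not the C2 implementation.
final class MergedStoreDemo {
    // Four separate byte stores to adjacent indices.
    static byte[] fourByteStores() {
        byte[] buf = new byte[4];
        buf[0] = 'n';
        buf[1] = 'u';
        buf[2] = 'l';
        buf[3] = 'l';
        return buf;
    }

    // The same four bytes written as one 32-bit little-endian store:
    // 'n' = 0x6E, 'u' = 0x75, 'l' = 0x6C, so the int is 0x6C6C756E.
    static byte[] oneIntStore() {
        byte[] buf = new byte[4];
        ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).putInt(0, 0x6C6C756E);
        return buf;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(fourByteStores(), oneIntStore())); // prints "true"
    }
}
```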

The performance regression on x64 has been resolved. On x64, the order of these two statements affects performance:
* x64 slow version

int count = this.count;
ensureCapacityInternal(count + 4);


* x64 faster version, WebRev 08: [Full](https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=08) - [Incremental](https://webrevs.openjdk.org/?repo=jdk&pr=19626&range=07-08) ([b5ad8e70](https://git.openjdk.org/jdk/pull/19626/files/b5ad8e70928c547d134d0e4a532441cad9a7e4a2))

ensureCapacityInternal(count + 4);
int count = this.count;


I don't know why, but this solved the problem.
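
For readers who want to see the two orderings in context, here is a self-contained sketch. `MiniBuilder` is a hypothetical, Latin1-only stand-in for the builder class, not the JDK source, and `ensureCapacityInternal` is a simplified version written for this example. In the faster ordering, `count` in the first statement refers to the field, because the local variable of the same name is not yet in scope. Both methods produce the same result; any difference lies only in the code the JIT generates.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical, simplified stand-in for the builder (Latin1 only).
final class MiniBuilder {
    private byte[] value = new byte[16];
    private int count;

    // Simplified capacity check: grow the array when it is too small.
    private void ensureCapacityInternal(int minimumCapacity) {
        if (minimumCapacity > value.length) {
            value = Arrays.copyOf(value, Math.max(minimumCapacity, value.length * 2));
        }
    }

    // Ordering reported as slower on x64: copy the field to a local first.
    MiniBuilder appendNullSlow() {
        int count = this.count;
        ensureCapacityInternal(count + 4);
        byte[] val = this.value;
        val[count] = 'n';
        val[count + 1] = 'u';
        val[count + 2] = 'l';
        val[count + 3] = 'l';
        this.count = count + 4;
        return this;
    }

    // Ordering reported as faster on x64: check capacity using the field,
    // then copy the field to a local.
    MiniBuilder appendNullFast() {
        ensureCapacityInternal(count + 4); // `count` is the field here
        int count = this.count;
        byte[] val = this.value;
        val[count] = 'n';
        val[count + 1] = 'u';
        val[count + 2] = 'l';
        val[count + 3] = 'l';
        this.count = count + 4;
        return this;
    }

    @Override
    public String toString() {
        return new String(value, 0, count, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(new MiniBuilder().appendNullSlow().appendNullFast()); // prints "nullnull"
    }
}
```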

The performance numbers for various platforms are as follows:

## aliyun ecs.c8a
* CPU AMD EPYC™ Genoa
* Platform x64

-Benchmark                             Mode  Cnt   Score   Error  Units #master (a6fc2f8)
-StringBuilders.appendWithBool8Latin1  avgt   15   7.270 ± 0.013  ns/op
-StringBuilders.appendWithBool8Utf16   avgt   15  10.860 ± 0.044  ns/op
-StringBuilders.appendWithNull8Latin1  avgt   15   5.490 ± 0.007  ns/op
-StringBuilders.appendWithNull8Utf16   avgt   15  26.203 ± 0.294  ns/op

+Benchmark                             Mode  Cnt   Score   Error  Units # Web 08 (b5ad8e70)
+StringBuilders.appendWithBool8Latin1  avgt   15   6.035 ± 0.062  ns/op +20%
+StringBuilders.appendWithBool8Utf16   avgt   15  10.072 ± 0.032  ns/op +7.82%
+StringBuilders.appendWithNull8Latin1  avgt   15   5.491 ± 0.021  ns/op -0.01%
+StringBuilders.appendWithNull8Utf16   avgt   15   7.701 ± 0.036  ns/op +240.25%
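
The post does not say how the percentage column is derived. The figures are consistent with dividing the master score by the patched score, subtracting one, and truncating to two decimals, so a positive value means the patched build is faster. The sketch below works under that assumption; the `Speedup` class is hypothetical and exists only for this illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper; the formula is an assumption inferred from the tables.
final class Speedup {
    // (baseline / patched - 1) * 100, truncated (not rounded) to two decimals.
    static BigDecimal percent(String baselineNsPerOp, String patchedNsPerOp) {
        BigDecimal baseline = new BigDecimal(baselineNsPerOp);
        BigDecimal patched = new BigDecimal(patchedNsPerOp);
        return baseline.divide(patched, 10, RoundingMode.DOWN)
                .subtract(BigDecimal.ONE)
                .multiply(BigDecimal.valueOf(100))
                .setScale(2, RoundingMode.DOWN);
    }

    public static void main(String[] args) {
        System.out.println(percent("26.203", "7.701"));  // prints 240.25
        System.out.println(percent("10.860", "10.072")); // prints 7.82
        System.out.println(percent("5.490", "5.491"));   // prints -0.01
    }
}
```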


## aliyun ecs.c8i
* CPU Intel® Xeon® Emerald
* Platform x64

-Benchmark                             Mode  Cnt   Score   Error  Units #master (a6fc2f8)
-StringBuilders.appendWithBool8Latin1  avgt   15   7.466 ± 0.012  ns/op
-StringBuilders.appendWithBool8Utf16   avgt   15  15.220 ± 0.005  ns/op
-StringBuilders.appendWithNull8Latin1  avgt   15  22.084 ± 0.331  ns/op
-StringBuilders.appendWithNull8Utf16   avgt   15  25.717 ± 0.106  ns/op

+Benchmark                             Mode  Cnt   Score   Error  Units # Web 08 (b5ad8e70)
+StringBuilders.appendWithBool8Latin1  avgt   15   6.706 ± 0.011  ns/op +11.33%
+StringBuilders.appendWithBool8Utf16   avgt   15  11.610 ± 0.140  ns/op +31.09%
+StringBuilders.appendWithNull8Latin1  avgt   15  19.345 ± 0.050  ns/op +14.15%
+StringBuilders.appendWithNull8Utf16   avgt   15  22.765 ± 0.045  ns/op +12.96%


## MacBook M1 Max
* CPU Apple M1 Max
* Platform aarch64

-Benchmark                             Mode  Cnt  Score   Error  Units #master (a6fc2f8)
-StringBuilders.appendWithBool8Latin1  avgt   15  7.647 ± 0.014  ns/op
-StringBuilders.appendWithBool8Utf16   avgt   15  9.620 ± 0.022  ns/op
-StringBuilders.appendWithNull8Latin1  avgt   15  7.208 ± 0.127  ns/op
-StringBuilders.appendWithNull8Utf16   avgt   15  9.455 ± 0.167  ns/op

+Benchmark                             Mode  Cnt  Score   Error  Units # Web 08 (b5ad8e70)
+StringBuilders.appendWithBool8Latin1  avgt   15  5.932 ± 0.026  ns/op +28.91%
+StringBuilders.appendWithBool8Utf16   avgt   15  7.432 ± 0.015  ns/op +29.44%
+StringBuilders.appendWithNull8Latin1  avgt   15  5.461 ± 0.015  ns/op +31.99%
+StringBuilders.appendWithNull8Utf16   avgt   15  6.966 ± 0.015  ns/op +35.73%


## aliyun ecs.c8y
* CPU Yitian710 aarch64
* Platform aarch64

-Benchmark                             Mode  Cnt   Score   Error  Units
-StringBuilders.appendWithBool8Latin1  avgt   15  12.127 ± 1.241  ns/op
-StringBuilders.appendWithBool8Utf16   avgt   15  20.954 ± 1.668  ns/op
-StringBuilders.appendWithNull8Latin1  avgt   15  10.838 ± 0.244  ns/op
-StringBuilders.appendWithNull8Utf16   avgt   15  14.391 ± 0.063  ns/op


+Benchmark                             Mode  Cnt   Score   Error  Units
+StringBuilders.appendWithBool8Latin1  avgt   15   9.299 ± 0.079  ns/op +30.41%
+StringBuilders.appendWithBool8Utf16   avgt   15  11.688 ± 0.004  ns/op +79.27%
+StringBuilders.appendWithNull8Latin1  avgt   15   8.516 ± 0.258  ns/op +27.26%
+StringBuilders.appendWithNull8Utf16   avgt   15  11.020 ± 0.004  ns/op +30.58%


## aws.c5
* Platform aarch64

-Benchmark                             Mode  Cnt   Score   Error  Units
-StringBuilders.appendWithBool8Latin1  avgt   15  10.278 ± 0.071  ns/op
-StringBuilders.appendWithBool8Utf16   avgt   15  24.355 ± 0.418  ns/op
-StringBuilders.appendWithNull8Latin1  avgt   15  33.104 ± 0.589  ns/op
-StringBuilders.appendWithNull8Utf16   avgt   15  35.762 ± 7.580  ns/op

+Benchmark                             Mode  Cnt   Score   Error  Units
+StringBuilders.appendWithBool8Latin1  avgt   15   9.812 ± 0.075  ns/op
+StringBuilders.appendWithBool8Utf16   avgt   15  17.812 ± 0.006  ns/op
+StringBuilders.appendWithNull8Latin1  avgt   15  34.773 ± 2.897  ns/op
+StringBuilders.appendWithNull8Utf16   avgt   15  33.187 ± 1.604  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2167566396

