RFR: 8333893: Optimization for StringBuilder append boolean & null [v4]
Emanuel Peter
epeter at openjdk.org
Thu Jun 13 07:39:13 UTC 2024
On Wed, 12 Jun 2024 15:58:59 GMT, Shaojin Wen <duke at openjdk.org> wrote:
>>> @eme64 It seems like MergeStore didn't happen, is there something I did wrong?
>>
>> Yes ;)
>>
>> @wenshao The issue is that the pattern matching is quite **limited**.
>> `putCharsAt(byte[] value, int i, char c1, char c2, char c3, char c4)`
>> This has 4 variables. This is not allowed for the optimization. You would have to pass in a `long` and split it via shifts.
>> A possible exception: if `putCharsAt` is inlined and outside it is a splitting of a `long` value with shift, then ... maybe ... this would be optimized with `MergeStores`.
>>
>> There are only 2 **accepted** patterns:
>> - Multiple stores of constants -> constant is merged, which allows stores to be merged.
>> - Multiple stores which all store a "section" of a larger value:
>>
>> You can see **examples** I listed at the beginning of the PR description in https://github.com/openjdk/jdk/pull/16245.
>
> Thanks to @eme64 's patience and help, I found a way to use MergeStore without doing boundary checking.
>
> It would be even better if Unsafe.putChar could be used for MergeStore in the future.
>
> ## 1. JavaCode
>
> class StringUTF16 {
> public static void putCharsAt(byte[] value, int i, char c1, char c2, char c3, char c4) {
> // Don't use the putChar method, Its instrinsic will cause C2 unable to combining values into larger stores.
> putChar1(value, i , c1);
> putChar1(value, i + 1, c2);
> putChar1(value, i + 2, c3);
> putChar1(value, i + 3, c4);
> }
>
> public static void putCharsAt(byte[] value, int i, char c1, char c2, char c3, char c4, char c5) {
> // Don't use the putChar method, Its instrinsic will cause C2 unable to combining values into larger stores.
> putChar1(value, i , c1);
> putChar1(value, i + 1, c2);
> putChar1(value, i + 2, c3);
> putChar1(value, i + 3, c4);
> putChar1(value, i + 4, c5);
> }
>
> static void putChar1(byte[] value, int i, char c) {
> int address = Unsafe.ARRAY_BYTE_BASE_OFFSET + (i << 1);
> Unsafe.getUnsafe().putByte(value, address , (byte)(c >> HI_BYTE_SHIFT));
> Unsafe.getUnsafe().putByte(value, address + 1, (byte)(c >> LO_BYTE_SHIFT));
> }
> }
>
>
> ## 2. Benchmark Numbers
> The performance numbers under MacBookPro M1 Max are as follows:
>
> -Benchmark Mode Cnt Score Error Units
> -StringBuilders.toStringCharWithNull8Latin1 avgt 15 7.073 ? 0.010 ns/op
> -StringBuilders.toStringCharWithNull8Utf16 avgt 15 9.298 ? 0.015 ns/op
> -StringBuilders.toStringCharWithBool8Latin1 avgt 15 7.387 ? 0.021 ns/op
> -StringBuilders.toStringCharWithBool8Utf16 avgt 15 9.615 ? 0.011 ns/op
>
> +Benchmark Mode Cnt Score Error Units
> +StringBuilders.toStringCharWithNull8Latin1 avgt 15 5.628 ? 0.017 ns/op +25.67%
> +StringBuilders.toStringCharWithNull8Utf16 avgt 15 6.873 ? 0.026 ns/op +35.28%
> +StringBuilders.toStringCharWithBool8Latin1 avgt 15 6.480 ? 0.077 ns/op +13.99%
> +StringBuilders.toStringCharWithBool8Utf16 avgt 15 7.456 ? 0.059 ns/op +28.95%
>
>
> ## 3. TraceMergeStores
>
> [TraceMergeStores]: Replace
> 65 StoreB === 54 7 64 12 [[ 87 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=4; unsafe Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=4; !jvms: StringUTF16::putChar1 @ bci:20 (li...
@wenshao I'm glad we figured it out a bit, and I hope you learned a thing or two :)
> It would be even better if Unsafe.putChar could be used for MergeStore in the future.
The `_putCharStringU` intrinsic uses `LibraryCallKit::inline_string_char_access`. It seems to generate IR nodes, I speculate it could be `StoreC`. We'd have to do more digging to see why that pattern is not accepted by `MergeStore`.
@wenshao I'm not sure if everything is right with the bounds checking, but I leave this to the library folks to review. What you will definately need is benchmarking not just on `M1`, but also `x86-64` and other `aarch64` architectures.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2164851377
More information about the core-libs-dev
mailing list