RFR: 8333893: Optimization for StringBuilder append boolean & null [v4]
Emanuel Peter
epeter at openjdk.org
Wed Jun 12 12:02:13 UTC 2024
On Tue, 11 Jun 2024 11:35:28 GMT, Shaojin Wen <duke at openjdk.org> wrote:
>> After PR https://github.com/openjdk/jdk/pull/16245, C2 optimizes stores into primitive arrays by combining values into larger stores.
>>
>> This PR rewrites the code of appendNull and append(boolean) methods so that these two methods can be optimized by C2.
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> revert
Going back to:
./java -XX:+TraceMergeStores -Xbatch -XX:CompileCommand=printcompilation,*::* -XX:+PrintInlining AppendNullTest.java > log.log
We have occurances of `TraceMergeStores` in these compilations:
9139 1992 b 4 AppendNullTest$StringBuilder::appendNull (75 bytes)
-> as discussed above
9156 1996 % b 4 AppendNullTest::main @ 11 (45 bytes)
-> once main method is OSR compiled
529 StoreB === 513 408 527 464 [[ 554 540 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=9; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=9; !jvms: AppendNullTest$StringUTF16::putChar @ bci:15 (line 115) AppendNullTest$StringUTF16::putCharsAt @ bci:3 (line 87) AppendNullTest$StringBuilder::appendNull @ bci:63 (line 25) AppendNullTest::main @ bci:18 (line 123)
8 such StoreB get turned into StoreL. This also comes from the recursive inlining, ultimately from the same `putChar`.
1154 StoreB === 1139 3516 1152 464 [[ 1178 1164 ]] @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):exact+any *, idx=9; Memory: @byte[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=9; !jvms: AppendNullTest$StringUTF16::putChar @ bci:15 (line 115) AppendNullTest$StringUTF16::putCharsAt @ bci:3 (line 87) AppendNullTest$StringBuilder::appendNull @ bci:63 (line 25) AppendNullTest::main @ bci:21 (line 123)
Now the pattern repeats, because you have multiple `appendNull` calls in the `main` melthod.
So a total of 5 such optimizations, since all the methods are inlined multiple times.
That is all the optimizations that I see, where `MergeStores` happens. Now the question is if there are any compilations where we would expect the `MergeStores` to happen but they do not. That is more difficult.
I can for example see a previous compilation of the `main` method:
9151 1995 b 3 AppendNullTest::main (45 bytes)
@ 5 AppendNullTest$StringBuilder::<init> (26 bytes) inline
@ 1 java.lang.Object::<init> (1 bytes) inline
@ 18 AppendNullTest$StringBuilder::appendNull (75 bytes) failed to inline: callee is too large
@ 21 AppendNullTest$StringBuilder::appendNull (75 bytes) failed to inline: callee is too large
@ 24 AppendNullTest$StringBuilder::appendNull (75 bytes) failed to inline: callee is too large
@ 27 AppendNullTest$StringBuilder::appendNull (75 bytes) failed to inline: callee is too large
@ 30 AppendNullTest$StringBuilder::appendNull (75 bytes) failed to inline: callee is too large
@ 35 AppendNullTest$StringBuilder::clear (3 bytes) inline
I think it is obvious: if methods are not inlined, then we cannot optimize the code there. Instead, this is going to be method calls, where we call `appendNull` - with the overhead of a call of course. That could be another source of performance loss. Could be - it depends how often we execute this compiled code at runtime.
Of course `appendNull` is already comiled and optimized, so we would still expect to have `long` stores instead of `byte` stores.
So far goes the "theory". Now we need to test the theory with some performance measurements.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19626#issuecomment-2162828575
More information about the hotspot-compiler-dev
mailing list