RFR: 8303078: Reduce allocations when pretty printing JCTree during compilation
Christoph Dreis
duke at openjdk.org
Wed Feb 22 15:35:11 UTC 2023
On Mon, 20 Feb 2023 14:54:18 GMT, Christoph Dreis <duke at openjdk.org> wrote:
> Hi,
>
> I've been recently optimizing a few project pipelines and always noticed that their compilation step is somewhat "expensive". Zooming into the issue always revealed Lombok to be a larger contributor. Unfortunately, I can't get rid of Lombok in this customer project (before you ask).
>
> Apparently, Lombok is printing the respective `JCTree` here and there to check for type matches. I can't imagine this to be super efficient, but that's how it is at the moment.
> <img width="948" alt="image" src="https://user-images.githubusercontent.com/6304496/220134737-bd9b508c-a908-448b-b0d1-e960db28b24a.png">
>
> Anyhow, regardless of the Lombok inefficiencies I think there are some optimization opportunities in the JDK itself.
>
> 1. Overall, `Pretty.visitSelect` accounts for 8-10% of the total allocations in this project. And among those there are StringBuilder allocations coming from the following:
>
>
>     public void visitSelect(JCFieldAccess tree) {
>         try {
>             printExpr(tree.selected, TreeInfo.postfixPrec);
>             // StringBuilder allocations hiding in here.
>             print("." + tree.name);
>         } catch (IOException e) {
>             throw new UncheckedIOException(e);
>         }
>     }
>
>
> This PR splits the `print` calls into two separate ones to avoid this String concatenation.
>
> ...
> printExpr(tree.selected, TreeInfo.postfixPrec);
> print('.');
> print(tree.name);
> ...
>
>
> 2. The `print` method takes an `Object`, which makes it a good candidate for an additional (private?) variant that takes only a `char`. That would probably avoid any boxing and also skip the `Convert.escapeUnicode(s.toString())` conversion, which seems superfluous for chars like `.`, ` `, or braces such as `(` and `{`.
>
> This is currently a draft PR as long as the scope is not clarified. It currently only includes the necessary changes that would optimize the particular use-case. But there are more cases where e.g. the new `char` variant could be used and/or any String concatenation could be split into separate `print` calls.
>
> Let me know what you think and whether I should include the other cases as well. If you think this is worthwhile, I'd appreciate it if this could be sponsored. (That includes creating an issue, as I apparently can't do this myself. I will of course squash everything together with the proper issue ID once available.)
>
> I've contributed before, so the CLA should be signed.
>
> Cheers,
> Christoph
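To make the quoted proposal concrete, here is a minimal, self-contained sketch of the two changes: splitting the concatenated `print` call and adding a `char` overload that bypasses escaping. `PrettySketch`, `escapeNonAscii`, `renderConcat` and `renderSplit` are hypothetical names used only for illustration; this is not the real `Pretty` class, and `escapeNonAscii` is only a simplified stand-in for `Convert.escapeUnicode`.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for javac's Pretty printer; illustration only.
final class PrettySketch {
    private final Writer out;

    PrettySketch(Writer out) {
        this.out = out;
    }

    // Generic variant: converts to String and escapes before writing.
    void print(Object s) throws IOException {
        out.write(escapeNonAscii(s.toString()));
    }

    // char variant: no boxing, no toString(), no escaping.
    // Only intended for plain ASCII punctuation such as '.', ' ', '(' or '{'.
    void print(char c) throws IOException {
        out.write(c);
    }

    // Simplified placeholder for Convert.escapeUnicode (assumption, not the real code).
    static String escapeNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Old shape: the concatenation creates a temporary String per call.
    static String renderConcat(String selected, String name) {
        StringWriter w = new StringWriter();
        PrettySketch p = new PrettySketch(w);
        try {
            p.print(selected);
            p.print("." + name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.toString();
    }

    // New shape: two separate calls, the dot goes through print(char).
    static String renderSplit(String selected, String name) {
        StringWriter w = new StringWriter();
        PrettySketch p = new PrettySketch(w);
        try {
            p.print(selected);
            p.print('.');
            p.print(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderConcat("java.util", "List"));
        System.out.println(renderSplit("java.util", "List"));
    }
}
```

Both variants write the same text; the split variant simply never builds the intermediate `"." + name` string, and the `'.'` literal resolves to the `char` overload without boxing.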
With the patch applied, I can reduce the share of CPU and allocation frames spent in pretty printing for an internal product.
**Some specs about the project:**
~2500 classes
Average size: ~76 lines
Largest class: ~2500 lines
Mostly hand-written, but some auto-generated protobuf files (I'd say 90/10 split)
**CPU profile before**
<img width="684" alt="image" src="https://user-images.githubusercontent.com/6304496/220614136-782ba89d-3241-45d4-86a7-68588592b3b1.png">
**CPU profile after**
<img width="684" alt="image" src="https://user-images.githubusercontent.com/6304496/220613899-a151a737-bded-45c5-919a-2de5989e6f5e.png">
**Allocation profile before**
<img width="684" alt="image" src="https://user-images.githubusercontent.com/6304496/220613178-22abc2d7-e18d-44e2-a249-0d6152884281.png">
**Allocation profile after**
<img width="684" alt="image" src="https://user-images.githubusercontent.com/6304496/220613351-16332e13-22f2-4004-96a3-bbef9ef039d0.png">
| Mode | Frames matched before | Frames matched after |
|--- | --- | --- |
|`cpu` | 2.27% | 1.42% |
| `alloc` | 7.09% | 4.13% |
In terms of timings, I couldn't measure anything substantial on the actual project: neither a regression nor a substantial improvement. But this is not so much about time anyhow as about the allocations. The overall goal is to reduce the impact on GC, which is using 27% of the CPU frames (to give you some more background). Obviously, this is not coming from the JDK alone and Lombok is a major contributor here, but I'd say the JDK can ease the impact of Lombok.
<img width="473" alt="image" src="https://user-images.githubusercontent.com/6304496/220608498-65340f0f-9342-4df4-bdd5-9a83b05b0b92.png">
Admittedly, it's more on the smaller end of things and a smaller knob in terms of GC improvements.
Let me know if you need more information.
In terms of some more absolute allocation numbers:
**Before (TOP 3 => ~5.6 GB)**
bytes percent samples top
---------- ------- ------- ---
3581028736 26.79% 10682 byte[]
1291159216 9.66% 3930 java.lang.Object[]
740065552 5.54% 2161 java.lang.String
**After (TOP 3 => ~5.4 GB)**
bytes percent samples top
---------- ------- ------- ---
3376915824 26.33% 11615 byte[]
1284160152 10.01% 3847 java.lang.Object[]
730100712 5.69% 2023 java.lang.String
`StringBuilder` savings account for an additional 62MB
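
For reference, the rounded top-3 totals above can be reproduced from the listed byte counts. The following is just a small arithmetic sketch; `AllocTotals` and its members are hypothetical names, and the numbers are copied from the two listings.

```java
// Sums the top-3 allocation byte counts from the two profiles above.
final class AllocTotals {
    // byte[], java.lang.Object[], java.lang.String (before the patch)
    static final long[] BEFORE = {3581028736L, 1291159216L, 740065552L};
    // Same three types after the patch
    static final long[] AFTER = {3376915824L, 1284160152L, 730100712L};

    static long sum(long... values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        long before = sum(BEFORE);
        long after = sum(AFTER);
        // Decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes).
        System.out.printf("before: %d bytes (%.2f GB)%n", before, before / 1e9);
        System.out.printf("after:  %d bytes (%.2f GB)%n", after, after / 1e9);
        System.out.printf("saved:  %d bytes (%.0f MB)%n", before - after, (before - after) / 1e6);
    }
}
```

In decimal units this comes to roughly 221 MB less across the three types between the two runs.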
-------------
PR: https://git.openjdk.org/jdk/pull/12667