RFR: 8360557: CTW: Inline cold methods to reach more code
Aleksey Shipilev
shade at openjdk.org
Tue Jul 1 12:46:38 UTC 2025
On Tue, 1 Jul 2025 12:26:44 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> We use CTW testing to make sure the compilers behave well. But CTW compiles code that is never executed, and since our inlining heuristics often look back at profiles, we end up not inlining much at all! This means CTW testing likely misses lots of bugs that normal code is exposed to, especially in loop optimizations.
>
> There is an intrinsic tradeoff with accepting more inlined methods in CTW: the compilation time gets significantly worse. By accepting only the cold methods we keep CTW times reasonable, at the cost of eating the improvements we recently committed to mainline. And it still finds bugs. See the RFE for sample data.
>
> After this lands and CTW starts to compile cold methods, one can greatly expand the scope of the CTW testing by overriding the static inlining limits. Doing e.g. `TEST_VM_OPTS="-XX:MaxInlineSize=70 -XX:C1MaxInlineSize=70"` finds even more bugs. Unfortunately, the compilation times suffer so much that such runs are impractical in standard configurations; see the data in the RFE. We will enable some of that testing in special testing pipelines.
>
> Pre-empting the question: "Well, why not use -Xcomp then, and make sure it inlines well?" The answer is in the RFE as well: Xcomp causes _a lot_ of stray compilations for the JDK and the CTW infra itself. For small JARs in a large corpus this eats precious testing time that we would instead like to spend on deeper inlining in the actual JAR code. This approach also does not force us to look into how CTW works in Xcomp at all; I expect some surprises there. Feather-touching the inlining heuristic paths to just accept methods without looking at profiles looks better.
>
> Tobias had an idea to implement stress randomized inlining that would expand the scope of inlining. This improvement stacks well with it: it provides the base case of inlining most reasonable methods, and then allows the stress infra to inline some more on top of that.
>
> Additional testing:
> - [ ] GHA
> - [x] Linux x86_64 server fastdebug, `applications/ctw/modules`
> - [x] Linux x86_64 server fastdebug, large CTW corpus (now failing in interesting ways)
We are on par for CTW testing time, compared to the state a week back:
# Before CTW perf improvements
real 5m0.528s
user 79m5.193s
sys 14m16.678s
# Current mainline
real 3m59.274s
user 68m9.663s
sys 5m19.026s
# This PR
real 4m56.248s
user 89m48.364s
sys 5m24.091s
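
To make the comparison above easier to read, here is a minimal sketch that converts the three `time` reports into seconds and prints the ratios between runs. The helper names (`to_seconds`, `ratio`) and the run labels are made up for this illustration; only the timing figures come from the runs quoted above.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '3m59.274s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied verbatim from the three runs quoted above.
# The dictionary keys are illustrative labels, not names used by CTW.
runs = {
    "before": {"real": "5m0.528s", "user": "79m5.193s", "sys": "14m16.678s"},
    "mainline": {"real": "3m59.274s", "user": "68m9.663s", "sys": "5m19.026s"},
    "this_pr": {"real": "4m56.248s", "user": "89m48.364s", "sys": "5m24.091s"},
}


def ratio(run: str, baseline: str, kind: str) -> float:
    """Return time of `run` divided by time of `baseline` for one time kind."""
    return to_seconds(runs[run][kind]) / to_seconds(runs[baseline][kind])


for kind in ("real", "user", "sys"):
    print(
        f"{kind:>4}: this PR is "
        f"{ratio('this_pr', 'before', kind):.2f}x the pre-improvement run, "
        f"{ratio('this_pr', 'mainline', kind):.2f}x current mainline"
    )
```

Read this way, the wall-clock time with this PR is about 1% below the run from before the CTW performance improvements and about 24% above current mainline, while user time is about 32% above current mainline.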
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26068#issuecomment-3023863192