RFR: 8360557: CTW: Inline cold methods to reach more code

Tue Jul 1 12:46:38 UTC 2025

On Tue, 1 Jul 2025 12:26:44 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> We use CTW testing to make sure compilers behave well. But we compile code that is never executed, and since our inlining heuristics often look back at profiles, we end up inlining very little! This means CTW testing likely misses lots of bugs that normal code is exposed to, especially in e.g. loop optimizations.
> 
> There is an intrinsic tradeoff with accepting more inlined methods in CTW: the compilation time gets significantly worse. By accepting just the cold methods we keep CTW times reasonable, eating the improvements we have committed in mainline recently. And it still finds bugs. See the RFE for sample data.
> 
> After this lands and CTW starts to compile cold methods, one can greatly expand the scope of the CTW testing by overriding the static inlining limits. Doing e.g. `TEST_VM_OPTS="-XX:MaxInlineSize=70 -XX:C1MaxInlineSize=70"` finds even more bugs. Unfortunately, the compilation times suffer so much that these settings are impractical to run in standard configurations, see data in the RFE. We will enable some of that testing in special testing pipelines.
> 
> Pre-empting the question: "Well, why not use -Xcomp then, and make sure it inlines well?" The answer is in the RFE as well: Xcomp causes _a lot_ of stray compilations for JDK and CTW infra itself. For small JARs in a large corpus this eats precious testing time that we would instead like to spend on deeper inlining in the actual JAR code. The approach in this PR also does not force us to look into how CTW works in Xcomp at all; I expect some surprises there. Feather-touching the inlining heuristic paths to just accept methods without looking at profiles looks better.
> 
> Tobias had an idea to implement randomized stress inlining that would expand the scope of inlining. This improvement stacks well with it: it provides the base case of inlining most reasonable methods, and then allows the stress infra to inline some more on top of that.
> 
> Additional testing:
>  - [ ] GHA
>  - [x] Linux x86_64 server fastdebug, `applications/ctw/modules`
>  - [x] Linux x86_64 server fastdebug, large CTW corpus (now failing in interesting ways)

We are on par for CTW testing time (wall clock), compared to the state a week back:

# Before CTW perf improvements
real	5m0.528s
user	79m5.193s
sys	14m16.678s

# Current mainline
real	3m59.274s
user	68m9.663s
sys	5m19.026s

# This PR
real	4m56.248s
user	89m48.364s
sys	5m24.091s
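
To compare the three runs above more directly, here is a minimal sketch that converts the `time` output into seconds and prints ratios between runs. The helper names `to_seconds` and `ratio` and the `RUNS` table are made up for this illustration and are not part of the CTW infrastructure; the values are copied from the measurements above.

```python
import re

def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '4m56.248s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Measurements copied verbatim from the three runs listed above.
RUNS = {
    "before":   {"real": "5m0.528s",  "user": "79m5.193s",  "sys": "14m16.678s"},
    "mainline": {"real": "3m59.274s", "user": "68m9.663s",  "sys": "5m19.026s"},
    "pr":       {"real": "4m56.248s", "user": "89m48.364s", "sys": "5m24.091s"},
}

def ratio(run: str, baseline: str, kind: str) -> float:
    """Return how many times longer `run` took than `baseline` for one time kind."""
    return to_seconds(RUNS[run][kind]) / to_seconds(RUNS[baseline][kind])

if __name__ == "__main__":
    for kind in ("real", "user", "sys"):
        print(
            f"{kind}: PR vs before = {ratio('pr', 'before', kind):.2f}x, "
            f"PR vs mainline = {ratio('pr', 'mainline', kind):.2f}x"
        )
```

On these figures, the wall-clock time of this PR is slightly below the run from before the CTW perf improvements, while the user time is higher than in both other runs.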

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26068#issuecomment-3023863192