RFR: 8350488: [leyden] Experimental AOT-only mode
Aleksey Shipilev
shade at openjdk.org
Fri Feb 21 11:03:08 UTC 2025
On Fri, 21 Feb 2025 10:36:19 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> There are interesting use cases where we want the AOT-only mode. We can emulate this in the current Leyden prototype by relying on preload code and stopping any profiling, which would naturally lead to no JIT compilations. This would also make interpreter code a bit faster in case we need to fall back there. This mode also looks helpful for studying the compiler dynamics.
>
> Additional testing:
> - [x] Eyeballing compilation logs with `-XX:+PreloadOnly`
> - [ ] Linux x86_64 server fastdebug, `runtime/cds`
> - [ ] Linux x86_64 server fastdebug, `runtime/cds` with `-XX:+PreloadOnly`
Sample performance results, after pre-training `JavacBenchApp` for 200 iterations:

# === 32 cores

# Default
  Time (mean ± σ):     403.4 ms ±   6.5 ms    [User: 1082.1 ms, System: 167.6 ms]
  Range (min … max):   387.6 ms … 417.7 ms    30 runs

# -XX:+PreloadOnly
  Time (mean ± σ):     439.1 ms ±   5.3 ms    [User: 544.2 ms, System: 84.2 ms]
  Range (min … max):   430.3 ms … 456.2 ms    30 runs

# === 2 cores

# Default
  Time (mean ± σ):     531.8 ms ±  30.1 ms    [User: 870.9 ms, System: 114.3 ms]
  Range (min … max):   479.6 ms … 606.8 ms    30 runs

# -XX:+PreloadOnly
  Time (mean ± σ):     425.8 ms ±   6.4 ms    [User: 530.2 ms, System: 76.4 ms]
  Range (min … max):   418.9 ms … 451.4 ms    30 runs

In both cases, "user" time goes down because we have no additional code loading or JIT compilations. In the 32-core case, we can see that peak performance suffers a bit, since preload code is not 100% efficient. The combination of these two factors is a net benefit in the 2-core case: not doing JIT compilations more than pays for the preload code inefficiency. This tradeoff, of course, depends on how well-trained the scenario is.
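
To put rough numbers on that tradeoff, here is a small sketch that computes the relative change of `-XX:+PreloadOnly` against the default for each configuration. The mean timings are copied from the results above. The names `RESULTS`, `relative_change` and `summarize` are made up for this illustration and are not part of any Leyden tooling.

```python
# Mean timings in milliseconds, copied from the benchmark results above.
RESULTS = {
    "32 cores": {
        "default": {"wall": 403.4, "user": 1082.1, "system": 167.6},
        "preload_only": {"wall": 439.1, "user": 544.2, "system": 84.2},
    },
    "2 cores": {
        "default": {"wall": 531.8, "user": 870.9, "system": 114.3},
        "preload_only": {"wall": 425.8, "user": 530.2, "system": 76.4},
    },
}


def relative_change(baseline, candidate):
    """Percent change of candidate vs. baseline; negative means less time."""
    return (candidate - baseline) / baseline * 100.0


def summarize(results):
    """Per-configuration percent change of PreloadOnly vs. default."""
    summary = {}
    for config, modes in results.items():
        base = modes["default"]
        preload = modes["preload_only"]
        summary[config] = {
            metric: relative_change(base[metric], preload[metric])
            for metric in base
        }
    return summary


if __name__ == "__main__":
    for config, changes in summarize(RESULTS).items():
        line = ", ".join(f"{m}: {pct:+.1f}%" for m, pct in changes.items())
        print(f"{config}: {line}")
```

On these figures, wall-clock time gets worse by roughly 9% with 32 cores and improves by roughly 20% with 2 cores, while "user" time drops by about 40-50% in both. This compares means only and ignores the reported standard deviations.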
-------------
PR Comment: https://git.openjdk.org/leyden/pull/44#issuecomment-2674246858