RFR: 8345265: Minor improvements for LTO across all compilers [v2]

Fri Feb 28 16:37:01 UTC 2025

On Tue, 18 Feb 2025 13:44:56 GMT, Matthias Baesken <mbaesken at openjdk.org> wrote:

> > @MBaesken Currently, with LTO active on gcc 14, commit [e648a90](https://github.com/openjdk/jdk/commit/e648a907b31fd0d6b746d149fda2a8d5fbe26dc0) is causing serious trouble on my end by mass inlining everything, bloating the JVM to nearly 60MB in size. Does HotSpot have the same size issues on your end with LTO? (--enable-jvm-feature-opt-size is off the table because the JVM should ideally be an acceptable size even without that flag, and the combination of -Os and LTO doesn't work with gcc anyway)
> 
> On my end we used gcc 11 in the past and now test gcc 13. Both work nicely; no libjvm.so bloat has been observed with LTO. Maybe there is some issue/difference with gcc 14, but so far we have not tested with this version.

Leaving Kim's comment about flattening in here, as I believe something has changed with the flatten attribute in gcc 14 that made it far more aggressive across compilation units, so this is probably relevant. A simple comparison of symbol sizes with nm strongly supports this theory.

> G1ParScanThreadState uses ATTRIBUTE_FLATTEN to tune the inlining of code in that class, in an attempt to ensure the desired fast paths are inlined, despite the size and other attributes of some of these functions that might (and empirically did) inhibit inlining in some critical places with at least some compilers. It also uses NOINLINE and assumed implicit non-inlining of definitions in other translation units to keep slower paths out of line, to limit code size. It looks like there are a couple of things going on here.
> 
> One is that slow paths internal to the Stack implementation (and perhaps other places) are being flattened because there isn't any NOINLINE anywhere to prevent it and it's all template code, so the source is not in another translation unit. So we're probably generating more inline code than we really want. That seems hard to avoid though.
> 
> The other is that LTO seems to be applying flattening even across translation units. (That's not completely surprising.) So the assumption that the flattening won't apply to code in other TUs is invalidated by LTO. That's going to make flattening a lot harder to use.
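
To make the two effects described above concrete, here is a minimal sketch. It is not HotSpot code: `SKETCH_FLATTEN`, `SKETCH_NOINLINE`, `SketchStack`, and the `sketch_*` functions are hypothetical names invented for illustration, and the macros are only assumed to behave like HotSpot's `ATTRIBUTE_FLATTEN` and `NOINLINE` on gcc and clang. Everything lives in one file so that it compiles standalone; the comments mark which function is imagined to be defined in a different translation unit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for ATTRIBUTE_FLATTEN / NOINLINE (assumption:
// gcc/clang attribute syntax; they expand to nothing elsewhere).
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_FLATTEN  __attribute__((flatten))
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_FLATTEN
#define SKETCH_NOINLINE
#endif

// Hypothetical template container with an unmarked slow path.
template <typename T>
class SketchStack {
  std::vector<T> _data;
  std::size_t _capacity;

  // Slow path without NOINLINE: as template code its body is visible in
  // every translation unit that uses it, so a flattened caller may inline it.
  void grow_unmarked() {
    _capacity *= 2;
    _data.reserve(_capacity);
  }

public:
  explicit SketchStack(std::size_t capacity)
      : _capacity(capacity ? capacity : 1) {
    _data.reserve(_capacity);
  }

  void push(const T& value) {
    if (_data.size() == _capacity) grow_unmarked();
    _data.push_back(value);
  }

  std::size_t size() const { return _data.size(); }
  std::size_t capacity() const { return _capacity; }
};

// Slow path explicitly marked noinline: flatten is documented not to
// inline calls to functions carrying this attribute.
SKETCH_NOINLINE int sketch_slow_path(int value) { return value * 2; }

// Imagine this body living in another .cpp file. Without LTO the caller
// only sees a declaration, so the call stays out of line "for free".
// With LTO the body becomes visible and flatten may pull it in as well.
int sketch_other_tu_path(int value) { return value + 1; }

// Flattened fast path: the compiler is asked to inline every call in
// this function body where it can.
SKETCH_FLATTEN int sketch_scan(SketchStack<int>& stack, int value) {
  stack.push(value);
  if (value < 0) return sketch_slow_path(value);
  return sketch_other_tu_path(value);
}
```

The sketch behaves the same whichever inlining decisions the compiler makes; only code size differs. One way to observe the difference, assuming GNU binutils is available, is to compare the size of the flattened function's symbol in builds with and without LTO, for example with `nm --print-size --size-sort --demangle`.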

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22464#issuecomment-2691081602