Problems with LTO for HotSpot
Julian Waters
tanksherman27 at gmail.com
Wed Sep 24 17:08:34 UTC 2025
Hi all,
Recently I've resumed work on making LTO viable
for HotSpot. The aim is to have LTO as a working option available so
the benefits of enhanced optimization can be enjoyed by running Java
code, though making LTO the default is not really a goal, at least not
yet. The work has been going decently well so far, but along the way
I have run into two long-standing problems that I haven't been able
to solve. The first is related to
https://github.com/openjdk/jdk/pull/22864 switching
os::current_stack_pointer() to using runtime assembly, but we can
visit that later.

The bigger issue is related to gcc's flatten attribute. In short, some
G1 code (more specifically void
G1ParScanThreadState::trim_queue_to_threshold(uint threshold), void
G1ParScanThreadState::steal_and_trim_queue(G1ScannerTasksQueueSet*
task_queues) and oop
G1ParScanThreadState::copy_to_survivor_space(G1HeapRegionAttr
region_attr, oop old, markWord old_mark)) is marked as flatten, which
causes gcc to inline all calls inside those methods. Normally this
is fine, since the compilation unit boundary prevents inlining
across source files, but when LTO is active, the method bodies
from other compilation units become available, and gcc then goes on a
rampage, mass inlining everything it can find until there is nothing
left to inline. On top of causing the JVM to inflate to at least 60 MB
in the best case, it also causes build problems, notably JDK-8343698
and (suspected) JDK-8334616, and LTO builds in general are extremely
slow, likely due to this problem.

It would seem that we'd need to create NOINLINE
wrappers for methods which are called by the flattened code but are
not defined in the same source file (g1ParScanThreadState.cpp).
Problematically, however, the call hierarchy for these three methods
is downright *massive*, since these methods are absolute monsters.
This makes finding out which methods these three call very tedious and
error prone. After months and many different approaches, all of which
have failed, I'm still no closer to finding out which code needs
NOINLINE wrappers to prevent inlining across compilation units.

Might there be a better way to figure out which of the called methods
are defined outside the g1ParScanThreadState.cpp source file than my
current approach of manually looking through an IDE (since all
automated tools have failed in one way or another)? This is a very big
blocker for working LTO with HotSpot. Once I manage to get this and
the stack pointer issue solved, I believe I'll (hopefully) be able to
get working LTO into the JDK soon.
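
For illustration, the NOINLINE wrapper idea described above can be
sketched roughly as follows. This is only a minimal standalone sketch,
not HotSpot code: every function name in it is hypothetical, and the
NOINLINE and FLATTEN macros are defined locally as stand-ins for
whatever the real build uses. The assumption is that gcc does not
inline a noinline function even into a flattened caller, so the wrapper
should act as a barrier that stops flatten from pulling in the foreign
method body once LTO makes it visible.

```cpp
// Local stand-ins for the attribute macros (assumption: gcc or clang).
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#define FLATTEN  __attribute__((flatten))
#else
#define NOINLINE
#define FLATTEN
#endif

// Hypothetical: pretend this lives in a different source file, so that
// its body only becomes visible to the flattened caller under LTO.
int external_helper(int x) {
  return x * 2 + 1;
}

// Hypothetical helper from the same source file as the flattened code.
// Inlining it into the flattened method is what we actually want.
static inline int local_helper(int x) {
  return x + 3;
}

// NOINLINE wrapper: the flattened method calls this instead of calling
// external_helper() directly, so flatten should stop at this boundary.
NOINLINE static int external_helper_noinline(int x) {
  return external_helper(x);
}

// Stand-in for one of the flattened G1 methods.
FLATTEN int hot_path(int x) {
  return local_helper(x) + external_helper_noinline(x);
}
```

Calling hot_path() behaves the same with or without the wrapper; only
the inlining decision is meant to change. The cost is one wrapper per
cross-file callee, which is why knowing the full set of such callees
matters.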
best regards,
Julian