RFD: The Cost of Profiling in the HotSpot Virtual Machine

Andrew Haley aph-open at littlepinkcloud.com
Mon Feb 3 11:18:56 UTC 2025


This paper is available at
https://ckirsch.github.io/publications/proceedings/MPLR24.pdf#page=117

One thing that really stands out is the slowdown caused by multiple
threads racing to increment profile counters. While this may seem like
a theoretical concern, we have seen it in customers' real-world
situations. When an application spins up worker threads which all
start at the same time, the resulting memory traffic can substantially
delay application startup.

It would not be very difficult to fix this problem by using a very
simple implementation of distributed counters, but doing so would
generate (even) more code and would be slower in the single-threaded
case. I have created https://bugs.openjdk.org/browse/JDK-8348027 to
track this possibility.
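
To make "distributed counters" concrete, here is a minimal sketch of
the general idea, written as ordinary Java rather than as code the
interpreter or JIT would generate. It is not necessarily the design
tracked in JDK-8348027. The class name, the stripe count, and the
padding are all assumptions made for illustration, and the sketch uses
atomic increments so that its result is exact, which real profile
counters may not need.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration: one logical counter split into several
// stripes so that concurrent threads mostly update different cache lines.
final class StripedCounter {
    // Assumed values, not taken from the paper or from HotSpot.
    private static final int STRIPES = 16; // must be a power of two
    private static final int PAD = 8;      // 8 longs = 64 bytes between stripes

    private static final AtomicInteger NEXT_SLOT = new AtomicInteger();
    // Each thread is given a slot number once, on first use.
    private static final ThreadLocal<Integer> SLOT =
        ThreadLocal.withInitial(() -> NEXT_SLOT.getAndIncrement() & (STRIPES - 1));

    private final AtomicLongArray cells = new AtomicLongArray(STRIPES * PAD);

    // Increment only the calling thread's stripe.
    void increment() {
        cells.getAndIncrement(SLOT.get() * PAD);
    }

    // Reading is the slow path: add up every stripe.
    long sum() {
        long total = 0;
        for (int i = 0; i < STRIPES; i++) {
            total += cells.get(i * PAD);
        }
        return total;
    }

    // Small driver: several threads start together and hammer one counter.
    static long countWithThreads(int threads, int perThread) {
        StripedCounter counter = new StripedCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }
}
```

The sketch also shows the costs mentioned above: every increment now
needs an extra lookup to find the thread's stripe, the counter occupies
many cache lines instead of one word, and a reader has to sum all of
the stripes. java.util.concurrent.atomic.LongAdder in the JDK class
library is built on a similar striping idea.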

But is it worth doing anything about this at all? You could argue that
any application that starts too many threads too soon is simply
misconfigured, but it's hard for Java users to diagnose what's
happening. Maybe Project Leyden will solve the problem in a better
way by removing the emphasis on warmup.

What do you think?

Thanks,

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


