RFR: 8356027: Print enhanced compilation timings

Aleksey Shipilev shade at openjdk.org
Thu May 1 16:59:57 UTC 2025


On Thu, 1 May 2025 12:35:38 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include:
>  1. Time spent before queuing: shows the compilation queue bottlenecks
>  2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load
>  3. Time spent actually compiling: shows the per-method compilation costs
> 
> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028).
> 
> The difference from the output format we ship in Leyden:
>  1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such.
>  2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations.
> 
> See the sample `-XX:+PrintCompilation` output in the comments.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `compiler`
>  - [ ]  Linux x86_64 server fastdebug, `all`

Sample log after:


$ build/linux-x86_64-server-release/images/jdk/bin/java -XX:+PrintCompilation Hello.java
...
    344      0                   1351       3       java.util.Arrays::hashCode (15 bytes)   started
    345      0                   1350       3       java.util.Objects::hash (5 bytes)   started
    345      0                   1352       3       jdk.internal.util.ArraysSupport::hashCode (42 bytes)   started
    345      0      37      79   1349       3       java.lang.Byte::<init> (10 bytes)
    345      0                   1353       3       java.lang.Byte::hashCode (8 bytes)   started
    345      0      77      92   1351       3       java.util.Arrays::hashCode (15 bytes)
    345      0                   1354       3       java.lang.Byte::hashCode (2 bytes)   started
    345      0     212      66   1353       3       java.lang.Byte::hashCode (8 bytes)
    345      0     276      38   1354       3       java.lang.Byte::hashCode (2 bytes)
    345      0     139     167   1352       3       jdk.internal.util.ArraysSupport::hashCode (42 bytes)
    345      0      79     354   1348       3       java.lang.invoke.MemberName::hashCode (43 bytes)
    345      0     123     107   1350       3       java.util.Objects::hash (5 bytes)
...


This shows, for example, that C1 (tier3) compilations of these small methods are really quick, and the queueing delays are comparable to the actual compilation costs.

Caught a failure in `compiler/inlining/LateInlinePrinting.java` -- we apparently rely in `-XX:+PrintInlining` to print the task before compilation, and then print inline tree after the compilation! With new `started` message there is a test error in matching. We also (reasonably) hold no `ttyLock` in between, so inline tree is not attributed well to the particular method if there are multiple compiler threads.

This is a blessing in disguise: with this change, we finally can start printing the inline tree after successful compiles under the `ttyLock` now. This also does `-XX:+PrintInlining` output only when `-XX:+PrintCompilation` is supplied (otherwise, where are you inlining into?), which I think is reasonable. Another test needs adjustments for this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2844782421
PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2845231746


More information about the hotspot-dev mailing list