RFR: 8356027: Print enhanced compilation timings
Aleksey Shipilev
shade at openjdk.org
Thu May 1 16:59:57 UTC 2025
On Thu, 1 May 2025 12:35:38 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> In Leyden, we have the extended compilation timings printouts with -XX:+PrintCompilation / UL, that are very useful to study compiler dynamics. These timings include:
> 1. Time spent before queuing: shows the compilation queue bottlenecks
> 2. Time spent in the queue: shows delays caused by queue bottlenecks and compiler load
> 3. Time spent actually compiling: shows the per-method compilation costs
>
> We should consider the same kind of printout for mainline. This would also require us to print the compilation task _after_ the compilation, not only before it. This improvement would also obviate any need for `PrintCompilation2` flag, [JDK-8356028](https://bugs.openjdk.org/browse/JDK-8356028).
>
> The difference from the output format we ship in Leyden:
> 1. This output prints before/after the compilation to maintain old behavior partially. The "before" printout is now prepended with `started` to clearly mark it as such.
> 2. The output is raw number in microseconds. In Leyden repo, we have these prepended with characters, like `C0.1`, but that prepending makes it a bit inconvenient with scripts. This PR also does microseconds, instead of fractional milliseconds. This should be enough to capture the wide range of durations.
>
> See the sample `-XX:+PrintCompilation` output in the comments.
>
> Additional testing:
> - [x] Linux x86_64 server fastdebug, `compiler`
> - [ ] Linux x86_64 server fastdebug, `all`
Sample log after:
$ build/linux-x86_64-server-release/images/jdk/bin/java -XX:+PrintCompilation Hello.java
...
344 0 1351 3 java.util.Arrays::hashCode (15 bytes) started
345 0 1350 3 java.util.Objects::hash (5 bytes) started
345 0 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes) started
345 0 37 79 1349 3 java.lang.Byte::<init> (10 bytes)
345 0 1353 3 java.lang.Byte::hashCode (8 bytes) started
345 0 77 92 1351 3 java.util.Arrays::hashCode (15 bytes)
345 0 1354 3 java.lang.Byte::hashCode (2 bytes) started
345 0 212 66 1353 3 java.lang.Byte::hashCode (8 bytes)
345 0 276 38 1354 3 java.lang.Byte::hashCode (2 bytes)
345 0 139 167 1352 3 jdk.internal.util.ArraysSupport::hashCode (42 bytes)
345 0 79 354 1348 3 java.lang.invoke.MemberName::hashCode (43 bytes)
345 0 123 107 1350 3 java.util.Objects::hash (5 bytes)
...
This shows, for example, that C1 (tier3) compilations of these small methods are really quick, and the queueing delays are comparable to the actual compilation costs.
Caught a failure in `compiler/inlining/LateInlinePrinting.java` -- we apparently rely in `-XX:+PrintInlining` to print the task before compilation, and then print inline tree after the compilation! With new `started` message there is a test error in matching. We also (reasonably) hold no `ttyLock` in between, so inline tree is not attributed well to the particular method if there are multiple compiler threads.
This is a blessing in disguise: with this change, we finally can start printing the inline tree after successful compiles under the `ttyLock` now. This also does `-XX:+PrintInlining` output only when `-XX:+PrintCompilation` is supplied (otherwise, where are you inlining into?), which I think is reasonable. Another test needs adjustments for this.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2844782421
PR Comment: https://git.openjdk.org/jdk/pull/24984#issuecomment-2845231746
More information about the hotspot-dev
mailing list