RFR: 8342975: C2: Micro-optimize PhaseIdealLoop::Dominators() [v2]
Aleksey Shipilev
shade at openjdk.org
Fri Oct 25 05:52:39 UTC 2024
> Noticed this while looking at Leyden profiles. C2 seems to spend considerable time in this loop, and the disassembly shows the loop is fairly hot. Replacing the initialization with memset, while touching more memory, is apparently faster. memset is also what we normally do around C2 for arena-allocated data. We seem to touch a lot of these structs later on, so pulling them into cache with memset is likely "free".
>
> It also looks like the current initialization misses initializing the last element (at `C->unique()+1`).
>
> I'll put performance data in a separate comment.
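
For readers without the diff at hand, the difference between the two initialization styles can be sketched roughly as follows. This is an illustration only, not the code from the patch: the `Entry` struct, its fields, and both function names are hypothetical, and the sketch assumes the array holds `unique + 1` elements while the loop only covers the first `unique`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a per-node record; not the actual C2 struct.
struct Entry {
  void* control;
  int   semi;
  int   size;
};

// Loop-style initialization (assumed shape): clears selected fields one
// element at a time and stops before the final element of an array that
// holds `unique + 1` entries.
void init_with_loop(Entry* entries, std::size_t unique) {
  assert(entries != nullptr);
  for (std::size_t i = 0; i < unique; i++) {
    entries[i].control = nullptr;
    entries[i].semi = 0;
  }
}

// memset-style initialization: zeroes every byte of all `unique + 1`
// entries in one call, including fields the loop never wrote. This relies
// on null pointers being all-zero bits, which holds on the platforms
// HotSpot targets.
void init_with_memset(Entry* entries, std::size_t unique) {
  assert(entries != nullptr);
  std::memset(entries, 0, (unique + 1) * sizeof(Entry));
}
```

The memset variant writes more bytes than the loop, but it is a single bulk operation and it also covers the trailing element.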
Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision:
Better comment
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/21690/files
- new: https://git.openjdk.org/jdk/pull/21690/files/9a5953d4..981c9649
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=21690&range=01
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=21690&range=00-01
Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
Patch: https://git.openjdk.org/jdk/pull/21690.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/21690/head:pull/21690
PR: https://git.openjdk.org/jdk/pull/21690