RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v5]
Manuel Hässig
mhaessig at openjdk.org
Tue Jun 17 09:37:56 UTC 2025
> ## Summary
>
> On x86, chained dereferences of narrow oops at a constant offset from the base oop can use a `lea` instruction to perform the address computation in one go using the `leaP8Narrow`, `leaP32Narrow`, and `leaPCompressedOopOffset` matching rules. However, the generated code contains an additional `lea` with an unused result:
>
> ; OptoAssembly
> 03d decode_heap_oop_not_null R8,R10
> 041 leaq R10, [R12 + R10 << 3 + #12] (compressed oop addressing) ; ptr compressedoopoff32
>
> ; x86
> 0x00007f1f210625bd: lea (%r12,%r10,8),%r8 ; result is unused
> 0x00007f1f210625c1: lea 0xc(%r12,%r10,8),%r10 ; the same computation as decode, but with offset
>
>
> This PR adds a peephole optimization to remove such redundant `lea`s.
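The redundancy in the listing above can be checked with plain arithmetic. The following is a minimal sketch, not HotSpot code: the function names `decode_not_null` and `lea_with_offset` are made up for illustration, the shift of 3 and the displacement of 12 are taken from the OptoAssembly line, and the heap base value (standing in for `R12`) is arbitrary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of decode_heap_oop_not_null: heap base + (narrow oop << 3).
constexpr std::uint64_t decode_not_null(std::uint64_t heap_base, std::uint32_t narrow) {
    return heap_base + (static_cast<std::uint64_t>(narrow) << 3);
}

// Hypothetical model of the leaP* form: the same computation plus a displacement.
constexpr std::uint64_t lea_with_offset(std::uint64_t heap_base, std::uint32_t narrow,
                                        std::int32_t disp) {
    return heap_base + (static_cast<std::uint64_t>(narrow) << 3) + disp;
}

// Arbitrary example values; only the relation between the two results matters.
constexpr std::uint64_t kHeapBase = 0x0000000800000000ULL;
constexpr std::uint32_t kNarrow = 0x1234;

// The second lea already contains the whole decode, so the first lea's result
// is not needed to compute the field address.
static_assert(lea_with_offset(kHeapBase, kNarrow, 12) ==
                  decode_not_null(kHeapBase, kNarrow) + 12,
              "lea with offset subsumes the decode");
```

Since the second `lea` recomputes the decode from the narrow oop, nothing in this sequence reads `%r8`, which is what makes the first `lea` removable here.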
>
> ## The Issue in Detail
>
> The ideal subgraph producing redundant `lea`s, or rather redundant `decodeHeapOop_not_null`s, is `LoadN -> DecodeN -> AddP`, where both the address and base edge of the `AddP` originate from the `DecodeN`. After matching, this becomes
>
> LoadN -> decodeHeapOop_not_null -> leaP*
> ______________________________^
>
> where `leaP*` is one of `leaP8Narrow`, `leaP32Narrow`, or `leaPCompressedOopOffset` (depending on the heap location and size). Here, the base input of `leaP*` comes from the decode. Looking at the matching code path, we find that the `leaP*` rules match both the `AddP` and the `DecodeN`, since x86 can fold this, but the following code adds the decode back as the base input to `leaP*`:
>
> https://github.com/openjdk/jdk/blob/c29537740efb04e061732a700582d43b1956cff4/src/hotspot/share/opto/matcher.cpp#L1894-L1897
>
> On its face, this is completely unnecessary if we matched a `leaP*`, since it already computes the result of the decode, so adding the `LoadN` node as base seems like the logical choice. However, if the derived oop computed by the `leaP*` gets added to an oop map, this `DecodeN` is needed as the base for the derived oop, because as of now, derived oops in oop maps cannot have narrow base pointers.
>
> This leaves us with a handful of possible solutions:
> 1. implement narrow bases for derived oops in oop maps,
> 2. perform some dead code elimination after we know which oops are part of oop maps,
> 3. add a peephole optimization to simply remove unused `lea`s.
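Option 3 can be pictured with a toy model. This is a sketch under stated assumptions, not the peephole rule from the PR: the `Instr` struct, the `"decode"` opcode string, and both function names are hypothetical, the code is treated as one straight-line block with nothing live afterwards, and oop maps are not modeled (a decode that an oop map needs as a base would have to count as used).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: writes register `dst`, reads the registers in `srcs`.
struct Instr {
    std::string op;
    int dst;
    std::vector<int> srcs;
};

// True if the value written by code[i] is read before it is overwritten.
// Assumption: straight-line code, no register is live after the last instruction.
inline bool result_is_used(const std::vector<Instr>& code, std::size_t i) {
    assert(i < code.size());
    const int reg = code[i].dst;
    for (std::size_t j = i + 1; j < code.size(); ++j) {
        // Check reads first: an instruction may read and overwrite the same register.
        for (int src : code[j].srcs) {
            if (src == reg) return true;
        }
        if (code[j].dst == reg) return false;  // overwritten before any read
    }
    return false;
}

// Drop every "decode" whose result is never read; keep everything else in order.
inline std::vector<Instr> remove_unused_decodes(const std::vector<Instr>& code) {
    std::vector<Instr> out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i].op == "decode" && !result_is_used(code, i)) continue;
        out.push_back(code[i]);
    }
    return out;
}
```

Feeding it the two-instruction sequence from the summary (a decode into register 8 followed by a `lea` that reads registers 12 and 10 and writes register 10) leaves only the `lea`, because nothing reads register 8.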
>
> Option 1 would have been ideal in the sense that it is the earliest possible point to remove the decode, which would simplify the graph and reduce pressure on the register allocator. However, rewriting the oop map machinery to remove a...
Manuel Hässig has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 29 additional commits since the last revision:
- Merge branch 'master' into JDK-8020282-lea
- Merge branch 'JDK-8020282-lea' of github.com:mhaessig/jdk into JDK-8020282-lea
- Factor out address nodes for simplification
- Add assert to codepath only reachable with stressing.
- Rename for clarity
Confused myself....
- Revert change to unrelated lines
This reverts commit d1c6a653770bfe578b1982ac726b258fa08d57b8.
- Apply typo suggestions
Co-authored-by: Roberto Castañeda Lozano <robcasloz at users.noreply.github.com>
- Add comment to benchmark as to why we fix the heap size
- Add missing null chec
- Fix typos
- ... and 19 more: https://git.openjdk.org/jdk/compare/faaac6a9...0976824c
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/25471/files
- new: https://git.openjdk.org/jdk/pull/25471/files/3d6f8972..0976824c
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=25471&range=04
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=25471&range=03-04
Stats: 92117 lines in 1600 files changed: 57473 ins; 22535 del; 12109 mod
Patch: https://git.openjdk.org/jdk/pull/25471.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25471/head:pull/25471
PR: https://git.openjdk.org/jdk/pull/25471