RFR: 8360175: C2 crash: assert(edge_from_to(prior_use, n)) failed: before block local scheduling

Christian Hagedorn chagedorn at openjdk.org
Tue Jul 8 08:05:42 UTC 2025


On Mon, 7 Jul 2025 09:47:09 GMT, Manuel Hässig <mhaessig at openjdk.org> wrote:

> The triggered assert is part of the schedule verification code that runs just before machine code is emitted. The debug output showed that a `leaPCompressedOopOffset` node was causing the assert, which suggested the peephole optimization introduced in #25471 as the cause. The failure proved quite difficult to reproduce. It failed more often on Windows and required `-XX:+UseKNLSetting` (forces code generation for Intel's Knights Landing platform), which forces `-XX:+OptoScheduling`.
> 
> The root cause is a subtle bug in the rewiring of the base edge of `leaP*` nodes in the `remove_redundant_lea` peephole. When the peephole removed a `decodeHeapOop_not_null` including a spill, it did not set the base edge of the `leaP*` node to the same node as the address edge, which is the intent of the peephole, but to the parent node of the spill. That is not catastrophic in most cases, but the base edge might then reference a different register slot than the address edge, which causes this assert. Concretely, the following graph
> 
>      MemToRegSpillCopy
>       |             |
>       |    MemToRegSpillCopy
>       |             |
> DefinitionSpillCopy |
>       |             |
>       |  decodeHeapOop_not_null
>       |             |
>     leaPCompressedHeapOop
> 
> gets rewired to
> 
>      MemToRegSpillCopy
>        |            |    
> DefinitionSpillCopy |
>        |            |
>    leaPCompressedHeapOop
> 
> instead of
> 
>   MemToRegSpillCopy
>          |
>  DefinitionSpillCopy
>         / \     
> leaPCompressedHeapOop
> 
> 
> This PR fixes this by always setting the base edge of the `leaP*` node to the same node as the address edge. Unfortunately, I was not able to construct a regression test because of the difficulty of reproducing the bug.
> 
> # Testing
> 
> - [ ] Github Actions
> - [x] tier1,tier2 plus internal testing on all Oracle supported platforms
> - [x] tier3,tier4,tier5 plus internal testing on Linux and Windows x64
> - [ ] Runthese8H on `windows-x64-debug` (test that reliably produced the failure addressed in this PR)

Marked as reviewed by chagedorn (Reviewer).

src/hotspot/cpu/x86/peephole_x86_64.cpp line 349:

> 347:     Node* dependant_lea = decode->fast_out(i);
> 348:     if (dependant_lea->is_Mach() && dependant_lea->as_Mach()->ideal_Opcode() == Op_AddP) {
> 349:       dependant_lea->set_req(AddPNode::Base, lea_derived_oop->in(AddPNode::Address));

The fix looks reasonable to me, too. No worries about the regression test, thanks for trying! A small question: Why don't we use `lea_address`?

Another thing I noticed while browsing the code: `ra_` and `new_root` seem to be unused and could be removed. That cleanup could probably be squeezed into this PR rather than filing a separate issue for it.
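
For readers following along, the rewiring described in the quoted PR text can be sketched with a toy graph. This is only an illustration under stated assumptions, not the HotSpot code: `ToyNode`, the edge indices `kInput`, `kBase`, `kAddress`, and all three functions are hypothetical names, and the assignment of the `DefinitionSpillCopy` edge to the address slot and the `decodeHeapOop_not_null` edge to the base slot is my reading of the diagrams above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a C2 node. This is NOT HotSpot's Node class.
struct ToyNode {
  std::string name;
  std::vector<ToyNode*> in;  // input edges, indexed by slot
  ToyNode(std::string n, std::size_t num_inputs)
      : name(std::move(n)), in(num_inputs, nullptr) {}
};

// Hypothetical edge indices for the toy graph (slot 0 is left unused).
constexpr std::size_t kInput = 1;    // single data input of a copy/decode
constexpr std::size_t kBase = 1;     // base edge of the lea
constexpr std::size_t kAddress = 2;  // address edge of the lea

// Behaviour described as buggy: after dropping the decode and its spill,
// the base edge is pointed at the parent of that spill.
void rewire_base_to_spill_parent(ToyNode& lea, const ToyNode& decode) {
  ToyNode* spill = decode.in[kInput];
  lea.in[kBase] = spill->in[kInput];
}

// Behaviour described as the fix: base edge always equals the address edge.
void rewire_base_to_address(ToyNode& lea) {
  lea.in[kBase] = lea.in[kAddress];
}

// Builds the graph from the first diagram, applies one of the two rewirings,
// and reports whether base and address end up on the same node.
bool base_matches_address_after_rewire(bool use_fix) {
  ToyNode mem_to_reg("MemToRegSpillCopy", 2);
  ToyNode spill("MemToRegSpillCopy", 2);
  ToyNode def_spill("DefinitionSpillCopy", 2);
  ToyNode decode("decodeHeapOop_not_null", 2);
  ToyNode lea("leaPCompressedHeapOop", 3);

  spill.in[kInput] = &mem_to_reg;
  def_spill.in[kInput] = &mem_to_reg;
  decode.in[kInput] = &spill;
  lea.in[kBase] = &decode;        // assumed: decode feeds the base edge
  lea.in[kAddress] = &def_spill;  // assumed: spill copy feeds the address edge

  if (use_fix) {
    rewire_base_to_address(lea);
  } else {
    rewire_base_to_spill_parent(lea, decode);
  }
  return lea.in[kBase] == lea.in[kAddress];
}
```

Calling `base_matches_address_after_rewire(false)` leaves the base on the top `MemToRegSpillCopy` while the address stays on `DefinitionSpillCopy`, which corresponds to the second diagram; with `true`, both edges end on `DefinitionSpillCopy`, as in the third.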

-------------

PR Review: https://git.openjdk.org/jdk/pull/26157#pullrequestreview-2996452308
PR Review Comment: https://git.openjdk.org/jdk/pull/26157#discussion_r2191778511

