RFR: 8342662: C2: Add new phase for backend-specific lowering [v2]

Mon Oct 28 03:58:16 UTC 2024

On Fri, 25 Oct 2024 23:30:52 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Thanks for looking at the build changes, @magicus! I've pushed a commit that removes the extra newline in the makefiles and adds newlines to the ends of files that were missing them.
>> 
>> Thanks for taking a look as well, @merykitty and @jatin-bhateja! I've pushed a commit that should address the code suggestions and left some comments as well.
>
> @jaskarth thanks for exploring platform-specific lowering!
> 
> I briefly looked through the changes, but I didn't get a good understanding of its design goals. It's hard to see what use cases it is targeted for when only skeleton code is present. It would really help if there are some cases ported on top of it for illustration purposes and to solidify the design.
> 
> Currently, there are multiple places in the code where IR lowering happens. In particular:
>   * IGVN during post loop opts phase (guarded by `Compile::post_loop_opts_phase()`) (Ideal -> Ideal); 
>   * macro expansion (Ideal -> Ideal);
>   * ad-hoc passes (GC barrier expansion, `MacroLogicV`) (Ideal -> Ideal);
>   * final graph reshaping (Ideal -> Ideal);
>   * matcher (Ideal -> Mach).
> 
> I'd like to understand how the new pass is intended to interact with existing cases.
> 
> Only the last one is truly platform-specific, but there are some platform-specific cases exposed in other places (e.g., `MacroLogicV` pass, DivMod combining) guarded by some predicates on `Matcher`.
> 
> As the `PhaseLowering` is implemented now, it looks like a platform-specific macro expansion pass (1->many rewriting). But `MacroLogicV` case doesn't fit such model well.
> 
> I see changes to enable platform-specific node classes. As of now, only Matcher introduces platform-specific nodes and all of them are Mach nodes. Platform-specific Ideal nodes are declared in shared code, but then their usages are guarded by `Matcher::has_match_rule()` thus ensuring there's enough support on back-end side.
> 
> Some random observations:
> * the pass is performed unconditionally and it iterates over all live nodes; in contrast, macro nodes and nodes for post-loop opts IGVN are explicitly listed on the side (MacroLogicV pass is also guarded, but by a coarse-grained check);

Thanks a lot for your analysis of the patch, @iwanowww! I hope to answer some of your questions here.

> It's hard to see what use cases it is targeted for when only skeleton code is present. It would really help if there are some cases ported on top of it for illustration purposes and to solidify the design.

I think this is a very fair point. I was testing some cases before I made the PR, but I wanted to submit just the system in isolation to make it easier to review. I can make some example use cases separately to show what could be possible with the new system.

> I'd like to understand how the new pass is intended to interact with existing cases.

The overarching goal is to support new kinds of transforms on ideal nodes that are only relevant to a single hardware platform, which would otherwise be too narrow in scope to put in shared code but would be difficult to do purely in AD code. It can also be helpful to have GVN available while transforming the IR into a more backend-specific form. @merykitty added some nice examples above that illustrate possible use cases.

> As the `PhaseLowering` is implemented now, it looks like a platform-specific macro expansion pass (1->many rewriting). But MacroLogicV case doesn't fit such model well.

The lowering implementation works similarly to how an `Ideal()` call works, so it's possible to do many->1 (like `MacroLogicV`) and many->many transformations as well.
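
To make the shape of that concrete, here is a toy, self-contained sketch. It is not code from the patch and does not use HotSpot's real node classes: `Node`, `Graph`, `lower_node`, and the `Logic3` op name are all hypothetical stand-ins. It only illustrates the `Ideal()`-like contract (return a replacement node, or `nullptr` when nothing applies) applied to a many->1 rewrite.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy IR node (hypothetical; not HotSpot's Node class).
struct Node {
    std::string op;          // operation name, e.g. "And", "Or"
    std::vector<Node*> in;   // input edges
};

// Owns every node so raw pointers stay valid for the graph's lifetime.
struct Graph {
    std::vector<std::unique_ptr<Node>> nodes;

    Node* make(std::string op, std::vector<Node*> in = {}) {
        nodes.push_back(std::make_unique<Node>(Node{std::move(op), std::move(in)}));
        return nodes.back().get();
    }
};

// Hypothetical backend hook with an Ideal()-like contract:
// returns a replacement node, or nullptr if the node is left alone.
// Many->1 example: Or(And(a, b), c) becomes one three-input "Logic3" node.
Node* lower_node(Graph& g, Node* n) {
    if (n->op == "Or" && n->in.size() == 2) {
        Node* lhs = n->in[0];
        if (lhs->op == "And" && lhs->in.size() == 2) {
            return g.make("Logic3", {lhs->in[0], lhs->in[1], n->in[1]});
        }
    }
    return nullptr;  // no lowering applies
}
```

Because the hook inspects the inputs of the node it is handed, the same contract covers 1->many, many->1, and many->many rewrites; the caller only needs to substitute the returned node for the original.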

> I see changes to enable platform-specific node classes. As of now, only Matcher introduces platform-specific nodes and all of them are Mach nodes.

I was thinking that if we're introducing nodes that only have functionality on specific platforms, it might be nice to make those nodes exist only on those platforms as well, to reduce the size of shared code on platforms where the nodes aren't relevant. Since the lowering phase introduces new nodes that are specially known to the backend, they should be supported by the backend too. However, this isn't a necessary component of the lowering phase, just something that I thought could help with the implementation of lowered nodes.

> the pass is performed unconditionally and it iterates over all live nodes; in contrast, macro nodes and nodes for post-loop opts IGVN are explicitly listed on the side (MacroLogicV pass is also guarded, but by a coarse-grained check);

This is true. My thought was that since `MacroLogicV` currently also iterates across all live nodes, doing it here as well would be alright. I think collecting lowering-specific nodes on the side would be difficult, since the nodes that actually get lowered could change between backends. I did some testing on compile time with `-XX:+CITime`, and it seems like the impact is negligible (at least with the skeleton code).
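
As a rough illustration of why a shared side list is awkward, consider the toy sketch below. It is a simplified model and not the patch's code: `LiveNode`, `BackendPredicate`, and `collect_candidates` are hypothetical names, and the op strings are only examples. The set of interesting nodes is decided by a per-backend predicate, so shared code cannot know at node-creation time which nodes to record; scanning the live nodes and asking the backend is the simple alternative.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a live IR node (hypothetical; not HotSpot's Node).
struct LiveNode {
    std::string op;
};

// Hypothetical per-backend question: "do you want to lower this node?"
using BackendPredicate = std::function<bool(const LiveNode&)>;

// Full scan over the live nodes; the backend picks what it cares about.
std::vector<const LiveNode*> collect_candidates(
        const std::vector<LiveNode>& live,
        const BackendPredicate& wants) {
    std::vector<const LiveNode*> out;
    for (const LiveNode& n : live) {
        if (wants(n)) {
            out.push_back(&n);
        }
    }
    return out;
}
```

With the same live set, two backends with different predicates would end up with different candidate lists, which is what a single shared list maintained on the side would have to anticipate.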

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21599#issuecomment-2440496414