RFR: 8342662: C2: Add new phase for backend-specific lowering [v2]

Tue Oct 29 23:24:11 UTC 2024

On Thu, 24 Oct 2024 01:26:10 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>> Hi all,
>> This patch adds a new pass to consolidate lowering of complex backend-specific code patterns, such as `MacroLogicV` and the optimization proposed by #21244. Moving these optimizations to backend code can simplify shared code, while also making it easier to develop more in-depth optimizations. The linked bug has an example of a new optimization this could enable. The new phase does GVN to de-duplicate nodes and calls nodes' `Value()` method, but it does not call `Identity()` or `Ideal()` to avoid undoing any changes done during lowering. It also reuses the IGVN worklist to avoid needing to re-create the notification mechanism.
>> 
>> In this PR only the skeleton code for the pass is added, moving `MacroLogicV` to this system will be done separately in a future patch. Tier 1 tests pass on my linux x64 machine. Feedback on this patch would be greatly appreciated!
>
> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Address some changes from code review

I'd like to reiterate that the primary place in C2 designated for platform-specific IR transformations is during matching phase. That's where the vast majority of cases is covered. Everything else are special cases which are handled in different places (mostly due to historic reasons and differing requirements). 

> It would not be possible without a stretch, consider my example regarding ExtractINode above, ...

Matcher exposes a number of constants and queries which are used in shared code to sense platform capabilities (akin to `VM_Version`).

> What do you think about keeping the node declaration in shared code but putting the lowering transformations in the backend-specific source files? 

I definitely prefer shared over platform-specific Ideal node declarations.

> We can then use prefixes to denote a node being available on a specific backend only.

Ideal-to-Mach IR lowering is already partial as `Matcher::match_rule_support` usages illustrate.

> That's why it is intended to be executed only after general igvn.

Keep in mind that final graph reshaping also resides in shared code. 

> Macro expansion would be too early, as we still do platform-independent igvn there, while final graph reshaping and custom matching logic would be too late, as we have destroyed the node hash table already.

The nice thing about macro expansion is that it is performed consistently. Every macro node is unconditionally expanded and subsequent passes don't have to care about them. (The same applies to matching: Ideal nodes are consistently turned into platform-specific Mach nodes, except a very limited number of well-known cases.) 

I'd say that if we want the lowering pass being discussed to be truly scalable, it's better to follow the same pattern. I have some doubts that platform-specific ad-hoc IR tweaks scale will scale well.  

BTW it's not clear to me now what particular benefits IGVN brings. `DivMod` transformation doesn't use IGVN and after examining `MacroLogicV` code it can be rewritten to avoid IGVN as well. I believe that `ExtractI` case you mentioned can be implemented without relying on IGVN. In that case, lowering transformations can be moved to a later phase.

> I don't think this is a concern, enumerating all live nodes once without doing anything is not expensive.

I do have some concerns about unconditionally performing something which is useless most of the time (or even completely redundant on some platforms).

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21599#issuecomment-2445501092