RFR: 8341697: C2: Register allocation inefficiency in tight loop

Fri Oct 11 16:05:11 UTC 2024

On Fri, 11 Oct 2024 15:50:20 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> Hi,
> 
> This patch improves spill placement in the presence of loops. Currently, when we spill a live range, we create a `Phi` at the loop head. This `Phi` is then spilt inside the loop body, and because the `Phi` is `UP` (lives in a register) at the loop head, we need to emit an additional reload in the loop back-edge block. This introduces loop-carried dependencies, which greatly reduces loop throughput. My proposal is that if a node is not reassigned inside a loop and will be spilt there, we spill it eagerly at the loop entry instead. This can lead to more reloads inside the loop, but since the loop-carried dependencies are eliminated, the cost of an extra load is negligible.
> 
> Please take a look and leave your reviews, thanks a lot.
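
For readers less familiar with the terminology, here is a rough sketch of the loop shape the description refers to. It is not the benchmark from the bug report, and the class and method names (`SpillSketch`, `kernel`) are made up for illustration. `scale` and `bias` are assigned before the loop and only read inside it, so they are the "not reassigned inside a loop" kind of value, while `acc` is reassigned on every iteration. Whether C2 actually spills anything here depends on the platform and on register pressure, so this only shows the distinction and does not reproduce the issue.

```java
public class SpillSketch {
    // Hypothetical kernel, for illustration only.
    // `scale` and `bias` are loop-invariant: set before the loop, never
    // reassigned inside it. `acc` is loop-carried: updated every iteration.
    static long kernel(int[] data, long scale, long bias) {
        long acc = 0;
        for (int i = 0; i < data.length; i++) {
            // Only reads of scale and bias happen in the loop body.
            acc += data[i] * scale + bias;
        }
        return acc;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(kernel(data, 3L, 1L));
    }
}
```

Under the proposal as described, a value like `scale`, if it has to be spilt inside the loop, would be stored to the stack once at loop entry and reloaded where it is used, instead of going through a `Phi` at the loop head with a reload on the back edge.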

Thanks to @shipilev for the benchmark. Could you verify that this solves the issue in the original benchmark? I imagine this one is a simplified version.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21472#issuecomment-2407710352