RFR: 8334431: C2 SuperWord: fix performance regression due to store-to-load-forwarding failures [v2]
Emanuel Peter
epeter at openjdk.org
Tue Nov 19 15:42:22 UTC 2024
On Tue, 19 Nov 2024 15:29:30 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:
>>> Should we also mention here that it works when the loaded data is fully contained in the stored data?
>>
>> Fully contained, as in `strict subset`? I mentioned that already. Sadly, it works on some platforms but not on others, which makes it quite complex. That is why I make the "conservative assumption".
>
> Ah, I thought that as long as the starting addresses match, then all platforms will do the optimization when we store more bytes than we load. But that's not the case then? But of course for the analysis we do in Superword, we only assume that exact matches will work.
Yes, exactly. In general the CPU can be smarter, but we assume that only exact matches succeed; every other case counts as a failure if the store and the load overlap in any way.
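
To make the conservative assumption concrete, here is a minimal sketch of the classification. It is not the code from the PR: the class, enum, and method names are hypothetical, and a memory access is modeled simply as a byte offset plus a size in bytes.

```java
// Illustrative sketch only; all names are hypothetical and not taken from HotSpot.
class StoreToLoadForwardingSketch {
    enum Outcome { SUCCESS, FAILURE, NO_OVERLAP }

    // Classify a store of storeSize bytes at storeOffset that is followed by
    // a load of loadSize bytes at loadOffset.
    static Outcome classify(long storeOffset, int storeSize, long loadOffset, int loadSize) {
        // Two half-open byte ranges overlap if each one starts before the other ends.
        boolean overlap = storeOffset < loadOffset + loadSize
                       && loadOffset < storeOffset + storeSize;
        if (!overlap) {
            return Outcome.NO_OVERLAP; // the load does not depend on this store
        }
        // Conservative assumption: only an exact match (same start, same size)
        // is counted as a forwarding success. A load that is merely contained
        // in the store may forward on some CPUs, but is still counted as a failure.
        boolean exact = storeOffset == loadOffset && storeSize == loadSize;
        return exact ? Outcome.SUCCESS : Outcome.FAILURE;
    }

    public static void main(String[] args) {
        System.out.println(classify(0, 16, 0, 16));  // exact match
        System.out.println(classify(0, 16, 0, 8));   // same start, load contained in store
        System.out.println(classify(0, 16, 4, 16));  // partial overlap
        System.out.println(classify(0, 16, 16, 16)); // adjacent, no overlap
    }
}
```

The second case is the one discussed above: the starting addresses match and the store is wider than the load, yet the sketch still reports a failure, because that case is not assumed to forward on all platforms.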
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21521#discussion_r1848591954