RFR: 8318650: Optimized subword gather for x86 targets. [v14]

Jatin Bhateja jbhateja at openjdk.org
Mon Feb 26 15:02:00 UTC 2024


On Mon, 26 Feb 2024 13:31:05 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> At the risk of becoming too nit-picky: which allocations are you talking about? You only have a single src and a single dst for this label/jump, so you won't use `_patch_overflow`, and therefore all allocations are on the stack. The way you do it now, it seems you would allocate 4x the stack memory here, compared to doing it locally in the loop, where the stack space could potentially be reused between iterations.
>> It seems to me this is an optimization at the cost of code style. Having them local makes it clearer that you are only jumping inside an iteration, and not between iterations.
>
> I could not find any other case with the same pattern, of initializing a list of Labels.
> 
> On the other hand, I can find cases where we already do what I am saying:
> `C2_MacroAssembler::rtm_counters_update`

Hi @eme64, I was referring to the allocation of the array of labels. To keep the code concise and avoid hand-unrolling the loop, I chose an array of labels.
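
For readers following the thread, the two styles being compared can be sketched roughly as below. This is only an illustration: `Label` here is a hypothetical stand-in struct, not HotSpot's assembler `Label` class, and the trip count of 4, the 16-byte offsets, and the function names are assumptions made for the example (the 4 is taken from the "4x the stack memory" remark above).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an assembler label; NOT HotSpot's Label class.
struct Label {
    int position = -1;                      // code offset once bound, -1 while unbound
    bool is_bound() const { return position >= 0; }
    void bind(int pos) { position = pos; }
};

constexpr int kIterations = 4;              // assumed trip count for this sketch

// Style 1: an array of labels declared once, outside the loop.
// All kIterations labels stay live for the whole function.
int emit_with_label_array() {
    Label skip[kIterations];
    int bound = 0;
    for (int i = 0; i < kIterations; i++) {
        // ... a forward jump to skip[i] would be emitted here ...
        skip[i].bind(i * 16);               // pretend each iteration emits 16 bytes
        if (skip[i].is_bound()) bound++;
    }
    return bound;
}

// Style 2: one label local to the loop body.
// Its scope shows that the jump never crosses iterations, and the
// same stack slot can be reused on every pass.
int emit_with_local_label() {
    int bound = 0;
    for (int i = 0; i < kIterations; i++) {
        Label skip;
        // ... a forward jump to skip would be emitted here ...
        skip.bind(i * 16);
        if (skip.is_bound()) bound++;
    }
    return bound;
}

// Stack footprint of the label storage in each style.
constexpr std::size_t kArrayFootprint = sizeof(Label[kIterations]);
constexpr std::size_t kLocalFootprint = sizeof(Label);
```

Both functions bind the same number of labels; the difference is only in how many label objects are live at once (`kArrayFootprint` versus `kLocalFootprint`) and in how visibly the code scopes each jump to a single iteration.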

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1502752772

