RFR: 8325467: Support methods with many arguments in C2 [v17]
Emanuel Peter
epeter at openjdk.org
Tue Apr 22 17:37:52 UTC 2025
On Thu, 10 Apr 2025 09:44:14 GMT, Daniel Lundén <dlunden at openjdk.org> wrote:
>> If a method has a large number of parameters, we currently bail out from C2 compilation.
>>
>> ### Changeset
>>
>> Allowing C2 compilation of methods with a large number of parameters requires fundamental changes to the register mask data structure, used in many places in C2. In particular, register masks currently have a statically determined size and cannot represent arbitrary numbers of stack slots. This is needed if we want to compile methods with arbitrary numbers of parameters. Register mask operations are present in performance-sensitive parts of C2, which further complicates changes.
>>
>> Changes:
>> - Add functionality to dynamically grow/extend register masks. I experimented with a number of design choices to achieve this. To keep the common case (normal number of method parameters) quick and also to avoid more intrusive changes to the current `RegMask` interface, I decided to leave the "base" statically allocated memory for masks unchanged and only use dynamically allocated memory in the rare cases where it is needed.
>> - Generalize the "chunk"-logic from `PhaseChaitin::Select()` to allow arbitrary-sized chunks, and also move most of the logic into register mask methods to separate concerns and to make the `PhaseChaitin::Select()` code more readable.
>> - Remove all `can_represent` checks and bailouts.
>> - Performance tuning. A particularly important change is the early-exit optimization in `RegMask::overlap`, used in the performance-sensitive method `PhaseChaitin::interfere_with_live`.
>> - Add a new test case `TestManyMethodArguments.java` and extend an old test `TestNestedSynchronize.java`.
>>
>> ### Testing
>>
>> - [GitHub Actions](https://github.com/dlunde/jdk/actions/runs/10178060450)
>> - `tier1` to `tier4` (and additional Oracle-internal testing) on Windows x64, Linux x64, Linux aarch64, macOS x64, and macOS aarch64.
>> - Standard performance benchmarking. No observed conclusive overall performance degradation/improvement.
>> - Specific benchmarking of C2 compilation time. The changes increase C2 compilation time by, approximately and on average, 1% for methods that could also be compiled before this changeset (see the figure below). The reason for the degradation is further checks required in performance-sensitive code (in particular `PhaseChaitin::remove_bound_register_from_interfering_live_ranges`). I have tried optimizing in various ways, but changes I found that lead to improvement also lead to less readable code (and are, in my opinion, no...
>
> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
>
> Refactor and improve TestNestedSynchronize.java
Wow, I'm very unfamiliar with this code. Thanks for working on this, it must have taken a while to dig into this, thanks for the work :)
I have a few little comments below, need more time to dig deeper down.
src/hotspot/share/opto/chaitin.cpp line 1648:
> 1646: // If we fail to color and the AllStack flag is set, trigger
> 1647: // a chunk-rollover event
> 1648: if (!OptoReg::is_valid(reg) && is_allstack) {
Control question: here we seem to have passed `-chunk`. And we used to check `chunk != 0` in lots of places. What was the significance to negative chunk? Did you think about that?
src/hotspot/share/opto/compile.hpp line 524:
> 522: PhaseRegAlloc* _regalloc; // Results of register allocation.
> 523: RegMask _FIRST_STACK_mask; // All stack slots usable for spills (depends on frame layout)
> 524: ResourceArea _regmask_arena; // Holds dynamically allocated extensions of short-lived register masks
If they are short-lived ... then why not just resource allocate them? Is there a conflict? Would it be good to describe that somewhere?
src/hotspot/share/opto/regmask.hpp line 147:
> 145: // If the original version, of which we may be a clone, is read-only. In such
> 146: // cases, we can allow read-only sharing.
> 147: bool orig_const = false;
Could we have a more descriptive name? I was wondering at a use site what this is about... and it was not directly clear. Oh, and should it not be `_orig_const`, with the extra underscore at the very least?
Maybe `_read_only_sharing`? Not sure about it, just an idea. But the underscore I'm more sure about.
src/hotspot/share/opto/regmask.hpp line 420:
> 418: : RegMask(arena) {
> 419: Insert(reg);
> 420: DEBUG_ONLY(this->orig_const = orig_const;)
Could this not be part of the initializer list? Or does the `Insert` prevent that?
-------------
PR Review: https://git.openjdk.org/jdk/pull/20404#pullrequestreview-2784774928
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2054519051
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2054532956
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2054552632
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2054553843
More information about the hotspot-compiler-dev
mailing list