RFR: 8325467: Support methods with many arguments in C2 [v17]

Daniel Lundén dlunden at openjdk.org
Fri Apr 25 18:14:53 UTC 2025


On Tue, 22 Apr 2025 17:06:04 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Refactor and improve TestNestedSynchronize.java
>
> src/hotspot/share/opto/chaitin.cpp line 1614:
> 
>> 1612:           uint size = lrg->mask().Size();
>> 1613:           ResourceMark r(C->regmask_arena());
>> 1614:           RegMask rm(lrg->mask(), C->regmask_arena());
> 
> Absolute nit: above you already have a `rm`. There it refers to a `ResourceMark`, here to a `RegMask`. Feels a little confusing. A more expressive name could help ;)

Yes, thanks. It is unfortunate that both `RegMask` and `ResourceMask` variables are usually named `rm`. I updated it here and also in other places now.

> src/hotspot/share/opto/compile.hpp line 524:
> 
>> 522:   PhaseRegAlloc*        _regalloc;              // Results of register allocation.
>> 523:   RegMask               _FIRST_STACK_mask;      // All stack slots usable for spills (depends on frame layout)
>> 524:   ResourceArea          _regmask_arena;         // Holds dynamically allocated extensions of short-lived register masks
> 
> If they are short-lived ... then why not just resource allocate them? Is there a conflict? Would it be good to describe that somewhere?

I assume you mean resource allocate in the default `Thread::current()->resource_area()`? The existing resource marks are often far away, leading to unnecessary memory consumption. Trying to narrow the existing resource marks does lead to conflicts. I added a motivation to the description in `compile.hpp`!

> src/hotspot/share/opto/matcher.cpp line 1340:
> 
>> 1338:   // Empty them all.
>> 1339:   for (uint i = 0; i < cnt; i++) {
>> 1340:     ::new (&(msfpt->_in_rms[i])) RegMask(C->comp_arena());
> 
> Here you do array indexing with `&`. Above you did `base + i`. Just an observation, I leave it to you if you want to do something about that.

Thanks, updated for consistency.

> src/hotspot/share/opto/regmask.hpp line 143:
> 
>> 141:   // constructors. If we get copied/cloned, &_RM_UP_EXT will no longer equal
>> 142:   // orig_ext_adr.
>> 143:   uintptr_t** orig_ext_adr = &_RM_UP_EXT;
> 
> There should also be an underscore for this field, for consistency, right?

Yes, fixed 👍

> src/hotspot/share/opto/regmask.hpp line 147:
> 
>> 145:   // If the original version, of which we may be a clone, is read-only. In such
>> 146:   // cases, we can allow read-only sharing.
>> 147:   bool orig_const = false;
> 
> Could we have a more descriptive name? I was wondering at a use site what this is about... and it was not directly clear. Oh, and should it not be `_orig_const`, with the extra underscore at the very least?
> 
> Maybe `_read_only_sharing`? Not sure about it, just an idea. But the underscore I'm more sure about.

Sure, I now changed it to simply `_read_only`.

> src/hotspot/share/opto/regmask.hpp line 182:
> 
>> 180:   //   r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 s0 s1   s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 ...
>> 181:   //  [0  0  0  0 |0  1  1  0 |0  0  1  0 ] [1  1  0  1 |0  0  0  0] as  as  as
>> 182:   // [0]         [1]         [2]           [0]         [1]
> 
> I kinda missed these two lines at first... it is not directly clear what they are for.
> The first one is the mask, aaah and it is extended with the `as` values.
> No idea yet what the second line is about...
> 
> Maybe some annotation would help? For me it would ok if that means the lines become a little longer.

Sure, now added "Content" and "Index" annotations.

> src/hotspot/share/opto/regmask.hpp line 219:
> 
>> 217: 
>> 218:   // Access word i in the register mask.
>> 219:   const uintptr_t& _rm_up(unsigned int i) const {
> 
> You have another set of underscore methods here... did you mean to make them private? I quickly checked, and did not find any underscore methods in the file before your change.
> 
> This looks a little like python style, where you cannot make a method private, and so the convention is to make them underscore :)

Thanks, fixed! I was not aware of the convention. The methods are already private.

> src/hotspot/share/opto/regmask.hpp line 420:
> 
>> 418:       : RegMask(arena) {
>> 419:     Insert(reg);
>> 420:     DEBUG_ONLY(this->orig_const = orig_const;)
> 
> Could this not be part of the initializer list? Or does the `Insert` prevent that?

I moved things around a bit to make sure it can be put in the initializer list. I couldn't before because the initializer list contains a delegating constructor.

> src/hotspot/share/opto/regmask.hpp line 429:
> 
>> 427:   }
>> 428: 
>> 429:   RegMask(const RegMask& rm) : RegMask(rm, nullptr) {}
> 
> This is the copy constructor, right? Can you add a comment what kind of implementation you chose here, and why?

Yes, it is the copy constructor. Can you elaborate a bit on what type of comment you expect? There are already some comments in the `copy` method.

> test/hotspot/jtreg/compiler/arguments/TestMaxMethodArguments.java line 34:
> 
>> 32:  *                   -XX:+UnlockDiagnosticVMOptions
>> 33:  *                   -XX:+AbortVMOnCompilationFailure
>> 34:  *                   compiler.arguments.TestMaxMethodArguments
> 
> Could there be a benefit to having a run without some of the flags here, and therefore also without the requires?

Yes, good suggestion. Added!

> test/hotspot/jtreg/compiler/locks/TestNestedSynchronize.java line 37:
> 
>> 35:  *                   -XX:+UnlockDiagnosticVMOptions
>> 36:  *                   -XX:+AbortVMOnCompilationFailure
>> 37:  *                   compiler.locks.TestNestedSynchronize
> 
> How about a run with fewer flags?

Yes, thanks 👍

> test/jdk/java/lang/invoke/TestCatchExceptionWithVarargs.java line 32:
> 
>> 30:  *          timeouts due to compilation of a large number of methods with a
>> 31:  *          large number of parameters.
>> 32:  * @run main/othervm -XX:MaxNodeLimit=15000 TestCatchExceptionWithVarargs
> 
> Why not have two runs here. One that requires that there is no `Xcomp`, where we have the normal node limit. And one where you lower the node limit, so that `Xcomp` is ok?

Same here, I just added the `MaxNodeLimit`. I'd prefer leaving any other changes to a separate RFE (if needed).

> test/jdk/java/lang/invoke/VarargsArrayTest.java line 46:
> 
>> 44:  *                                 -DVarargsArrayTest.MAX_ARITY=255
>> 45:  *                                 -DVarargsArrayTest.START_ARITY=250
>> 46:  *                                 VarargsArrayTest
> 
> Same here.
> 
> But I mean are those timeouts ok with Xcomp? Are we sure that these timeouts are only a test issue and not a product issue?

There is a thread in the PR about this already (difficult to find!), so I'm pasting it below for convenience.

> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2023195229)
Just checking: these methods that cause C2 to consume an excessive amount of memory were not C2-compilable before the changeset, right?
> 
> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2023204041)
Same question for the other java/lang/invoke test changes.
> 
> @[dlunde](https://github.com/dlunde) dlunde [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2028661113)
Yes, correct. No longer bailing out on too many arguments results in a lot more compilations (with -Xcomp) compared to before in these specific tests, which is why I've had to limit the tests with MaxNodeLimits.
> 
> That said, I did look into these tests a bit more now after your comment, and there are some peculiar (but artificial) compilations that we no longer bail out on and that we may want to investigate in a future RFE. These compilations each take around 40 seconds (in a release build), are very close to the MaxNodeLimit (80 000 nodes), and spend 99% of the time in the register allocator (in the first round of conservative coalescing, specifically). I analyzed these register allocator runs and it looks like we run into the quadratic time complexity of graph-coloring register allocation, because we have a very large number of nodes to begin with and then the interference graph is additionally very dense (contains a very large number of interferences/edges). We already have bailouts related to node count in the register allocator, but no bailouts for the interference graph size. Perhaps we should consider adding this as part of a separate RFE.
> 
> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2028671476)
> > Perhaps we should consider adding this as part of a separate RFE.
> 
> This sounds like a good idea, I agree to postpone it to a separate RFE.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060669571
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060670292
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060673178
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060675000
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060671473
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060675790
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060677513
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060672618
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060678381
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667754
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667653
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667453
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667370


More information about the hotspot-compiler-dev mailing list