RFR: 8325467: Support methods with many arguments in C2 [v17]
Daniel Lundén
dlunden at openjdk.org
Fri Apr 25 18:14:53 UTC 2025
On Tue, 22 Apr 2025 17:06:04 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Refactor and improve TestNestedSynchronize.java
>
> src/hotspot/share/opto/chaitin.cpp line 1614:
>
>> 1612: uint size = lrg->mask().Size();
>> 1613: ResourceMark r(C->regmask_arena());
>> 1614: RegMask rm(lrg->mask(), C->regmask_arena());
>
> Absolute nit: above you already have a `rm`. There it refers to a `ResourceMark`, here to a `RegMask`. Feels a little confusing. A more expressive name could help ;)
Yes, thanks. It is unfortunate that both `RegMask` and `ResourceMask` variables are usually named `rm`. I updated it here and also in other places now.
> src/hotspot/share/opto/compile.hpp line 524:
>
>> 522: PhaseRegAlloc* _regalloc; // Results of register allocation.
>> 523: RegMask _FIRST_STACK_mask; // All stack slots usable for spills (depends on frame layout)
>> 524: ResourceArea _regmask_arena; // Holds dynamically allocated extensions of short-lived register masks
>
> If they are short-lived ... then why not just resource allocate them? Is there a conflict? Would it be good to describe that somewhere?
I assume you mean resource allocate in the default `Thread::current()->resource_area()`? The existing resource marks are often far away, leading to unnecessary memory consumption. Trying to narrow the existing resource marks does lead to conflicts. I added a motivation to the description in `compile.hpp`!
> src/hotspot/share/opto/matcher.cpp line 1340:
>
>> 1338: // Empty them all.
>> 1339: for (uint i = 0; i < cnt; i++) {
>> 1340: ::new (&(msfpt->_in_rms[i])) RegMask(C->comp_arena());
>
> Here you do array indexing with `&`. Above you did `base + i`. Just an observation, I leave it to you if you want to do something about that.
Thanks, updated for consistency.
> src/hotspot/share/opto/regmask.hpp line 143:
>
>> 141: // constructors. If we get copied/cloned, &_RM_UP_EXT will no longer equal
>> 142: // orig_ext_adr.
>> 143: uintptr_t** orig_ext_adr = &_RM_UP_EXT;
>
> There should also be an underscore for this field, for consistency, right?
Yes, fixed 👍
> src/hotspot/share/opto/regmask.hpp line 147:
>
>> 145: // If the original version, of which we may be a clone, is read-only. In such
>> 146: // cases, we can allow read-only sharing.
>> 147: bool orig_const = false;
>
> Could we have a more descriptive name? I was wondering at a use site what this is about... and it was not directly clear. Oh, and should it not be `_orig_const`, with the extra underscore at the very least?
>
> Maybe `_read_only_sharing`? Not sure about it, just an idea. But the underscore I'm more sure about.
Sure, I now changed it to simply `_read_only`.
> src/hotspot/share/opto/regmask.hpp line 182:
>
>> 180: // r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 s0 s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 ...
>> 181: // [0 0 0 0 |0 1 1 0 |0 0 1 0 ] [1 1 0 1 |0 0 0 0] as as as
>> 182: // [0] [1] [2] [0] [1]
>
> I kinda missed these two lines at first... it is not directly clear what they are for.
> The first one is the mask, aaah and it is extended with the `as` values.
> No idea yet what the second line is about...
>
> Maybe some annotation would help? For me it would ok if that means the lines become a little longer.
Sure, now added "Content" and "Index" annotations.
> src/hotspot/share/opto/regmask.hpp line 219:
>
>> 217:
>> 218: // Access word i in the register mask.
>> 219: const uintptr_t& _rm_up(unsigned int i) const {
>
> You have another set of underscore methods here... did you mean to make them private? I quickly checked, and did not find any underscore methods in the file before your change.
>
> This looks a little like python style, where you cannot make a method private, and so the convention is to make them underscore :)
Thanks, fixed! I was not aware of the convention. The methods are already private.
> src/hotspot/share/opto/regmask.hpp line 420:
>
>> 418: : RegMask(arena) {
>> 419: Insert(reg);
>> 420: DEBUG_ONLY(this->orig_const = orig_const;)
>
> Could this not be part of the initializer list? Or does the `Insert` prevent that?
I moved things around a bit to make sure it can be put in the initializer list. I couldn't before because the initializer list contains a delegating constructor.
> src/hotspot/share/opto/regmask.hpp line 429:
>
>> 427: }
>> 428:
>> 429: RegMask(const RegMask& rm) : RegMask(rm, nullptr) {}
>
> This is the copy constructor, right? Can you add a comment what kind of implementation you chose here, and why?
Yes, it is the copy constructor. Can you elaborate a bit on what type of comment you expect? There are already some comments in the `copy` method.
> test/hotspot/jtreg/compiler/arguments/TestMaxMethodArguments.java line 34:
>
>> 32: * -XX:+UnlockDiagnosticVMOptions
>> 33: * -XX:+AbortVMOnCompilationFailure
>> 34: * compiler.arguments.TestMaxMethodArguments
>
> Could there be a benefit to having a run without some of the flags here, and therefore also without the requires?
Yes, good suggestion. Added!
> test/hotspot/jtreg/compiler/locks/TestNestedSynchronize.java line 37:
>
>> 35: * -XX:+UnlockDiagnosticVMOptions
>> 36: * -XX:+AbortVMOnCompilationFailure
>> 37: * compiler.locks.TestNestedSynchronize
>
> How about a run with fewer flags?
Yes, thanks 👍
> test/jdk/java/lang/invoke/TestCatchExceptionWithVarargs.java line 32:
>
>> 30: * timeouts due to compilation of a large number of methods with a
>> 31: * large number of parameters.
>> 32: * @run main/othervm -XX:MaxNodeLimit=15000 TestCatchExceptionWithVarargs
>
> Why not have two runs here. One that requires that there is no `Xcomp`, where we have the normal node limit. And one where you lower the node limit, so that `Xcomp` is ok?
Same here, I just added the `MaxNodeLimit`. I'd prefer leaving any other changes to a separate RFE (if needed).
> test/jdk/java/lang/invoke/VarargsArrayTest.java line 46:
>
>> 44: * -DVarargsArrayTest.MAX_ARITY=255
>> 45: * -DVarargsArrayTest.START_ARITY=250
>> 46: * VarargsArrayTest
>
> Same here.
>
> But I mean are those timeouts ok with Xcomp? Are we sure that these timeouts are only a test issue and not a product issue?
There is a thread in the PR about this already (difficult to find!), so I'm pasting it below for convenience.
> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2023195229)
Just checking: these methods that cause C2 to consume an excessive amount of memory were not C2-compilable before the changeset, right?
>
> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2023204041)
Same question for the other java/lang/invoke test changes.
>
> @[dlunde](https://github.com/dlunde) dlunde [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2028661113)
Yes, correct. No longer bailing out on too many arguments results in a lot more compilations (with -Xcomp) compared to before in these specific tests, which is why I've had to limit the tests with MaxNodeLimits.
>
> That said, I did look into these tests a bit more now after your comment, and there are some peculiar (but artificial) compilations that we no longer bail out on and that we may want to investigate in a future RFE. These compilations each take around 40 seconds (in a release build), are very close to the MaxNodeLimit (80 000 nodes), and spend 99% of the time in the register allocator (in the first round of conservative coalescing, specifically). I analyzed these register allocator runs and it looks like we run into the quadratic time complexity of graph-coloring register allocation, because we have a very large number of nodes to begin with and then the interference graph is additionally very dense (contains a very large number of interferences/edges). We already have bailouts related to node count in the register allocator, but no bailouts for the interference graph size. Perhaps we should consider adding this as part of a separate RFE.
>
> @[robcasloz](https://github.com/robcasloz) robcasloz [3 weeks ago](https://github.com/openjdk/jdk/pull/20404#discussion_r2028671476)
> > Perhaps we should consider adding this as part of a separate RFE.
>
> This sounds like a good idea, I agree to postpone it to a separate RFE.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060669571
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060670292
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060673178
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060675000
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060671473
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060675790
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060677513
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060672618
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060678381
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667754
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667653
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667453
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2060667370
More information about the hotspot-compiler-dev
mailing list