RFR: 8325467: Support methods with many arguments in C2 [v17]

Emanuel Peter epeter at openjdk.org
Wed Apr 23 08:49:13 UTC 2025


On Thu, 10 Apr 2025 09:44:14 GMT, Daniel Lundén <dlunden at openjdk.org> wrote:

>> If a method has a large number of parameters, we currently bail out from C2 compilation.
>> 
>> ### Changeset
>> 
>> Allowing C2 compilation of methods with a large number of parameters requires fundamental changes to the register mask data structure, used in many places in C2. In particular, register masks currently have a statically determined size and cannot represent arbitrary numbers of stack slots. This is needed if we want to compile methods with arbitrary numbers of parameters. Register mask operations are present in performance-sensitive parts of C2, which further complicates changes.
>> 
>> Changes:
>> - Add functionality to dynamically grow/extend register masks. I experimented with a number of design choices to achieve this. To keep the common case (normal number of method parameters) quick and also to avoid more intrusive changes to the current `RegMask` interface, I decided to leave the "base" statically allocated memory for masks unchanged and only use dynamically allocated memory in the rare cases where it is needed.
>> - Generalize the "chunk"-logic from `PhaseChaitin::Select()` to allow arbitrary-sized chunks, and also move most of the logic into register mask methods to separate concerns and to make the `PhaseChaitin::Select()` code more readable.
>> - Remove all `can_represent` checks and bailouts.
>> - Performance tuning. A particularly important change is the early-exit optimization in `RegMask::overlap`, used in the performance-sensitive method `PhaseChaitin::interfere_with_live`.
>> - Add a new test case `TestManyMethodArguments.java` and extend an old test `TestNestedSynchronize.java`.
>> 
>> ### Testing
>> 
>> - [GitHub Actions](https://github.com/dlunde/jdk/actions/runs/10178060450)
>> - `tier1` to `tier4` (and additional Oracle-internal testing) on Windows x64, Linux x64, Linux aarch64, macOS x64, and macOS aarch64.
>> - Standard performance benchmarking. No observed conclusive overall performance degradation/improvement.
>> - Specific benchmarking of C2 compilation time. The changes increase C2 compilation time by, approximately and on average, 1% for methods that could also be compiled before this changeset (see the figure below). The reason for the degradation is further checks required in performance-sensitive code (in particular `PhaseChaitin::remove_bound_register_from_interfering_live_ranges`). I have tried optimizing in various ways, but changes I found that lead to improvement also lead to less readable code (and are, in my opinion, no...
>
> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Refactor and improve TestNestedSynchronize.java

And a few comments for the tests :)

test/hotspot/jtreg/compiler/arguments/TestMaxMethodArguments.java line 34:

> 32:  *                   -XX:+UnlockDiagnosticVMOptions
> 33:  *                   -XX:+AbortVMOnCompilationFailure
> 34:  *                   compiler.arguments.TestMaxMethodArguments

Could there be a benefit to having a run without some of the flags here, and therefore also without the requires?

test/hotspot/jtreg/compiler/arguments/TestMaxMethodArguments.java line 43:

> 41:         try {
> 42:             test(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217
 , 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255);
> 43:         } catch (Exception e) {

That makes me a little nervous. You catch all exceptions. What if we throw some unexpected null pointer exception? Is that fine too?

Suggestion: define your own exception, and only catch that one.

test/hotspot/jtreg/compiler/locks/TestNestedSynchronize.java line 37:

> 35:  *                   -XX:+UnlockDiagnosticVMOptions
> 36:  *                   -XX:+AbortVMOnCompilationFailure
> 37:  *                   compiler.locks.TestNestedSynchronize

How about a run with fewer flags?

test/hotspot/jtreg/compiler/locks/TestNestedSynchronize.java line 90:

> 88:     // The above is a massive program. Therefore, we do not directly inline the
> 89:     // program in TestNestedSynchronize and instead compile and run it via the
> 90:     // CompileFramework.

Nice, I like it. Of course I have no bias here ;)

test/hotspot/jtreg/compiler/locks/TestNestedSynchronize.java line 107:

> 105:         acc.addFirst(String.format("public class %s {", test_class_name));
> 106:         acc.addLast("}");
> 107:         return String.join("\n", acc);

Nit: What prevented you from generating it with a `ArrayList`, and just only append at the end?

In fact, that would allow you to use a StringBuilder directly.

And you could format the `synchronized` string only once. That might be even more efficient.

You can also leave it, this is really a nit :)

test/jdk/java/lang/invoke/BigArityTest.java line 38:

> 36:  *                                 -XX:CompileCommand=memlimit,*.*,0
> 37:  *                                 -esa -DBigArityTest.ITERATION_COUNT=1
> 38:  *                                 test.java.lang.invoke.BigArityTest

Would it make sense to also have a run with fewer flags?

test/jdk/java/lang/invoke/TestCatchExceptionWithVarargs.java line 32:

> 30:  *          timeouts due to compilation of a large number of methods with a
> 31:  *          large number of parameters.
> 32:  * @run main/othervm -XX:MaxNodeLimit=15000 TestCatchExceptionWithVarargs

Why not have two runs here. One that requires that there is no `Xcomp`, where we have the normal node limit. And one where you lower the node limit, so that `Xcomp` is ok?

test/jdk/java/lang/invoke/VarargsArrayTest.java line 46:

> 44:  *                                 -DVarargsArrayTest.MAX_ARITY=255
> 45:  *                                 -DVarargsArrayTest.START_ARITY=250
> 46:  *                                 VarargsArrayTest

Same here.

But I mean are those timeouts ok with Xcomp? Are we sure that these timeouts are only a test issue and not a product issue?

-------------

PR Review: https://git.openjdk.org/jdk/pull/20404#pullrequestreview-2786415489
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055532228
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055535065
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055537644
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055540066
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055548814
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055550590
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055554337
PR Review Comment: https://git.openjdk.org/jdk/pull/20404#discussion_r2055557452


More information about the hotspot-compiler-dev mailing list