RFR: 8345766: C2 should emit macro nodes for ModF/ModD instead of calls during parsing

Emanuel Peter epeter at openjdk.org
Wed Jan 8 08:25:47 UTC 2025


On Tue, 17 Dec 2024 09:01:57 GMT, Theo Weidmann <tweidmann at openjdk.org> wrote:

> C2 currently emits runtime calls if the platform rules do not support lowering floating point remainder operations. For example, for float:
> 
> https://github.com/openjdk/jdk/blob/fbbc7c35f422294090b8c7a02a19ab2fb67c7070/src/hotspot/share/opto/parse2.cpp#L2305-L2318
> 
> https://github.com/openjdk/jdk/blob/fbbc7c35f422294090b8c7a02a19ab2fb67c7070/src/hotspot/share/opto/parse2.cpp#L1099-L1109
> 
> The only platform, which currently supports this, however, is x86_32. On all other platforms, runtime calls are generated directly during parsing, which prevent any constant folding or other idealizations. Even C1 can perform these optimizations, resulting in significantly lower C2 performance compared to C1 for simple test cases. This function was observed to be around 15x slower with C2 compared to C1 due to redundant runtime calls:
> 
> 
> public static double process(final double x) {
>         double w = (double) 0.1;
>         double p = 0;
>         p = (double) (3.109615012413746E307 % (w % Z));
>         p = (double) (7.614949555185036E307 / (x % x)); // <- return value only dependends on this line
>         return (double) (x * p);
> }
> 
> 
> To fix this, this PR turns ModFNode and ModDNode into macros, which are always created during parsing. They support idealization (constant folding) and are lowered to runtime calls during macro expansion. For simplicity, these operations will now also call into the runtime on x86_32, as this platform is deprecated.

Thanks for working on this! Generally looks reasonable to me, though I left a few comments.

src/hotspot/share/opto/divnode.cpp line 61:

> 59:   init_req(TypeFunc::Parms + 0, a);
> 60:   init_req(TypeFunc::Parms + 1, b);
> 61: }

Is there a reason to put this in the cpp file? I think I usually see constructors for Nodes in the hpp file. Nitpicky sorry 🙈

src/hotspot/share/opto/divnode.cpp line 1400:

> 1398: 
> 1399: //=============================================================================
> 1400: //------------------------------Idealize---------------------------------------

I would just remove these lines if you are already toughing the code here:
`//------------------------------Idealize---------------------------------------`

src/hotspot/share/opto/divnode.cpp line 1401:

> 1399: //=============================================================================
> 1400: //------------------------------Idealize---------------------------------------
> 1401: Node *UModLNode::Ideal(PhaseGVN *phase, bool can_reshape) {

Suggestion:

Node* UModLNode::Ideal(PhaseGVN* phase, bool can_reshape) {

Ah, maybe you did not mean to touch it, but on GitHub it looks like you did... Maybe you just reordered things.

src/hotspot/share/opto/divnode.cpp line 1409:

> 1407: }
> 1408: 
> 1409: Node* ModFNode::Ideal(PhaseGVN* phase, bool can_reshape) {

Can you quickly say why you congerted this from a `Value` to an `Ideal` method? I guess it is because before it used to be a simple `Node` with a single output, but now it is a `Call` with multiple outputs... Ok makes sense.

src/hotspot/share/opto/divnode.cpp line 1419:

> 1417:   if (t1 == Type::TOP || t2 == Type::TOP) {
> 1418:     return nullptr;
> 1419:   }

The comment seems to contradict the result. You could return `C->top()`, right?

src/hotspot/share/opto/macro.cpp line 2224:

> 2222: 
> 2223:   // Make slow path call
> 2224:   CallNode *call = make_slow_call(lock, OptoRuntime::complete_monitor_enter_Type(),

Suggestion:

  CallNode* call = make_slow_call(lock, OptoRuntime::complete_monitor_enter_Type(),

src/hotspot/share/opto/macro.cpp line 2592:

> 2590:         CallNode* mod_macro = n->as_Call();
> 2591:         CallNode* call = new CallLeafNode(mod_macro->tf(),
> 2592:                                           is_drem ? CAST_FROM_FN_PTR(address, SharedRuntime::drem) : CAST_FROM_FN_PTR(address, SharedRuntime::frem),

Suggestion:

                                          is_drem ? CAST_FROM_FN_PTR(address, SharedRuntime::drem)
                                                  : CAST_FROM_FN_PTR(address, SharedRuntime::frem),

The line is a little long.

src/hotspot/share/opto/parse.hpp line 533:

> 531: 
> 532:   // Helper functions for shifting & arithmetic
> 533:   Node* floating_point_mod(Node* a, Node* b, bool dbl);

Please use a more descriptive name than `dbl` ;)

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22786#pullrequestreview-2536297498
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906663110
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906668157
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906669805
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906679723
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906676399
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906688741
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906693395
PR Review Comment: https://git.openjdk.org/jdk/pull/22786#discussion_r1906700593


More information about the hotspot-compiler-dev mailing list