RFR: 8291336: Add ideal rule to convert floating point multiply by 2 into addition

Sat Jul 30 01:58:17 UTC 2022

On Tue, 26 Jul 2022 15:39:42 GMT, SuperCoder79 <duke at openjdk.org> wrote:

> Hello,
> I would like to propose an ideal transform that converts floating point multiply by 2 (`x * 2`) into an addition operation instead. This would allow for the elimination of the memory reference for the constant two, and keep the whole operation inside registers. My justifications for this optimization include:
> * As per [Agner Fog's instruction tables](https://www.agner.org/optimize/instruction_tables.pdf), many older microarchitectures, such as Sandy Bridge and Ivy Bridge, have different latencies for addition and multiplication, so this change could be beneficial in hot code.
> * The removal of the memory load would have a beneficial effect in cache bound situations.
> * Multiplication by 2 is a relatively common construct, so this change can apply to a wide range of Java code.
> 
> As this is my first time looking into the C2 codebase, I have a few lingering questions about my implementation and how certain parts of the compiler work. Mainly, is this patch getting the type of the operands correctly? I saw some cases where code used `bottom_type()` and other cases where it used `phase->type(value)`. Similarly, can nodes be reused, as is being done in the `AddNode` constructors? I saw some places where the clone method was being used, but other places where it wasn't.
> 
> I have attached an IR test and a jmh benchmark. Tier 1 testing passes on my machine.
> 
> Thanks for your time,
> Jasmine
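
For reference, the rewrite preserves values because doubling a binary floating-point number is exact unless it overflows, and on overflow both `x * 2` and `x + x` round to the same infinity. A minimal Java sketch that spot-checks this on a few edge cases follows; the class and method names are hypothetical and are not part of the patch.

```java
// Sketch only: MulBy2Check and sameResult are hypothetical names, not part of the patch.
final class MulBy2Check {
    // Compares x * 2 with x + x through canonical bit patterns,
    // so NaN compares equal to NaN and 0.0f is distinguished from -0.0f.
    static boolean sameResult(float x) {
        return Float.floatToIntBits(x * 2.0f) == Float.floatToIntBits(x + x);
    }

    // Same check for double operands.
    static boolean sameResult(double x) {
        return Double.doubleToLongBits(x * 2.0) == Double.doubleToLongBits(x + x);
    }

    public static void main(String[] args) {
        float[] samples = {
            0.0f, -0.0f, 1.5f, -3.25f,
            Float.MIN_VALUE,          // smallest subnormal
            Float.MAX_VALUE,          // doubling overflows to infinity
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY, Float.NaN
        };
        for (float x : samples) {
            if (!sameResult(x)) {
                throw new AssertionError("mismatch for " + x);
            }
        }
        System.out.println("x * 2 and x + x agree on all samples");
    }
}
```

This only checks that the two expressions agree in value; whether the addition is actually faster is what the attached benchmark would need to show.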

I have left some review comments on the patch. Could you share the results of the added microbenchmark, please?

Thanks

There you go: https://bugs.openjdk.org/browse/JDK-8291336

Next time, you can ask for help on the appropriate mailing list (hotspot-compiler-dev in this case) or submit a bug through https://bugreport.java.com/bugreport/

Also, please enable GitHub Actions in your fork so that patches are tested automatically at tier 1 on the major platforms.

Hope this helps.

src/hotspot/share/opto/mulnode.cpp line 439:

> 437: // Check to see if we are multiplying by a constant 2 and convert to add, then try the regular MulNode::Ideal
> 438: Node *MulFNode::Ideal(PhaseGVN *phase, bool can_reshape) {
> 439:   const TypeF *t1 = in(1)->bottom_type()->isa_float_constant();

`phase->type(Node*)` refers to the type inferred by the GVN in this phase, while `Node::bottom_type()` refers to the loosest type the node can have. For example, the bottom type of an `AddINode` is always `TypeInt::INT` (every possible `int` value), while the GVN can infer a stricter type if it knows that both inputs are integers between 0 and 10. In this case, you obtain the correct type nonetheless because `ConNode` extends `TypeNode`, a family of nodes whose bottom types are updated by the GVN. In general, during idealisation, it is more efficient to use `phase->type(Node*)`.
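
To make the distinction concrete, here is a toy model in Java rather than actual HotSpot code. `IntRange`, `BOTTOM`, and `inferAdd` are hypothetical names invented for illustration: `BOTTOM` plays the role of the bottom type of an integer addition, and `inferAdd` plays the role of the narrower type that could be inferred from the inputs.

```java
// Toy model only: IntRange, BOTTOM and inferAdd are hypothetical and are not HotSpot classes.
final class IntRange {
    final long lo;
    final long hi;

    // Analogue of the bottom type of an int addition: every int value is possible.
    static final IntRange BOTTOM = new IntRange(Integer.MIN_VALUE, Integer.MAX_VALUE);

    IntRange(long lo, long hi) {
        this.lo = lo;
        this.hi = hi;
    }

    // Analogue of an inferred type: narrow the result using what is known about the inputs.
    // If the sum could leave the int range, fall back to the loosest type.
    static IntRange inferAdd(IntRange a, IntRange b) {
        long lo = a.lo + b.lo;
        long hi = a.hi + b.hi;
        if (lo < Integer.MIN_VALUE || hi > Integer.MAX_VALUE) {
            return BOTTOM;
        }
        return new IntRange(lo, hi);
    }

    @Override
    public String toString() {
        return "[" + lo + ", " + hi + "]";
    }

    public static void main(String[] args) {
        IntRange zeroToTen = new IntRange(0, 10);
        System.out.println("loosest type:  " + BOTTOM);
        System.out.println("inferred type: " + inferAdd(zeroToTen, zeroToTen));
    }
}
```

With both inputs known to lie in [0, 10], the inferred range is [0, 20], while the loosest type still covers every `int`.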

src/hotspot/share/opto/mulnode.cpp line 442:

> 440:   const TypeF *t2 = in(2)->bottom_type()->isa_float_constant();
> 441: 
> 442:   // x * 2 -> x + x

Since constants are always pushed to the right of the expression, you don't need to try both permutations of the pattern.

test/hotspot/jtreg/compiler/c2/irTests/TestMulBy2.java line 37:

> 35:  * @run driver compiler.c2.irTests.TestMulBy2
> 36:  */
> 37: public class TestMulBy2 {

Please use a more general name, such as `MulFNodeIdealizationTests`.

-------------

PR: https://git.openjdk.org/jdk/pull/9642