RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v4]
Vladimir Kozlov
kvn at openjdk.java.net
Wed Jan 5 18:20:12 UTC 2022
On Wed, 5 Jan 2022 08:59:00 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> This patch extends the existing macro logic inference algorithm to handle masked logic operations.
>>
>> Existing algorithm:
>>
>> 1. Identify logic cone roots.
>> 2. Pack the parent and its logic child nodes into a MacroLogic node in a bottom-up traversal if the input constraint is met,
>> i.e. the maximum number of inputs a macro logic node can have.
>> 3. Perform symbolic evaluation of the logic expression tree by assigning to each input the value corresponding to a
>> truth table column.
>> 4. The inputs, together with the encoded function, represent a macro logic node that mimics a truth table.
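
A minimal sketch of steps 3 and 4 for readers unfamiliar with the encoding, not the HotSpot implementation. It assumes three inputs and the column constants 0xF0, 0xCC and 0xAA, following the immediate-operand convention of x86 VPTERNLOG; the class and method names are hypothetical.

```java
// Hypothetical illustration of truth-table based symbolic evaluation.
final class TruthTableSketch {
    // Assumed truth-table columns for inputs a, b, c. Bit i of each constant
    // holds that input's value in row i, where i = (a << 2) | (b << 1) | c.
    static final int A = 0xF0;
    static final int B = 0xCC;
    static final int C = 0xAA;

    // Symbolically evaluate (a & b) ^ c by applying the same operators to
    // the column constants; the low 8 bits are the encoded function.
    static int encodeAndXor() {
        return ((A & B) ^ C) & 0xFF;
    }

    // Look up the result of an encoded 8-bit function for concrete input bits.
    static int apply(int func, int a, int b, int c) {
        int row = (a << 2) | (b << 1) | c;
        return (func >>> row) & 1;
    }

    public static void main(String[] args) {
        int func = encodeAndXor();
        System.out.printf("encoded function: 0x%02X%n", func);
        // Cross-check the encoded function against direct evaluation.
        for (int row = 0; row < 8; row++) {
            int a = (row >> 2) & 1, b = (row >> 1) & 1, c = row & 1;
            System.out.printf("a=%d b=%d c=%d -> %d%n", a, b, c, apply(func, a, b, c));
        }
    }
}
```

Evaluating the expression once over the column constants yields an 8-bit value that, together with the three inputs, stands in for the whole expression tree.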
>>
>> Modification:
>> The packing algorithm is extended to operate on both predicated and non-predicated logic nodes. The following
>> rules define the criteria under which nodes get packed into a macro logic node:
>>
>> 1. Parent and both child nodes are all unmasked or all masked with the same predicate.
>> 2. A masked parent can be packed with its left child if the child is predicated and both have the same predicate.
>> 3. A masked parent can be packed with its right child if the child is un-predicated or has a matching predicate.
>> 4. An unmasked parent can be packed with an unmasked child.
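
One possible reading of the four rules above, expressed as a single predicate. This is a sketch only: masks are modelled as strings with null meaning unmasked, and the names `PackingRulesSketch`, `Side` and `canPack` are hypothetical and do not come from the patch. It also assumes that an unmasked parent cannot be packed with a masked child, which the rules imply but do not state.

```java
// Hypothetical model of the packing criteria; not the HotSpot code.
final class PackingRulesSketch {
    enum Side { LEFT, RIGHT }

    // parentMask / childMask: null means the node is unmasked (un-predicated).
    static boolean canPack(String parentMask, String childMask, Side side) {
        if (parentMask == null) {
            // Rule 4: an unmasked parent packs only with an unmasked child.
            return childMask == null;
        }
        if (side == Side.LEFT) {
            // Rule 2: the left child must carry the same predicate as the parent.
            return parentMask.equals(childMask);
        }
        // Rule 3: the right child may be unmasked or carry the same predicate.
        return childMask == null || parentMask.equals(childMask);
    }

    public static void main(String[] args) {
        System.out.println(canPack("k1", "k1", Side.LEFT));   // same predicate on the left
        System.out.println(canPack("k1", null, Side.LEFT));   // unmasked left child
        System.out.println(canPack("k1", null, Side.RIGHT));  // unmasked right child
        System.out.println(canPack(null, "k1", Side.RIGHT));  // masked child, unmasked parent
    }
}
```

Rule 1 needs no separate branch in this model: the all-unmasked case falls under rule 4, and the all-same-predicate case satisfies rules 2 and 3 together.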
>>
>> The new jtreg test case added with the patch exhaustively covers all combinations of predication of parent and
>> child nodes.
>>
>> The following are the performance numbers for the JMH benchmarks included with the patch.
>>
>> Machine Configuration: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (40C 2S Icelake Server)
>>
>> Benchmark | ARRAYLEN | Baseline (ops/s) | With opt (ops/s) | Gain (with opt / baseline)
>> -- | -- | -- | -- | --
>> o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 64 | 2365.421 | 5136.283 | 2.171403315
>> o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 128 | 2034.1 | 4073.381 | 2.002547072
>> o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 256 | 1568.694 | 2811.975 | 1.792558013
>> o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 512 | 883.261 | 1662.771 | 1.882536419
>> o.o.b.vm.compiler.MacroLogicOpt.workload1_caller | 1024 | 469.513 | 732.81 | 1.560787454
>> o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 64 | 273.049 | 552.106 | 2.022003377
>> o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 128 | 219.624 | 359.775 | 1.63814064
>> o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 256 | 131.649 | 182.23 | 1.384211046
>> o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 512 | 71.452 | 81.522 | 1.140933774
>> o.o.b.vm.compiler.MacroLogicOpt.workload2_caller | 1024 | 37.427 | 41.966 | 1.121276084
>> o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 64 | 2805.759 | 3383.16 | 1.205791374
>> o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 128 | 2069.012 | 2250.37 | 1.087654397
>> o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 256 | 1098.766 | 1101.996 | 1.002939661
>> o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 512 | 470.035 | 484.732 | 1.031267884
>> o.o.b.vm.compiler.MacroLogicOpt.workload3_caller | 1024 | 202.827 | 209.073 | 1.030794717
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 256 | 3435.989 | 4418.09 | 1.285827749
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 512 | 1524.803 | 1678.201 | 1.100601848
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt128 | 1024 | 972.501 | 1166.734 | 1.199725244
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 256 | 5980.85 | 7584.17 | 1.268075608
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 512 | 3258.108 | 3939.23 | 1.209054457
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt256 | 1024 | 1475.365 | 1511.159 | 1.024261115
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 256 | 4208.766 | 4220.678 | 1.002830283
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 512 | 2056.651 | 2049.489 | 0.99651764
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationInt512 | 1024 | 1110.461 | 1116.448 | 1.005391455
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 256 | 3259.348 | 3947.94 | 1.211266793
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 512 | 1515.147 | 1536.647 | 1.014190042
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong256 | 1024 | 911.58 | 1030.54 | 1.130498695
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 256 | 2034.611 | 2073.764 | 1.019243482
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 512 | 1110.659 | 1116.093 | 1.004892591
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.bitwiseBlendOperationLong512 | 1024 | 559.269 | 559.651 | 1.000683034
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 256 | 3636.141 | 4446.505 | 1.222863745
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 512 | 1433.145 | 1681.261 | 1.173126934
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt128 | 1024 | 1000.107 | 1172.866 | 1.172740517
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 256 | 5568.313 | 7670.259 | 1.37748345
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 512 | 3350.108 | 3927.803 | 1.172440709
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt256 | 1024 | 1495.966 | 1541.56 | 1.030477965
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 256 | 4230.379 | 4282.154 | 1.012238856
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 512 | 2029.801 | 2049.638 | 1.009772879
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsInt512 | 1024 | 1108.738 | 1118.897 | 1.00916267
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 256 | 3802.801 | 3783.537 | 0.99493426
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 512 | 1546.244 | 1552.691 | 1.004169458
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.maskedLogicOperationsLong256 | 1024 | 1017.512 | 1020.075 | 1.002518889
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 256 | 4159.835 | 4527.676 | 1.088426825
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 512 | 1665.335 | 1733.04 | 1.040655484
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt128 | 1024 | 1150.319 | 1181.935 | 1.02748455
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 256 | 6989.791 | 7382.883 | 1.056238019
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 512 | 3711.362 | 3911.921 | 1.054039191
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt256 | 1024 | 1540.341 | 1554.175 | 1.008981128
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 256 | 4164.559 | 4213.546 | 1.01176283
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 512 | 2072.91 | 2079.105 | 1.002988552
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsInt512 | 1024 | 1112.678 | 1116.675 | 1.003592234
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 256 | 3702.998 | 3906.093 | 1.0548461
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 512 | 1536.571 | 1546.043 | 1.006164375
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong256 | 1024 | 996.906 | 1013.649 | 1.016794964
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 256 | 2045.594 | 2048.966 | 1.001648421
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 512 | 1111.933 | 1117.689 | 1.005176571
>> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 | 1024 | 559.971 | 561.144 | 1.002094751
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8273322: Review comments resolution.
Looks good.
-------------
Marked as reviewed by kvn (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/6893