RFR: 8341293: Split field loads through Nested Phis [v2]
Emanuel Peter
epeter at openjdk.org
Wed Dec 4 07:59:43 UTC 2024
On Thu, 21 Nov 2024 21:24:37 GMT, Dhamoder Nalla <dhanalla at openjdk.org> wrote:
>> As an extension of the work done as part of https://github.com/openjdk/jdk/pull/12897, split the field loads (AddP -> Load*) with nested phi parent nodes to enable more scalar replacements, thereby reducing memory allocation.
>>
>>
>> Here is the sequence of Ideal graph transformations for nested Phis:
>>
>> [Ideal graph images not preserved in this archive]
>>
>> JMH results:
>> With RAM disabled:
>>
>> Benchmark                                                            Mode  Cnt      Score    Error  Units
>> NestedPhiAndRematerialize.NopRAM.testBailOut_runner                  avgt   15     13.969 ±  0.248  ms/op
>> NestedPhiAndRematerialize.NopRAM.testFieldEscapeWithMerge_runner     avgt   15     80.300 ±  4.306  ms/op
>> NestedPhiAndRematerialize.NopRAM.testMerge_TryCatchFinally_runner    avgt   15     72.182 ±  1.781  ms/op
>> NestedPhiAndRematerialize.NopRAM.testMultiParentPhi_runner           avgt   15      2.983 ±  0.001  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhiPolymorphic_runner     avgt   15     18.342 ±  0.731  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhiProcessOrder_runner    avgt   15     14.315 ±  0.443  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhiWithLambda_runner      avgt   15     18.511 ±  1.212  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhiWithTrap_runner        avgt   15     66.277 ±  1.478  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhi_FieldLoad_runner      avgt   15     17.968 ±  0.306  ms/op
>> NestedPhiAndRematerialize.NopRAM.testNestedPhi_TryCatch_runner       avgt   15     14.186 ±  0.247  ms/op
>> NestedPhiAndRematerialize.NopRAM.testRematerialize_MultiObj_runner   avgt   15     88.435 ±  4.869  ms/op
>> NestedPhiAndRematerialize.NopRAM.testRematerialize_SingleObj_runner  avgt   15  29560.130 ± 48.797  ms/op
>> NestedPhiAndRematerialize.NopRAM.testRematerialize_TryCatch_runner   avgt   15     49.150 ±  2.307  ms/op
>> NestedPhiAndRematerialize.NopRAM.testThreeLevelNestedPhi_runner      avgt   15     18.236 ±  0.308  ms/op
>>
>> With RAM enabled:
>>
>> Benchmark                                                            Mode  Cnt      Score    Error  Units
>> NestedPhiAndRematerialize.YesRAM.testBailOut_runner                  avgt   15      3.257 ±  0.423  ms/op
>> NestedPhiAndRematerialize.YesRAM.testFieldEscapeWithMerge_runner     avgt   15     79.916 ±  3.477  ms/op
>> NestedPhiAndRematerialize.YesRAM.testMerge_TryCatchFinally_runner    avgt   15     72.053 ±  1.916  ms/op
>> NestedPhiAndRematerialize.YesRAM.testMultiParentPhi_runner           avgt   15      2.984 ±  0.001  ms/op
>> NestedPhiAndRematerialize.YesRAM.testNestedPhiPolymorphic_runner     avgt ...
>
> Dhamoder Nalla has updated the pull request incrementally with one additional commit since the last revision:
>
> CR feedback
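
For readers who want a concrete picture of the pattern named in the quoted description, here is a minimal Java sketch of source code in which a field load follows two merge points, one inside the other. The class `NestedPhiSketch`, the `Point` type, the `bias` local and the method name are all hypothetical and not taken from the PR or its tests. Whether C2 really builds nested Phi nodes for this shape depends on parsing, inlining and the VM flags in use, so treat it as an illustration of the idea only.

```java
public class NestedPhiSketch {
    // Hypothetical small class whose instances never escape the method below,
    // which makes them candidates for scalar replacement.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    public static int sumThroughNestedMerge(boolean outer, boolean inner, int a, int b) {
        int bias = 0;
        Point p;
        if (outer) {
            p = new Point(a, b);
        } else {
            if (inner) {
                p = new Point(b, a);
            } else {
                p = new Point(a + b, 0);
            }
            // Inner merge point: two allocations meet here. The store to
            // 'bias' gives this merge its own block in the bytecode.
            bias = 1;
        }
        // Outer merge point: the first allocation meets the inner merge.
        // The field loads below therefore read through both merges.
        return p.x - p.y + bias;
    }

    public static void main(String[] args) {
        System.out.println(sumThroughNestedMerge(true, false, 5, 3));  // 2
        System.out.println(sumThroughNestedMerge(false, true, 5, 3));  // -1
        System.out.println(sumThroughNestedMerge(false, false, 5, 3)); // 9
    }
}
```

If the loads of `p.x` and `p.y` can be split so that each allocation feeds its own field values into the merges, none of the three `Point` objects needs to exist on the heap.
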
test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/AllocationMergesNestedPhiTests.java line 31:

> 29: /*
> 30: * @test
> 31: * @bug 8281429

Is this bug id correct?

test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/AllocationMergesNestedPhiTests.java line 34:

> 32: * @summary Tests that C2 can correctly scalar replace some object allocation merges.
> 33: * @library /test/lib /
> 34: * @requires vm.debug == true & vm.flagless & vm.bits == 64 & vm.compiler2.enabled & vm.opt.final.EliminateAllocations

Do you need all of these? Or is it just that IR rules are failing otherwise?
If it is just about the IR rules, you can restrict IR rules with `applyIf...`
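
To make the suggestion concrete, here is a rough sketch of an IR rule that is guarded by a flag condition instead of a test-wide `@requires` clause. It assumes the annotations from `compiler.lib.ir_framework` in the JDK test library, so it only compiles and runs under jtreg with `@library /test/lib /`. The class name `ApplyIfSketch`, the `Point` type, the `cond` field and the rule itself are hypothetical and not taken from the PR.

```java
import compiler.lib.ir_framework.*;

// Hypothetical test class, for illustration only.
public class ApplyIfSketch {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static boolean cond; // hypothetical input selecting the allocation site

    public static void main(String[] args) {
        TestFramework.run();
    }

    @Test
    // The rule is only checked when EliminateAllocations is enabled.
    // The test method itself still runs in every configuration.
    @IR(failOn = {IRNode.ALLOC}, applyIf = {"EliminateAllocations", "true"})
    public int sumFields() {
        Point p = cond ? new Point(1, 2) : new Point(3, 4);
        return p.x + p.y;
    }
}
```

With this shape, the `vm.opt.final.EliminateAllocations` part of `@requires` could in principle be dropped, since the rule is skipped rather than failed when the flag is off. Which of the other conditions are still needed is a separate question.
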
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21270#discussion_r1868888712
PR Review Comment: https://git.openjdk.org/jdk/pull/21270#discussion_r1868892077