[Integrated] [lworld] RFR: 8246603: [lworld] C2 does not scalarize inline types wrapped into non-escaping box objects
Tobias Hartmann
thartmann at openjdk.java.net
Fri Jun 12 12:15:33 UTC 2020
On Tue, 9 Jun 2020 14:25:05 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
> C2 fails to scalarize inline types wrapped into non-inline, non-escaping (box) objects.
>
> For example, in TestLWorld::test109, C2 successfully scalar replaces the InterfaceBox object but not the LongWrapper
> object it contains, because of the complex control flow in LongWrapper::wrap. However, since LongWrapper is an inline
> type, we don't need to rely on Escape Analysis (EA) to scalar replace it. The problem is that LongWrapper is stored
> as an oop in a field of type WrapperInterface, and we don't keep track of the ValueType(Ptr)Node long enough (i.e.
> until after EA) for the load to be removed and the buffer allocation to go away. The fix contains the following
> changes:
> - Use ValueTypePtrNode instead of the oop whenever possible to keep track of field values. PhiNode::Ideal will then push
> such ValueTypePtrNode down and LoadNode::Identity will fold the loads.
> - Keep ValueTypePtrNodes alive so that loads can still be folded after EA has removed non-inline wrapper objects
> that prevented scalarization during parsing. Only remove them once EA is done.
> - Piggy-back on PhaseMacroExpand::eliminate_allocate_node to eliminate unused inline type allocations, and remove
> Allocate::Ideal, which is no longer needed (it also did not remove allocations that still had initializing stores).
> - Add code to remove the membar added after inline type allocation (it would otherwise block loop opts).
> - Make sure phis are always split if all inputs are MergeMems, to remove useless memory merges that block
> optimizations (see JDK-8247216).
> - Add regression tests and a benchmark (provided by Maurizio).
>
> Performance without fix:
>
> Benchmark                    Mode  Cnt  Score   Error  Units
> TestBoxing.pojo_loop         avgt   30  4.699 ± 0.045  ms/op
> TestBoxing.box_generic_loop  avgt   30  4.540 ± 0.058  ms/op
> TestBoxing.box_inline_loop   avgt   30  0.527 ± 0.009  ms/op
> TestBoxing.box_intf_loop     avgt   30  4.512 ± 0.050  ms/op
> TestBoxing.box_ref_loop      avgt   30  4.551 ± 0.037  ms/op
> TestBoxing.inline_loop       avgt   30  0.524 ± 0.013  ms/op
>
> Performance with fix:
>
> Benchmark                    Mode  Cnt  Score   Error  Units
> TestBoxing.pojo_loop         avgt   30  4.818 ± 0.166  ms/op
> TestBoxing.box_generic_loop  avgt   30  0.517 ± 0.007  ms/op
> TestBoxing.box_inline_loop   avgt   30  0.513 ± 0.007  ms/op
> TestBoxing.box_intf_loop     avgt   30  0.523 ± 0.024  ms/op
> TestBoxing.box_ref_loop      avgt   30  0.511 ± 0.010  ms/op
> TestBoxing.inline_loop       avgt   30  0.514 ± 0.012  ms/op
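
For readers unfamiliar with the pattern described in the quoted text, here is a rough Java sketch of an inline value wrapped in a non-escaping box and stored through an interface-typed field. It is not the actual TestLWorld::test109 or the TestBoxing benchmark. The type names InterfaceBox, LongWrapper and WrapperInterface come from the description above; the field `content`, the methods `box`, `value` and `sum`, and the body of `wrap` are assumptions made for illustration. LongWrapper is written as an ordinary final class so the sketch compiles on a stock JDK; in an lworld build it would presumably be declared as an inline type.

```java
// Illustrative sketch only, not the code from the changeset.
public class BoxScalarizationSketch {

    interface WrapperInterface {
        long value();
    }

    // Assumption: stands in for an inline type; a plain final class is used
    // here so that the example compiles and runs without a Valhalla build.
    static final class LongWrapper implements WrapperInterface {
        static final LongWrapper ZERO = new LongWrapper(0L);

        private final long val;

        private LongWrapper(long val) {
            this.val = val;
        }

        // Hypothetical branchy factory, standing in for the "complex control
        // flow" that keeps Escape Analysis from removing the allocation.
        static WrapperInterface wrap(long val) {
            if (val == 0L) {
                return ZERO;
            }
            return new LongWrapper(val);
        }

        @Override
        public long value() {
            return val;
        }
    }

    // Non-inline box whose field has the interface type, so the wrapped
    // value is stored as an ordinary reference (an oop) in the box.
    static final class InterfaceBox {
        final WrapperInterface content;

        InterfaceBox(WrapperInterface content) {
            this.content = content;
        }

        static InterfaceBox box(long val) {
            return new InterfaceBox(LongWrapper.wrap(val));
        }
    }

    // Neither the box nor the wrapper escapes this loop, so ideally no
    // allocation would remain after compilation.
    static long sum(long n) {
        long sum = 0;
        for (long i = 0; i < n; i++) {
            sum += InterfaceBox.box(i).content.value();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sum(1000)); // prints 499500
    }
}
```

On a stock JDK this only shows the shape of the code. Whether C2 actually removes the allocations depends on the VM build and flags, and would need to be checked with a benchmark such as the one quoted above.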
This pull request has now been integrated.
Changeset: f187e9db
Author: Tobias Hartmann <thartmann at openjdk.org>
URL: https://git.openjdk.java.net/valhalla/commit/f187e9db
Stats: 811 lines in 19 files changed: 93 ins; 653 del; 65 mod
8246603: [lworld] C2 does not scalarize inline types wrapped into non-escaping box objects
-------------
PR: https://git.openjdk.java.net/valhalla/pull/71