Re: RFR: 8253049: Enhance itable_stub for AArch64 and x86_64
Kuai Wei
kuaiwei.kw at alibaba-inc.com
Tue Sep 15 07:38:33 UTC 2020
Hi Vladimir,
Thanks for your review.
I updated my test cases in test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java. The tests do not inline the interface methods, so most of the CPU time is spent in itable_stub.
Each test runs 10 warmup iterations and 5 measurement iterations to produce one score. I took 3 scores for each test.
Below are the results on my machines. The slow-loop tests show a larger improvement over the original than the fast-loop tests do.
aarch64:
=== testStubPoly3 ===
orig: 38430308.215 38438769.040 38325616.152
opt : 39425275.311 39626194.985 39374242.065
=== testStubPoly5 ===
orig: 23227433.053 23210843.937 23212518.073
opt : 23805995.657 23797837.061 23861764.978
=== testSlowStubPoly3 ===
orig: 30838750.839 30886603.202 30841314.152
opt : 36166775.967 36242733.807 36041506.263
=== testSlowStubPoly5 ===
orig: 18713218.115 18706994.686 18686729.040
opt : 21827549.808 21836822.173 21861920.069
x86:
=== testStubPoly3 ===
orig: 36339726.912 36322863.060 36363196.132
opt : 38631086.341 38465649.400 38466044.926
=== testStubPoly5 ===
orig: 22240149.674 22218724.450 22225970.358
opt : 23498941.840 23454580.221 23497053.570
=== testSlowStubPoly3 ===
orig: 28693696.199 28700714.257 28587900.429
opt : 34187319.519 34171321.762 34138648.599
=== testSlowStubPoly5 ===
orig: 17388480.977 17389247.386 17177206.666
opt : 20697609.518 20771108.051 20699215.655
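As a rough cross-check of the scores above, the relative gain per test can be computed from the mean of the three scores. This is only a sketch: the class and method names are made up for illustration, and it assumes that a higher score is better, which the listing above does not state explicitly.

```java
// Rough cross-check of the scores listed above.
// Assumption: higher scores are better (throughput-style results).
final class SpeedupSketch {
    static final String[] NAMES = {
        "aarch64 testStubPoly3", "aarch64 testStubPoly5",
        "aarch64 testSlowStubPoly3", "aarch64 testSlowStubPoly5",
        "x86 testStubPoly3", "x86 testStubPoly5",
        "x86 testSlowStubPoly3", "x86 testSlowStubPoly5",
    };
    // Scores copied from the "orig" rows, in the same order as NAMES.
    static final double[][] ORIG = {
        {38430308.215, 38438769.040, 38325616.152},
        {23227433.053, 23210843.937, 23212518.073},
        {30838750.839, 30886603.202, 30841314.152},
        {18713218.115, 18706994.686, 18686729.040},
        {36339726.912, 36322863.060, 36363196.132},
        {22240149.674, 22218724.450, 22225970.358},
        {28693696.199, 28700714.257, 28587900.429},
        {17388480.977, 17389247.386, 17177206.666},
    };
    // Scores copied from the "opt" rows, in the same order as NAMES.
    static final double[][] OPT = {
        {39425275.311, 39626194.985, 39374242.065},
        {23805995.657, 23797837.061, 23861764.978},
        {36166775.967, 36242733.807, 36041506.263},
        {21827549.808, 21836822.173, 21861920.069},
        {38631086.341, 38465649.400, 38466044.926},
        {23498941.840, 23454580.221, 23497053.570},
        {34187319.519, 34171321.762, 34138648.599},
        {20697609.518, 20771108.051, 20699215.655},
    };

    // Arithmetic mean of one row of scores.
    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Gain of opt over orig in percent, based on the mean scores.
    static double gainPercent(double[] orig, double[] opt) {
        return (mean(opt) / mean(orig) - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        for (int i = 0; i < NAMES.length; i++) {
            System.out.printf("%-28s %+.1f%%%n", NAMES[i], gainPercent(ORIG[i], OPT[i]));
        }
    }
}
```

By this arithmetic the testStubPoly cases gain roughly 3% to 6% and the testSlowStubPoly cases roughly 17% to 20%, on both platforms.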
I think lookup_interface_method can be reused as the fast path. It is also used by templateTable::invoke_interface and generate_method_handle_dispatch.
My slow-path implementation needs more registers (6 so far), so I need to check whether there are register conflicts in these methods. I'd like to keep a separate
slow path implementation. What do you think?
Thanks,
Kevin
------------------------------------------------------------------
From:Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
Send Time:Monday, September 14, 2020 22:10
To:kuaiwei <github.com+1981974+kuaiwei at openjdk.java.net>; hotspot-dev <hotspot-dev at openjdk.java.net>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
Subject:Re: RFR: 8253049: Enhance itable_stub for AArch64 and x86_64
Hi Kevin,
Very interesting observations. I like the idea to optimize for the case
when REFC == DECC.
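For readers less familiar with the terms, the two cases can be illustrated at the Java source level. The sketch below uses hypothetical types (Base, Derived, Impl1..Impl3) that are not taken from the patch or from InterfaceCalls.java. It relies on javac naming the static type of the receiver expression in the invokeinterface instruction, so a call through a sub-interface that merely inherits the method has REFC = Derived and DECC = Base. Whether a given call actually ends up in the itable stub depends on what the JIT does with the call site.

```java
// Hypothetical types, for illustration only.
final class RefcDeccSketch {
    interface Base { int value(); }
    interface Derived extends Base { }  // inherits value() without redeclaring it

    static final class Impl1 implements Derived { public int value() { return 1; } }
    static final class Impl2 implements Derived { public int value() { return 2; } }
    static final class Impl3 implements Derived { public int value() { return 3; } }

    // The call site names Base.value(): REFC and DECC are both Base.
    static int callViaBase(Base b) {
        return b.value();
    }

    // The call site names Derived.value(), which resolves to the method
    // declared in Base: REFC is Derived, DECC is Base.
    static int callViaDerived(Derived d) {
        return d.value();
    }

    // Rotating several receiver types through one call site keeps it polymorphic.
    static int sumViaBase(Base[] receivers) {
        int sum = 0;
        for (Base b : receivers) {
            sum += callViaBase(b);
        }
        return sum;
    }

    static int sumViaDerived(Derived[] receivers) {
        int sum = 0;
        for (Derived d : receivers) {
            sum += callViaDerived(d);
        }
        return sum;
    }

    public static void main(String[] args) {
        Derived[] receivers = { new Impl1(), new Impl2(), new Impl3() };
        System.out.println(sumViaBase(receivers));
        System.out.println(sumViaDerived(receivers));
    }
}
```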
Fusing 2 passes over the itable into one does look attractive, but I'm
not sure the proposed variant is correct. I suggest splitting the patch
into 2 enhancements and handling them separately.
I'm curious what kind of benchmarks you used and what improvements
you observed with the patch.
One suggestion about the implementation:
src/hotspot/cpu/x86/macroAssembler_x86.cpp:
+void MacroAssembler::lookup_interface_method_in_stub(Register recv_klass,
I'd like to avoid having 2 independent implementations of itable lookup
(MacroAssembler::lookup_interface_method_in_stub() and
MacroAssembler::lookup_interface_method()). It would be nice to keep the
implementation unified between itable and MethodHandle linkToInterface
linker stubs.
What MacroAssembler::lookup_interface_method(..., true
/*return_method*/) does is interface method lookup w/o proper subtype
check and it is equivalent to fast loop in
MacroAssembler::lookup_interface_method_in_stub().
As a possible path forward, you could introduce the fast path check
first by moving it into VtableStubs::create_itable_stub() and using it
to guard the first pass over the itable. It would make the type checking
pass over the itable optional based on a runtime check.
Then you could refactor MacroAssembler::lookup_interface_method() to
optionally do REFC and DECC checks on every iteration and migrate
VtableStubs::create_itable_stub() and
MethodHandles::generate_method_handle_dispatch() to it.
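To make the difference between the lookup strategies concrete, here is a simplified Java model. It is not HotSpot code and not the code from the patch: the itable is reduced to two parallel arrays (interface name and a non-negative method-table offset), and all names are hypothetical. It shows the existing two-pass shape next to a fused variant with a fast loop for REFC == DECC and a single slow loop otherwise.

```java
// Simplified model of the lookup strategies; not HotSpot code.
// Assumption: the itable is modelled as parallel arrays and offsets are >= 0.
final class ItableLookupSketch {
    // Two-pass shape: pass 1 checks that the receiver implements REFC,
    // pass 2 finds the offset recorded for DECC.
    static int lookupTwoPass(String[] ifaces, int[] offsets, String refc, String decc) {
        boolean implementsRefc = false;
        for (String iface : ifaces) {
            if (iface.equals(refc)) {
                implementsRefc = true;
                break;
            }
        }
        if (!implementsRefc) {
            throw new IncompatibleClassChangeError(refc);
        }
        for (int i = 0; i < ifaces.length; i++) {
            if (ifaces[i].equals(decc)) {
                return offsets[i];
            }
        }
        throw new IncompatibleClassChangeError(decc);
    }

    // Fused shape: one scan in either case.
    static int lookupFused(String[] ifaces, int[] offsets, String refc, String decc) {
        if (refc.equals(decc)) {
            // Fast loop: finding the entry is both the type check and the lookup.
            for (int i = 0; i < ifaces.length; i++) {
                if (ifaces[i].equals(refc)) {
                    return offsets[i];
                }
            }
            throw new IncompatibleClassChangeError(refc);
        }
        // Slow loop: track both klasses during a single scan.
        boolean sawRefc = false;
        int deccOffset = -1;
        for (int i = 0; i < ifaces.length; i++) {
            if (ifaces[i].equals(refc)) {
                sawRefc = true;
            }
            if (ifaces[i].equals(decc)) {
                deccOffset = offsets[i];
            }
            if (sawRefc && deccOffset >= 0) {
                return deccOffset;
            }
        }
        throw new IncompatibleClassChangeError(sawRefc ? decc : refc);
    }

    public static void main(String[] args) {
        String[] ifaces = {"A", "B", "C"};
        int[] offsets = {0, 8, 16};
        System.out.println(lookupTwoPass(ifaces, offsets, "C", "A"));
        System.out.println(lookupFused(ifaces, offsets, "C", "A"));
    }
}
```

In this model both variants return the same offset and fail for the same inputs. Whether the generated stubs preserve that equivalence is the question to settle in review.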
Best regards,
Vladimir Ivanov
On 14.09.2020 13:52, kuaiwei wrote:
> Currently itable_stub goes through the instanceKlass's itable twice to look up a method entry. The resolved klass is used for type
> checking and the method holder klass is used to find the method entry. In many cases, we observed that the resolved klass is the same as
> the holder klass, so we can improve the itable stub based on that. If they are the same klass, the stub uses a fast loop that checks only
> one klass. If not, a slow loop is used to check both klasses.
>
> Even when it enters the slow loop, the new implementation can be better than the old one in some cases, because the new stub only needs to go
> through the itable once, which reduces memory operations.
>
>
> bug: https://bugs.openjdk.java.net/browse/JDK-8253049
>
> -------------
>
> Commit messages:
> - 8253049: Enhance itable_stub for AArch64 and x86_64
>
> Changes: https://git.openjdk.java.net/jdk/pull/128/files
> Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=128&range=00
> Issue: https://bugs.openjdk.java.net/browse/JDK-8253049
> Stats: 220 lines in 7 files changed: 172 ins; 35 del; 13 mod
> Patch: https://git.openjdk.java.net/jdk/pull/128.diff
> Fetch: git fetch https://git.openjdk.java.net/jdk pull/128/head:pull/128
>
> PR: https://git.openjdk.java.net/jdk/pull/128
>