Re: RFR: 8253049: Enhance itable_stub for AArch64 and x86_64

Kuai Wei kuaiwei.kw at alibaba-inc.com
Tue Sep 15 07:38:33 UTC 2020


Hi Vladimir,

  Thanks for your review.

  I updated my test cases in test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java. The tests do not inline the interface methods, so most of the CPU time is spent in the itable stub.
Each test runs 10 warmup iterations and 5 measurement iterations to produce one score, and I took 3 scores for every test.
  Below are the results on my machines. The slow loop shows a larger improvement over the original than the fast loop does.

aarch64:
=== testStubPoly3 ===
orig: 38430308.215 38438769.040 38325616.152
opt : 39425275.311 39626194.985 39374242.065

=== testStubPoly5 ===
orig: 23227433.053 23210843.937 23212518.073
opt : 23805995.657 23797837.061 23861764.978

=== testSlowStubPoly3 ===
orig: 30838750.839 30886603.202 30841314.152
opt : 36166775.967 36242733.807 36041506.263

=== testSlowStubPoly5 ===
orig: 18713218.115 18706994.686 18686729.040
opt : 21827549.808 21836822.173 21861920.069

x86:
=== testStubPoly3 ===
orig: 36339726.912 36322863.060 36363196.132
opt : 38631086.341 38465649.400 38466044.926

=== testStubPoly5 ===
orig: 22240149.674 22218724.450 22225970.358
opt : 23498941.840 23454580.221 23497053.570

=== testSlowStubPoly3 ===
orig: 28693696.199 28700714.257 28587900.429
opt : 34187319.519 34171321.762 34138648.599

=== testSlowStubPoly5 ===
orig: 17388480.977 17389247.386 17177206.666
opt : 20697609.518 20771108.051 20699215.655
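
For a quick read of the numbers, the relative gain can be computed from the mean of the three scores per configuration. The snippet below is only a small sketch for that arithmetic; the class and method names are made up for illustration, and the inputs are the aarch64 Poly3 scores copied from above.

```java
import java.util.Arrays;

public class ItableStubSpeedup {
    // Mean of the scores reported for one configuration.
    static double mean(double... scores) {
        return Arrays.stream(scores).average().orElseThrow();
    }

    // Relative improvement of opt over orig, in percent.
    static double gainPercent(double[] orig, double[] opt) {
        return (mean(opt) / mean(orig) - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // aarch64 scores copied from the results above
        double[] poly3Orig = {38430308.215, 38438769.040, 38325616.152};
        double[] poly3Opt  = {39425275.311, 39626194.985, 39374242.065};
        double[] slow3Orig = {30838750.839, 30886603.202, 30841314.152};
        double[] slow3Opt  = {36166775.967, 36242733.807, 36041506.263};

        System.out.printf("testStubPoly3:     %+.1f%%%n", gainPercent(poly3Orig, poly3Opt));
        System.out.printf("testSlowStubPoly3: %+.1f%%%n", gainPercent(slow3Orig, slow3Opt));
    }
}
```

On these inputs the fast-loop case gains roughly 3% and the slow-loop case roughly 17%.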

  I think lookup_interface_method can be reused as the fast path. It is also used by templateTable::invoke_interface and generate_method_handle_dispatch.
My slow path implementation needs more registers (6 so far), so I would need to check whether there is a register conflict in those methods. I'd like to keep a separate
slow path implementation. What do you think?

Thanks,
Kevin


------------------------------------------------------------------
From:Vladimir Ivanov <vladimir.x.ivanov at oracle.com>
Send Time: 2020-09-14 (Monday) 22:10
To:kuaiwei <github.com+1981974+kuaiwei at openjdk.java.net>; hotspot-dev <hotspot-dev at openjdk.java.net>; hotspot-compiler-dev <hotspot-compiler-dev at openjdk.java.net>
Subject:Re: RFR: 8253049: Enhance itable_stub for AArch64 and x86_64

Hi Kevin,

Very interesting observations. I like the idea to optimize for the case 
when REFC == DECC.

Fusing 2 passes over the itable into one does look attractive, but I'm 
not sure the proposed variant is correct. I suggest splitting the patch 
into 2 enhancements and handling them separately.

I'm curious what kind of benchmarks you used and what are the 
improvements observed with the patch.

One suggestion about the implementation:

src/hotspot/cpu/x86/macroAssembler_x86.cpp:

+void MacroAssembler::lookup_interface_method_in_stub(Register recv_klass,

I'd like to avoid having 2 independent implementations of itable lookup 
(MacroAssembler::lookup_interface_method_in_stub() and 
MacroAssembler::lookup_interface_method()). It would be nice to keep the 
implementation unified between itable and MethodHandle linkToInterface 
linker stubs.

What MacroAssembler::lookup_interface_method(..., true 
/*return_method*/) does is interface method lookup w/o proper subtype 
check and it is equivalent to fast loop in 
MacroAssembler::lookup_interface_method_in_stub().

As a possible path forward, you could introduce the fast path check 
first by moving it into VtableStubs::create_itable_stub() and using it 
to guard the first pass over the itable. That would make the type 
checking pass over the itable optional, based on a runtime check.

Then you could refactor MacroAssembler::lookup_interface_method() to 
optionally do REFC and DECC checks on every iteration and migrate 
VtableStubs::create_itable_stub()  and 
MethodHandles::generate_method_handle_dispatch() to it.
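
To illustrate the shape of such a unified lookup, here is a rough Java model. It is not HotSpot code: the class, method and parameter names are invented for the sketch, the itable is simplified to a list of interface/method-table pairs, and methods are plain strings. It only shows one pass that skips the REFC comparison when REFC == DECC and checks both otherwise.

```java
import java.util.List;

public class ItableLookupSketch {
    // Simplified itable entry: an implemented interface plus its method table.
    static final class Entry {
        final Class<?> iface;
        final String[] methods;
        Entry(Class<?> iface, String[] methods) { this.iface = iface; this.methods = methods; }
    }

    // One pass over the itable. When refc == decc only the DECC comparison is
    // done per entry (fast loop); otherwise both are checked in the same pass.
    static String lookup(List<Entry> itable, Class<?> refc, Class<?> decc, int index) {
        boolean refcSeen = (refc == decc); // a DECC hit doubles as the REFC check
        String[] table = null;
        for (Entry e : itable) {
            if (table == null && e.iface == decc) table = e.methods;
            if (!refcSeen && e.iface == refc) refcSeen = true;
            if (table != null && refcSeen) return table[index];
        }
        // Either the type check failed or the holder interface was not found.
        throw new IncompatibleClassChangeError("receiver does not implement the interface");
    }

    public static void main(String[] args) {
        List<Entry> itable = List.of(
            new Entry(Runnable.class, new String[] {"run"}),
            new Entry(Comparable.class, new String[] {"compareTo"}));
        System.out.println(lookup(itable, Runnable.class, Runnable.class, 0));   // REFC == DECC
        System.out.println(lookup(itable, Comparable.class, Runnable.class, 0)); // REFC != DECC
    }
}
```

Whether a fused loop like this preserves the exact behaviour of the two separate passes is the correctness question raised above; the model does not settle it.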

Best regards,
Vladimir Ivanov

On 14.09.2020 13:52, kuaiwei wrote:
> Currently itable_stub goes through the instanceKlass's itable twice to look up a method entry. The resolved klass is used for type
> checking and the method holder klass is used to find the method entry. In many cases, we observed that the resolved klass is the same as
> the holder klass, so we can improve the itable stub based on that. If they are the same klass, the stub uses a fast loop that checks only
> one klass. If not, a slow loop is used to check both klasses.
> 
> Even when entering the slow loop, the new implementation can be better than the old one in some cases, because the new stub only needs to go
> through the itable once, which reduces memory operations.
> 
> 
> bug: https://bugs.openjdk.java.net/browse/JDK-8253049
> 
> -------------
> 
> Commit messages:
>   - 8253049: Enhance itable_stub for AArch64 and x86_64
> 
> Changes: https://git.openjdk.java.net/jdk/pull/128/files
>   Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=128&range=00
>    Issue: https://bugs.openjdk.java.net/browse/JDK-8253049
>    Stats: 220 lines in 7 files changed: 172 ins; 35 del; 13 mod
>    Patch: https://git.openjdk.java.net/jdk/pull/128.diff
>    Fetch: git fetch https://git.openjdk.java.net/jdk pull/128/head:pull/128
> 
> PR: https://git.openjdk.java.net/jdk/pull/128
> 


