RFR: 8274983: C1 optimizes the invocation of private interface methods [v2]
Xin Liu
xliu at openjdk.java.net
Wed Nov 24 07:48:37 UTC 2021
> The root cause of the C1 regression is that some regexes generate multiple classes that all implement
> the same interface. In SlowStartupTest.java, the following **invokeinterface** executes frequently with different receivers. The target is a private interface method.
>
>
> 9: invokeinterface #25, 3 // InterfaceMethod java/util/regex/Pattern$BmpCharPredicate.lambda$union$2:(Ljava/util/regex/Pattern$CharPredicate;I)Z
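For readers who want to see how such a call target can arise from source code, here is a minimal sketch. It is not the JDK source: `CharPredicate`, `Digit`, `Lower`, `Upper` and `countMatches` are hypothetical names chosen for illustration. It assumes standard javac behavior, where a lambda that captures `this` inside a default method is desugared into a private synthetic instance method on the interface, similar in shape to `lambda$union$2` above.

```java
import java.util.List;

final class PrivateInterfaceLambdaSketch {
    /** Hypothetical stand-in for Pattern.CharPredicate; not the JDK source. */
    interface CharPredicate {
        boolean is(int ch);

        // The lambda captures `this`, so javac typically desugars its body into a
        // private synthetic instance method on the interface (e.g. lambda$union$0).
        default CharPredicate union(CharPredicate other) {
            return ch -> is(ch) || other.is(ch);
        }
    }

    // Several distinct receiver classes that all implement the same interface.
    static final class Digit implements CharPredicate {
        public boolean is(int ch) { return ch >= '0' && ch <= '9'; }
    }
    static final class Lower implements CharPredicate {
        public boolean is(int ch) { return ch >= 'a' && ch <= 'z'; }
    }
    static final class Upper implements CharPredicate {
        public boolean is(int ch) { return ch >= 'A' && ch <= 'Z'; }
    }

    /** Counts the characters of s accepted by at least one union predicate. */
    static int countMatches(String s) {
        CharPredicate digit = new Digit(), lower = new Lower(), upper = new Upper();
        // Each union's captured receiver has a different concrete class.
        List<CharPredicate> unions =
            List.of(digit.union(lower), lower.union(upper), upper.union(digit));
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            for (CharPredicate p : unions) {
                if (p.is(s.charAt(i))) { n++; break; }
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("aB3-_")); // prints 3
    }
}
```

Here the same private lambda body is reached with `Digit`, `Lower` and `Upper` receivers in turn, which is the "different receivers" situation the description refers to.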
>
>
> This patch allows C1 to generate an optimized virtual call for an invokeinterface
> whose target is a private interface method.
>
> Before JDK-823835, LambdaMetafactory generated invokespecial in this case. Because private
> interface methods cannot be overridden, C1 generated an optimized virtual call. After JDK-823835,
> LambdaMetafactory generates invokeinterface instead. C1 generates a regular virtual call because
> it does not recognize the new pattern. If multiple classes all implement the same interface,
> they can thrash the IC stub at runtime, each replacing it with its own concrete klass.
>
> An optimized virtual call uses relocInfo::opt_virtual_call_type (3). It calls the VM entry
> 'resolve_opt_virtual_call_C' once and resolves the target to the VEP of the nmethod.
> Therefore, this patch prevents the callsite from thrashing.
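The difference between the two kinds of call site can be pictured with a toy model. This is only a sketch and not how HotSpot implements inline caches: `InlineCacheSite`, `OptVirtualSite` and `simulate` are hypothetical names, and the model assumes a cache that remembers a single receiver class and re-resolves on every mismatch.

```java
final class CallSiteThrashSketch {
    /** Toy model of a call site guarded by one cached receiver class. */
    static final class InlineCacheSite {
        private Class<?> cachedKlass;
        int resolutions;

        void call(Object receiver) {
            // A receiver of another class fails the check and forces re-resolution.
            if (receiver.getClass() != cachedKlass) {
                cachedKlass = receiver.getClass();
                resolutions++;
            }
        }
    }

    /** Toy model of a statically bound call site: resolved on first use only. */
    static final class OptVirtualSite {
        private boolean resolved;
        int resolutions;

        void call(Object receiver) {
            // The target does not depend on the receiver class, so no check is needed.
            if (!resolved) {
                resolved = true;
                resolutions++;
            }
        }
    }

    /** Returns {inline-cache resolutions, optimized-call resolutions}. */
    static int[] simulate(int iterations) {
        Object[] receivers = { "s", 1, 2.0 }; // String, Integer, Double
        InlineCacheSite ic = new InlineCacheSite();
        OptVirtualSite opt = new OptVirtualSite();
        for (int i = 0; i < iterations; i++) {
            Object r = receivers[i % receivers.length];
            ic.call(r);
            opt.call(r);
        }
        return new int[] { ic.resolutions, opt.resolutions };
    }

    public static void main(String[] args) {
        int[] counts = simulate(9);
        System.out.println(counts[0] + " vs " + counts[1]); // prints "9 vs 1"
    }
}
```

With rotating receivers, the guarded site re-resolves on every call while the statically bound site resolves once, which mirrors the direction of the counter changes reported in this message.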
>
> Before this patch, SlowStartupTest incremented _resolve_invoke_virtual_cnt 38770 times and _handle_wrong_method_cnt 38695 times per 10k iterations. We use a fastdebug build for the comparison so that `C1Statistics` can be dumped.
>
>
> $java -XX:TieredStopAtLevel=1 -XX:+PrintC1Statistics SlowStartupTest 1
> Executed 10000 iterations in 736ms
> C1 Runtime statistics:
> _resolve_invoke_virtual_cnt: 38770
> _resolve_invoke_opt_virtual_cnt: 186
> _resolve_invoke_static_cnt: 44
> _handle_wrong_method_cnt: 38695
> _ic_miss_cnt: 35
>
>
> With this patch, _handle_wrong_method_cnt is triggered only once, at the cost of 3 more `_resolve_invoke_opt_virtual_cnt` events. The total runtime drops from 736ms to 9ms.
>
>
> $java -XX:TieredStopAtLevel=1 -XX:+PrintC1Statistics SlowStartupTest 1
> Executed 10000 iterations in 9ms
> C1 Runtime statistics:
> _resolve_invoke_virtual_cnt: 77
> _resolve_invoke_opt_virtual_cnt: 189
> _resolve_invoke_static_cnt: 45
> _handle_wrong_method_cnt: 1
> _ic_miss_cnt: 39
>
>
> Codegen-wise, before the patch, C1 generates the following LIR for an invokeinterface whose target is a private interface method.
>
> __bci__use__tid____instr____________________________________
> . 1    0    v2     a1.invokeinterface()
>                    InvokePrivateInterfaceMethod$I.bar()V
> . 6    0    v3     return
>
>
> With this patch, C1 generates the LIR below. It first checks that a1 is a subtype of `InvokePrivateInterfaceMethod$I`. If so, an optimized virtual call is generated. The callsite is fixed up exactly once at runtime.
>
> __bci__use__tid____instr____________________________________
> . 1    1    a2     checkcast(a1) InvokePrivateInterfaceMethod$I
>                    stack [0:a1]
> . 1    0    v3     a2.invokeinterface()
>                    InvokePrivateInterfaceMethod$I.bar()V
> . 6    0    v4     return
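The test source behind this dump is not included in this message. As a rough, hypothetical reconstruction of the shape it exercises, the sketch below declares a private interface method and calls it through several implementers. The names `callBar`, `A`, `B` and `sum` are assumptions, and `bar` returns an `int` here (rather than `void` as in the dump) only so the result can be checked. As far as I know, javac emits invokeinterface for such a call when targeting recent class file versions, but that detail is also an assumption here.

```java
final class InvokePrivateInterfaceMethodSketch {
    interface I {
        // A private interface method cannot be overridden by implementers,
        // so a call to it always has exactly one possible target.
        private int bar() { return 42; }

        // Calls the private method on whatever receiver implements I.
        default int callBar() { return bar(); }
    }

    // Two unrelated receiver classes sharing the same interface.
    static final class A implements I { }
    static final class B implements I { }

    /** Invokes bar() through both receivers and adds up the results. */
    static int sum() {
        I[] receivers = { new A(), new B() };
        int total = 0;
        for (I r : receivers) {
            total += r.callBar();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum()); // prints 84
    }
}
```

Because `bar` has a single possible target regardless of whether the receiver is `A` or `B`, a compiler can bind the call statically once it has established that the receiver implements `I`.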
Xin Liu has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
- 8274983: C1 optimizes the invocation of private interface methods
- Merge branch 'master' into JDK-8274983
- 8274983: Pattern.matcher performance regression after JDK-823835
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/6445/files
- new: https://git.openjdk.java.net/jdk/pull/6445/files/5a00e1f7..acf7b9f8
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6445&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6445&range=00-01
Stats: 11800 lines in 386 files changed: 9368 ins; 1058 del; 1374 mod
Patch: https://git.openjdk.java.net/jdk/pull/6445.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/6445/head:pull/6445
PR: https://git.openjdk.java.net/jdk/pull/6445