RFR: 8271820: Implementation of JEP 416: Reimplement Core Reflection with Method Handle [v12]
Mandy Chung
mchung at openjdk.java.net
Sat Oct 9 00:08:08 UTC 2021
On Fri, 8 Oct 2021 23:31:32 GMT, Mandy Chung <mchung at openjdk.org> wrote:
>> This reimplements core reflection with method handles.
>>
>> For `Constructor::newInstance` and `Method::invoke`, the new implementation uses `MethodHandle`. For `Field` accessor, the new implementation uses `VarHandle`. For the first few invocations of one of these reflective methods on a specific reflective object we invoke the corresponding method handle directly. After that we spin a dynamic bytecode stub defined in a hidden class which loads the target `MethodHandle` or `VarHandle` from its class data as a dynamically computed constant. Loading the method handle from a constant allows JIT to inline the method-handle invocation in order to achieve good performance.
>>
>> The VM's native reflection methods are needed during early startup, before the method-handle mechanism is initialized. That happens soon after System::initPhase1 and before System::initPhase2, after which we switch to using method handles exclusively.
>>
>> The core reflection and method handle implementation are updated to handle chained caller-sensitive method calls [1] properly. A caller-sensitive method can define with a caller-sensitive adapter method that will take an additional caller class parameter and the adapter method will be annotated with `@CallerSensitiveAdapter` for better auditing. See the detailed description from [2].
>>
>> Ran tier1-tier8 tests.
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8013527
>> [2] https://bugs.openjdk.java.net/browse/JDK-8271820?focusedCommentId=14439430&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14439430
>
> Mandy Chung has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix left-over assignment
[8cb8071](https://github.com/openjdk/jdk/pull/5027/commits/8cb8071d9a085349139215c8472730193650b247) adds the setup code to pollute the profile to avoid unwanted inlining in some cases. The benchmark numbers
are now sensible where the `Var` cases should not perform better than `Const` cases. I observed that if `polluteProfile` is false, some `Var` cases perform better than `Const` cases.
Updated performance result:
Baseline (jdk-18+17)
Benchmark Mode Cnt Score Error Units
ReflectionSpeedBenchmark.constructorConst avgt 10 68.049 ± 0.872 ns/op
ReflectionSpeedBenchmark.constructorPoly avgt 10 94.132 ± 1.805 ns/op
ReflectionSpeedBenchmark.constructorVar avgt 10 64.543 ± 0.799 ns/op
ReflectionSpeedBenchmark.instanceFieldConst avgt 10 35.361 ± 0.492 ns/op
ReflectionSpeedBenchmark.instanceFieldPoly avgt 10 67.089 ± 3.288 ns/op
ReflectionSpeedBenchmark.instanceFieldVar avgt 10 35.745 ± 0.554 ns/op
ReflectionSpeedBenchmark.instanceMethodConst avgt 10 77.925 ± 2.026 ns/op
ReflectionSpeedBenchmark.instanceMethodPoly avgt 10 96.094 ± 2.269 ns/op
ReflectionSpeedBenchmark.instanceMethodVar avgt 10 80.002 ± 4.267 ns/op
ReflectionSpeedBenchmark.staticFieldConst avgt 10 33.442 ± 2.659 ns/op
ReflectionSpeedBenchmark.staticFieldPoly avgt 10 51.918 ± 1.522 ns/op
ReflectionSpeedBenchmark.staticFieldVar avgt 10 33.967 ± 0.451 ns/op
ReflectionSpeedBenchmark.staticMethodConst avgt 10 75.380 ± 1.660 ns/op
ReflectionSpeedBenchmark.staticMethodPoly avgt 10 93.553 ± 1.037 ns/op
ReflectionSpeedBenchmark.staticMethodVar avgt 10 76.728 ± 1.614 ns/op
JEP 417
Benchmark Mode Cnt Score Error Units
ReflectionSpeedBenchmark.constructorConst avgt 10 32.392 ± 0.473 ns/op
ReflectionSpeedBenchmark.constructorPoly avgt 10 113.947 ± 1.205 ns/op
ReflectionSpeedBenchmark.constructorVar avgt 10 76.885 ± 1.128 ns/op
ReflectionSpeedBenchmark.instanceFieldConst avgt 10 18.569 ± 0.161 ns/op
ReflectionSpeedBenchmark.instanceFieldPoly avgt 10 98.671 ± 2.015 ns/op
ReflectionSpeedBenchmark.instanceFieldVar avgt 10 54.193 ± 3.510 ns/op
ReflectionSpeedBenchmark.instanceMethodConst avgt 10 33.421 ± 0.406 ns/op
ReflectionSpeedBenchmark.instanceMethodPoly avgt 10 109.129 ± 1.959 ns/op
ReflectionSpeedBenchmark.instanceMethodVar avgt 10 90.420 ± 2.187 ns/op
ReflectionSpeedBenchmark.staticFieldConst avgt 10 19.080 ± 0.179 ns/op
ReflectionSpeedBenchmark.staticFieldPoly avgt 10 92.130 ± 2.729 ns/op
ReflectionSpeedBenchmark.staticFieldVar avgt 10 53.899 ± 1.051 ns/op
ReflectionSpeedBenchmark.staticMethodConst avgt 10 35.907 ± 0.456 ns/op
ReflectionSpeedBenchmark.staticMethodPoly avgt 10 102.895 ± 1.604 ns/op
ReflectionSpeedBenchmark.staticMethodVar avgt 10 82.123 ± 0.629 ns/op
I also ran the following benchmarks which show no performance degradation:
- Peter's custom JSON serialization and deserialization benchmark with Jackson
- XStream converter type benchmark
- Kryo field serializer benchmark
Microbenchmarks show that the performance of the new implementation is significantly faster than the old implementation (43-57% faster). When `Field`, `Method`, and `Constructor` instances are held in
non-constant fields (e.g. non-final field or an array element), microbenchmarks show some performance degradation.
While the performance degradation is signifcant (51-77%) for field access when `Field` instances cannot be constant folded, it might not impact real-world applications.
We will need the developers to test with early-access builds to help identify any behavior and performance regressions. We will also explore a post-JEP integration performance improvements such as refining bytecode shapes for field access to enable concrete `MethodHandle` or `VarHandle` be reliably optimized by JIT irrespective of whether receiver is constant or not.
@plevart @cl4es if you agree, can you please do the (hopefully) final review.
-------------
PR: https://git.openjdk.java.net/jdk/pull/5027
More information about the core-libs-dev
mailing list