RFR: 8305959: Improve itable_stub

Boris Ulasevich bulasevich at openjdk.org
Fri May 5 18:16:20 UTC 2023


On Thu, 13 Apr 2023 14:33:52 GMT, Boris Ulasevich <bulasevich at openjdk.org> wrote:

> Async profiler shows that applications spend up to 10% in itable_stubs.
> 
> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found.
> 
> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675).

Hi Andrew. Thank you.

The goal of this PR is to refactor the repetitive code in the itable stubs, which can spend a significant amount of time scanning itables. I started looking into this because profiling shows that some applications spend up to 10% of their time in this code.

The itable assembly stubs contain repetitive code - the current algorithm gets offsets and iterates over the itable data twice. I propose to do both lookups in a single pass over the interface table: once we have retrieved the interface klass pointer, we can perform both checks on it.

So the new algorithm consists of two loops. In the first loop, we look for a match to resolved_klass, also checking each entry against holder_klass along the way. If holder_klass has not been found by then, the second loop continues iterating over the itable from where the first loop stopped, checking only for a match with holder_klass.
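
To make the difference between the two approaches concrete, here is a minimal Java model of both lookups. It is only a sketch under stated assumptions, not the generated stub: the class ItableSketch, the Entry type, the method names and the NOT_FOUND result are all hypothetical, the itable is modelled as a plain array of (interface, offset) pairs with non-negative offsets, and a failed lookup returns -1 where the real stub would take its failure path.

```java
// Hypothetical model of an itable lookup; not HotSpot code.
final class ItableSketch {
    /** Stand-in for one itable entry: an interface identity and a non-negative offset. */
    static final class Entry {
        final Object iface;
        final int offset;
        Entry(Object iface, int offset) {
            this.iface = iface;
            this.offset = offset;
        }
    }

    /** Assumed failure marker for this sketch only. */
    static final int NOT_FOUND = -1;

    /** Baseline: scan once for resolved (subtype check), then scan again for holder. */
    static int lookupTwoPass(Entry[] itable, Object resolved, Object holder) {
        boolean isSubtype = false;
        for (Entry e : itable) {
            if (e.iface == resolved) { isSubtype = true; break; }
        }
        if (!isSubtype) return NOT_FOUND;
        for (Entry e : itable) {
            if (e.iface == holder) return e.offset;
        }
        return NOT_FOUND;
    }

    /** Proposed shape: one walk over the table, split into two loops. */
    static int lookupOnePass(Entry[] itable, Object resolved, Object holder) {
        int holderOffset = NOT_FOUND;
        int i = 0;
        // Loop 1: search for resolved, remembering holder if it is seen on the way.
        for (; i < itable.length; i++) {
            Entry e = itable[i];
            if (e.iface == holder) holderOffset = e.offset;
            if (e.iface == resolved) break;
        }
        if (i == itable.length) return NOT_FOUND;        // receiver does not implement resolved
        if (holderOffset != NOT_FOUND) return holderOffset;
        // Loop 2: continue after the resolved entry, checking holder only.
        for (i++; i < itable.length; i++) {
            if (itable[i].iface == holder) return itable[i].offset;
        }
        return NOT_FOUND;
    }
}
```

In this model both methods return the same result for any table, but lookupOnePass touches each entry at most once.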

This way the itable lookup itself can become almost twice as fast, since the table is walked once instead of twice.

Here are some numbers on the OpenJDK micro-benchmarks, which were also enhanced as part of this PR. The columns are: ns/op before | ns/op after | difference (positive means faster).


CPU: Intel Xeon Platinum 8268
InterfaceCalls.test1stInt2Types    3.049    3.051   -0.07%
InterfaceCalls.test1stInt3Types    7.287    6.782    6.93%
InterfaceCalls.test1stInt5Types    7.324    6.596    9.94%
InterfaceCalls.test2ndInt2Types    3.542    3.456    2.43%
InterfaceCalls.test2ndInt3Types    8.234    7.376   10.42%
InterfaceCalls.test2ndInt5Types    8.349    7.425   11.07%
InterfaceCalls.testIfaceCall      35.035   29.413   16.05%
InterfaceCalls.testIfaceExtCall   40.061   32.32    19.31%
InterfaceCalls.testMonomorphic     2.644    2.652   -0.30%
geomean                            8.081    7.382    8.65%
           
CPU: AMD EPYC 7502P
InterfaceCalls.test1stInt2Types    5.157    5.135    0.43%
InterfaceCalls.test1stInt3Types    9.882    9.807    0.76%
InterfaceCalls.test1stInt5Types    9.864    9.802    0.63%
InterfaceCalls.test2ndInt2Types    6.664    5.432   18.49%
InterfaceCalls.test2ndInt3Types   10.411   10.046    3.51%
InterfaceCalls.test2ndInt5Types   10.49    10.075    3.96%
InterfaceCalls.testIfaceCall      46.789   46.72     0.15%
InterfaceCalls.testIfaceExtCall   50.724   46.55     8.23%
InterfaceCalls.testMonomorphic     4.823    4.826   -0.06%
geomean                           11.724   11.233    4.19%

CPU: i7-1160G7
InterfaceCalls.test1stInt2Types    2.822    2.748    2.62%
InterfaceCalls.test1stInt3Types    5.701    5.309    6.88%
InterfaceCalls.test1stInt5Types    5.741    5.349    6.83%
InterfaceCalls.test2ndInt2Types    2.892    2.898   -0.21%
InterfaceCalls.test2ndInt3Types    6.666    5.858   12.12%
InterfaceCalls.test2ndInt5Types    6.686    5.851   12.49%
InterfaceCalls.testIfaceCall      26.992   24.302    9.97%
InterfaceCalls.testIfaceExtCall   33.12    27.053   18.32%
InterfaceCalls.testMonomorphic     2.415    2.455   -1.66%
geomean                            6.657    6.145    7.69%
           
CPU: i5-3320M
InterfaceCalls.test1stInt2Types   11.551   11.291    2.25%
InterfaceCalls.test1stInt3Types   65.911   34.574   47.54%
InterfaceCalls.test1stInt5Types   65.78    40.923   37.79%
InterfaceCalls.test2ndInt2Types   14.088   13.431    4.66%
InterfaceCalls.test2ndInt3Types   41.186   37.223    9.62%
InterfaceCalls.test2ndInt5Types   47.237   42.74     9.52%
InterfaceCalls.testIfaceCall     285.568  163.311   42.81%
InterfaceCalls.testIfaceExtCall  304.335  284.027    6.67%
InterfaceCalls.testMonomorphic    10.074    9.673    3.98%
geomean                           47.373   37.681   20.46%
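
For readers who want to recompute the last column: the figures above appear consistent with difference = (before - after) / before, and with the geomean row being the geometric mean of the nine rows above it. The helper below is a small sketch under that assumption; the class and method names are hypothetical and not part of the PR.

```java
// Hypothetical helper for re-deriving the "difference" and "geomean" figures.
final class BenchDelta {
    /** Relative improvement in percent; positive means "after" is faster. */
    static double improvementPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    /** Geometric mean of positive values, computed via logarithms. */
    static double geomean(double[] values) {
        double logSum = 0.0;
        for (double v : values) {
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }
}
```

For example, improvementPercent(7.287, 6.782) is about 6.93, matching the test1stInt3Types row for the Xeon Platinum 8268.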

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13460#issuecomment-1536607523

