Discussion: 8172978: Remove Interpreter TOS optimization
Max Ockner
max.ockner at oracle.com
Wed Feb 15 22:18:50 UTC 2017
Hello all,
We have filed a bug to remove the interpreter stack caching optimization
for jdk10. Ideally we can make this change *early* during the jdk10
development cycle. See below for justification:
Bug: https://bugs.openjdk.java.net/browse/JDK-8172978
Stack caching has been around for a long time and is intended to replace
some of the load/store (pop/push) operations with corresponding register
operations. The need for this optimization arose before hardware caches
could adequately lessen the burden of memory access. We have reevaluated the
JVM stack caching optimization and have found that it has a high memory
footprint and is very costly to maintain, but does not provide
significant measurable or theoretical benefit for us when used with
modern hardware.
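To make the idea concrete, here is a minimal sketch of what top-of-stack caching buys in a toy stack machine. It is not HotSpot code: the three-instruction bytecode, the names runUncached/runCached, and the single cached/not-cached state are all assumptions made for illustration (the real interpreter tracks 10 tos states). The counters only tally accesses to the in-memory operand stack in this model and say nothing about real-world timing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical three-instruction bytecode, for illustration only.
enum class Op { Push1, Push2, Add };

struct Result {
    int value;           // value left on top of the stack at the end
    std::size_t memOps;  // loads and stores to the in-memory operand stack
};

// Plain interpreter: every push and every pop touches the in-memory stack.
Result runUncached(const std::vector<Op>& code) {
    std::vector<int> stack;
    std::size_t memOps = 0;
    auto push = [&](int v) { stack.push_back(v); ++memOps; };
    auto pop = [&]() {
        assert(!stack.empty());
        int v = stack.back();
        stack.pop_back();
        ++memOps;
        return v;
    };
    for (Op op : code) {
        switch (op) {
            case Op::Push1: push(1); break;
            case Op::Push2: push(2); break;
            case Op::Add: { int b = pop(); int a = pop(); push(a + b); break; }
        }
    }
    int v = pop();
    return {v, memOps};
}

// TOS-cached interpreter: the top value lives in a local ("register") when
// `cached` is true, so only values below it are spilled to memory.
Result runCached(const std::vector<Op>& code) {
    std::vector<int> stack;  // everything below the cached top
    std::size_t memOps = 0;
    int tos = 0;
    bool cached = false;     // assumption: two states instead of HotSpot's 10
    auto push = [&](int v) {
        if (cached) { stack.push_back(tos); ++memOps; }  // spill the old top
        tos = v;
        cached = true;
    };
    auto pop = [&]() {
        if (cached) { cached = false; return tos; }      // no memory access
        assert(!stack.empty());
        int v = stack.back();
        stack.pop_back();
        ++memOps;
        return v;
    };
    for (Op op : code) {
        switch (op) {
            case Op::Push1: push(1); break;
            case Op::Push2: push(2); break;
            case Op::Add: { int b = pop(); int a = pop(); push(a + b); break; }
        }
    }
    int v = pop();
    return {v, memOps};
}
```

For the sequence Push1, Push2, Add, both versions leave 3 on top of the stack; the uncached model performs 6 stack memory operations and the cached model performs 2. Whether that difference matters on modern hardware is exactly the question raised here.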
Minimal Theoretical Benefit.
Because modern hardware does not slap us with the same cost for
accessing memory as it once did, the benefit of replacing memory access
with register access is far less dramatic now than it once was.
Additionally, the interpreter runs for a relatively short time before
relevant code sections are compiled. When the VM starts running compiled
code instead of interpreted code, performance should begin to move
asymptotically towards that of compiled code, diluting any performance
penalties from the interpreter to small performance variations.
No Measurable Benefit.
Please see the results files attached to the bug page. This change was
adapted for x86 and SPARC, and interpreter performance was measured with
SPECjvm98 (run with -Xint). No significant decrease in performance was
observed.
Memory footprint and code complexity.
Stack caching in the JVM is implemented by switching the instruction
look-up table depending on the tos (top-of-stack) state. At any moment
there is an active table consisting of one dispatch table for each
of the 10 tos states. When we enter a safepoint, we copy all 10
safepoint dispatch tables into the active table. The additional entry
code makes this copy less efficient and makes any work in the
interpreter harder to debug.
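The table layout described above can be sketched roughly as follows. This is a simplified model, not the HotSpot source: the type names, the handler functions, and the assumption of 256 entries per table (one per possible opcode byte) are mine; only the count of 10 tos states comes from this message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kTosStates = 10;   // from the message above
constexpr std::size_t kBytecodes = 256;  // assumption: one entry per opcode byte

// Stand-in for the address of a generated bytecode handler.
using Entry = int (*)();
using Table = std::array<Entry, kBytecodes>;

// One dispatch table per tos state.
struct DispatchTables {
    std::array<Table, kTosStates> perState;
};

// Hypothetical handlers; the return value just tells them apart.
int normalHandler() { return 0; }
int safepointHandler() { return 1; }

// Toy setup: point every entry of every table at the same handler.
DispatchTables makeTables(Entry e) {
    DispatchTables t;
    for (auto& table : t.perState) table.fill(e);
    return t;
}

// Dispatch: the handler depends on both the bytecode and the current tos state.
Entry lookup(const DispatchTables& t, std::size_t tosState, unsigned char bytecode) {
    assert(tosState < kTosStates);
    return t.perState[tosState][bytecode];
}

// Entering a safepoint: overwrite the active tables with the safepoint tables.
// Returns the number of entries copied.
std::size_t copyTables(DispatchTables& active, const DispatchTables& src) {
    std::size_t copied = 0;
    for (std::size_t s = 0; s < kTosStates; ++s) {
        for (std::size_t b = 0; b < kBytecodes; ++b) {
            active.perState[s][b] = src.perState[s][b];
            ++copied;
        }
    }
    return copied;
}
```

In this model, switching to the safepoint tables copies 10 x 256 = 2560 entries. With a single table, the same switch would copy 256 entries, which is the kind of saving the list below refers to.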
If we remove this optimization, we will:
- decrease memory usage in the interpreter,
- eliminate wasteful memory transactions during safepoints,
- decrease code complexity (a lot).
Please let me know what you think.
Thanks,
Max
More information about the hotspot-dev
mailing list