Discussion: 8172978: Remove Interpreter TOS optimization
Daniel D. Daugherty
daniel.daugherty at oracle.com
Thu Feb 16 15:40:49 UTC 2017
Hi Max,
Added a note to your bug. Interesting idea, but I think your data is
a bit incomplete at the moment.
Dan
On 2/15/17 3:18 PM, Max Ockner wrote:
> Hello all,
>
> We have filed a bug to remove the interpreter stack caching
> optimization for jdk10. Ideally we can make this change *early*
> during the jdk10 development cycle. See below for justification:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8172978
>
> Stack caching has been around for a long time and is intended to
> replace some of the load/store (pop/push) operations with
> corresponding register operations. The need for this optimization
> arose before hardware caching could adequately lessen the burden of memory
> access. We have reevaluated the JVM stack caching optimization and
> have found that it has a high memory footprint and is very costly to
> maintain, but does not provide significant measurable or theoretical
> benefit for us when used with modern hardware.
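> 
> To make the mechanism concrete, here is a rough sketch of what the
> optimization buys at the bytecode level (simplified C++ pseudocode for
> an iadd-style bytecode, not the actual HotSpot template code):
> 
>     // Without TOS caching: both operands and the result go through
>     // the in-memory expression stack.
>     void do_iadd_plain(int*& sp) {
>       int b = *sp++;      // pop right operand
>       int a = *sp++;      // pop left operand
>       *--sp = a + b;      // push result back to memory
>     }
> 
>     // With TOS caching: the top-of-stack value is kept in a register
>     // ('tos' here), so one pop and the push are elided.
>     void do_iadd_cached(int*& sp, int& tos /* lives in a register */) {
>       int a = *sp++;      // only the second operand comes from memory
>       tos = a + tos;      // result stays cached for the next bytecode
>     }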
>
> Minimal Theoretical Benefit.
> Because modern hardware does not slap us with the same cost for
> accessing memory as it once did, the benefit of replacing memory
> access with register access is far less dramatic now than it once was.
> Additionally, the interpreter runs for a relatively short time before
> relevant code sections are compiled. When the VM starts running
> compiled code instead of interpreted code, performance should begin to
> move asymptotically towards that of compiled code, diluting any
> performance penalty from the interpreter to a small overall
> variation.
>
> No Measurable Benefit.
> Please see the results files attached to the bug. This change
> was adapted for x86 and SPARC, and interpreter performance was
> measured with SPECjvm98 (run with -Xint). No significant decrease in
> performance was observed.
>
> Memory footprint and code complexity.
> Stack caching in the JVM is implemented by switching the instruction
> look-up table depending on the tos (top-of-stack) state. At any moment
> there is an active table consisting of one dispatch table for each
> of the 10 tos states. When we enter a safepoint, we copy all 10
> safepoint dispatch tables into the active table. The additional entry
> code makes this copy less efficient and makes any work in the
> interpreter harder to debug.
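> 
> For reference, a simplified sketch of that machinery (names modeled on
> the real TemplateInterpreter code, but reduced to the bare idea; the
> actual tables hold generated entry points per bytecode):
> 
>     typedef void (*EntryPoint)();          // code stub for one bytecode
>     const int number_of_tos_states = 10;   // btos, ztos, ..., vtos
>     const int number_of_bytecodes  = 256;
> 
>     // One dispatch row per tos state: 10 x 256 entry points.
>     struct DispatchTable {
>       EntryPoint _entry[number_of_tos_states][number_of_bytecodes];
>     };
> 
>     DispatchTable _active_table;  // indexed by the running interpreter
>     DispatchTable _normal_table;  // regular entries
>     DispatchTable _safept_table;  // entries that also poll for a safepoint
> 
>     // Entering a safepoint copies all 10 rows over the active table;
>     // leaving it copies the normal entries back.
>     void notice_safepoints() { _active_table = _safept_table; }
>     void ignore_safepoints() { _active_table = _normal_table; }
> 
> Without the tos-state dimension each table collapses to a single row,
> so the safepoint copy above shrinks by roughly a factor of 10.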
>
> If we remove this optimization, we will:
> - decrease memory usage in the interpreter,
> - eliminate wasteful memory transactions during safepoints,
> - decrease code complexity (a lot).
>
> Please let me know what you think.
> Thanks,
> Max
>