Integrated: 8290688: Optimize x86_64 nmethod entry barriers

Erik Österlund eosterlund at openjdk.org
Fri Jul 22 14:45:19 UTC 2022


On Wed, 20 Jul 2022 11:51:08 GMT, Erik Österlund <eosterlund at openjdk.org> wrote:

> The current x86_64 nmethod entry barrier is good, but it could be a bit better. In particular, this enhancement targets the following ideas.
> 
> 1. The alignment of the cmp instruction is 8 bytes. However, we only patch 4 bytes and the instruction length is always 8 bytes. So if we align the start of the instruction to 4 bytes only, that is enough to ensure that the immediate part of the instruction is 4 byte aligned, which is all we need (cf. http://cr.openjdk.java.net/~jrose/jvm/hotspot-cmc.html).
> 
> 2. Today the fast path (conditionally) jumps over a call to a stub. It is not uncommon for the branch-not-taken path to be better optimized, making it favourable to move the call to the stub out-of-line. This has the additional benefit of not polluting the instruction caches at the nmethod entry with instructions not used in the fast path. It is a bit messy, but we can do it for at least C2 code.
> 
> 3. For C1 and native wrappers, I don't think they are hot enough to warrant the stub machinery. But at least the jump that skips over the cold stuff can be shortened. I can get behind that.
> 
> Before this change, turning nmethod entry barriers on with G1 (e.g. by enabling Loom) led to a regression in DaCapo tradesoap-large. With this enhancement, the regression goes away, so the cost of nmethod entry barriers is no longer visible.
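To illustrate the shape being discussed, here is a minimal C++ model of an nmethod entry barrier: a guard value compared against a global "disarmed" value on entry, with a mismatch diverting to a cold, out-of-line slow path (idea 2 above). All names here (Nmethod, guard, slow_path) are hypothetical; the real barrier is machine code emitted by the JIT, with the guard living as a patched 4-byte immediate in the cmp instruction (idea 1 above).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model only: in HotSpot the "guard" is a 4-byte immediate
// patched inside an 8-byte cmp instruction at the nmethod entry.
static int32_t global_disarmed_value = 1;

struct Nmethod {
    int32_t guard;       // stands in for the patched immediate
    int slow_path_calls; // counts trips through the cold path
};

// Cold, out-of-line slow path: in HotSpot this is a stub that calls
// into the runtime and then re-disarms the barrier for this nmethod.
static void slow_path(Nmethod& nm) {
    nm.slow_path_calls++;
    nm.guard = global_disarmed_value; // disarm until re-armed
}

// Fast path: a single compare; the taken branch is the rare (armed) case,
// so the cold call sits out-of-line and stays out of the entry's icache lines.
static void enter(Nmethod& nm) {
    if (nm.guard != global_disarmed_value) {
        slow_path(nm);
    }
    // ... method body runs here ...
}
```

Arming every compiled method then amounts to bumping global_disarmed_value (or, in the real implementation, patching each guard immediate), after which the next entry into any nmethod takes the slow path exactly once.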

This pull request has now been integrated.

Changeset: b28f9dab
Author:    Erik Österlund <eosterlund at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/b28f9dab80bf5d4de89942585c1ed7bb121d9cbd
Stats:     162 lines in 16 files changed: 147 ins; 1 del; 14 mod

8290688: Optimize x86_64 nmethod entry barriers

Reviewed-by: kvn, rrich

-------------

PR: https://git.openjdk.org/jdk/pull/9569


More information about the hotspot-dev mailing list