Integrated: 8353558: x86: Use better instructions for ICache sync when available
Aleksey Shipilev
shade at openjdk.org
Thu Apr 24 07:01:10 UTC 2025
On Wed, 2 Apr 2025 18:42:03 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
> For Leyden, which wants to load a lot of code as fast as it can, code cache flush costs are now a significant part of the picture. There are single-digit percent startup time opportunities in better ICache syncs.
>
> It is not sufficiently clear why icache flushes are needed on x86. Intel/AMD manuals say the instruction caches are fully coherent. The GCC implementation of `__builtin___clear_cache` is empty. It looks like a single serializing instruction such as `cpuid` might be enough for the entire flush, which is what our `OrderAccess::cross_modify_fence` does. Still, we can maintain the old behavior while flushing the caches smarter: CLFLUSHOPT and CLWB are available on modern x86.
>
> See more discussion and references in the RFE. The performance data is in the comments in this PR.
>
> Additional testing:
> - [x] Linux x86_64 server fastdebug, `all`
> - [x] Linux x86_64 server fastdebug, `all` + `X86ICacheSync={0,1,2,3,4}`
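
For readers unfamiliar with range-based flushing, here is a minimal sketch of the line-by-line walk that a CLFLUSHOPT- or CLWB-based sync would perform. It is not the code from this changeset: `flush_range`, `kCacheLineSize`, and the 64-byte line size are assumptions made for illustration, and the per-line operation is passed in as a callback so the sketch stays portable.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, which is typical for current x86 parts.
// Real code would query the line size from CPUID instead of hard-coding it.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical helper: calls flush_line(addr) once for every cache line that
// overlaps [start, start + size) and returns the number of lines visited.
// On x86 the callback could wrap _mm_clflush, _mm_clflushopt, or _mm_clwb
// from <immintrin.h>, followed by a fence after the loop where required.
template <typename FlushLine>
std::size_t flush_range(const void* start, std::size_t size, FlushLine flush_line) {
  if (size == 0) {
    return 0;
  }
  const std::uintptr_t mask = ~static_cast<std::uintptr_t>(kCacheLineSize - 1);
  // Round the start address down to its cache line boundary.
  const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(start) & mask;
  const std::uintptr_t end = reinterpret_cast<std::uintptr_t>(start) + size;
  std::size_t lines = 0;
  for (std::uintptr_t p = begin; p < end; p += kCacheLineSize) {
    flush_line(reinterpret_cast<const void*>(p));
    ++lines;
  }
  return lines;
}
```

The cost of such a loop grows with the number of lines touched, whereas the single-serializing-instruction approach mentioned in the description has a fixed cost regardless of range size. That difference is the trade-off the PR discussion weighs.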
This pull request has now been integrated.
Changeset: 188c2360
Author: Aleksey Shipilev <shade at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/188c236071fd573a9ef35c34126443c6982a4f53
Stats: 247 lines in 16 files changed: 210 ins; 15 del; 22 mod
8353558: x86: Use better instructions for ICache sync when available
Reviewed-by: kvn, adinn
-------------
PR: https://git.openjdk.org/jdk/pull/24389